What Is Idempotent API Design?
An idempotent API guarantees that executing the same request multiple times produces the same side-effect as executing it once. This property is critical for payment processing, order submission, resource creation, and any mutation that must survive network retries without duplicating work. The low-level design centers on three concerns: how clients supply identity for a request, how the server detects and deduplicates that request, and how long cached responses are retained.
Requirements
Functional Requirements
- Clients attach a unique idempotency key to every mutating request via an HTTP header (Idempotency-Key).
- On the first call the server executes the operation and stores the response.
- On any subsequent call with the same key the server returns the stored response without re-executing.
- Keys expire after a configurable TTL (default 24 hours) and are then eligible for re-use.
- Conflicting payloads for the same key within the TTL window return a 409 Conflict.
Non-Functional Requirements
- Deduplication lookup must add less than 5 ms of latency on cache hit.
- The idempotency store must survive node restarts; data loss would allow double processing.
- Throughput must scale horizontally without cross-node coordination on the hot path.
Data Model
The idempotency store holds one record per (client_id, idempotency_key) pair:
- key_hash — SHA-256 of client_id + raw key; serves as the primary lookup field.
- request_fingerprint — hash of HTTP method, path, and canonical request body; used for conflict detection.
- status — ENUM: IN_FLIGHT, COMPLETE, FAILED.
- response_status — stored HTTP status code (e.g., 201, 400).
- response_body — serialized response payload, compressed with LZ4 for large responses.
- created_at, expires_at — timestamps for TTL enforcement.
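The record layout above can be sketched as a small dataclass, with the two hashes computed via hashlib (field names follow the list above; the LZ4 compression step and the exact canonicalization of the body are omitted or assumed):

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Optional

def key_hash(client_id: str, raw_key: str) -> str:
    # SHA-256 of client_id + raw key; serves as the primary lookup field.
    return hashlib.sha256(f"{client_id}:{raw_key}".encode()).hexdigest()

def request_fingerprint(method: str, path: str, body: bytes) -> str:
    # Hash of HTTP method, path, and canonical request body, for conflict detection.
    return hashlib.sha256(b"|".join([method.encode(), path.encode(), body])).hexdigest()

@dataclass
class IdempotencyRecord:
    key_hash: str
    request_fingerprint: str
    status: str                      # IN_FLIGHT, COMPLETE, or FAILED
    response_status: Optional[int]   # stored HTTP status code (None while in flight)
    response_body: Optional[bytes]   # serialized (optionally compressed) payload
    created_at: float
    expires_at: float                # created_at + TTL (default 24 h)
```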
Redis is the natural fit for this store: atomic operations, native TTL support, and sub-millisecond reads. For durable scenarios (payments), back the Redis cache with a relational table and use Redis as the read-through layer.
Core Algorithm: Request Lifecycle
Step 1 — Key Extraction and Validation
Parse the Idempotency-Key header. Reject missing or malformed keys with 400. Normalize to lowercase and enforce a length cap (128 chars) to prevent abuse.
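Step 1 is a few lines of validation; a minimal sketch, assuming a conservative charset (the text specifies only the 128-char cap and lowercase normalization, so the pattern below is an assumption suited to UUIDs and similar opaque tokens):

```python
import re

MAX_KEY_LENGTH = 128  # length cap from the text
# Assumed charset: hex digits, dashes, underscores (covers UUID v4 keys).
KEY_PATTERN = re.compile(r"^[a-z0-9_-]+$")

def validate_idempotency_key(raw_header):
    """Return the normalized key, or raise ValueError (mapped to HTTP 400)."""
    if not raw_header:
        raise ValueError("missing Idempotency-Key header")
    key = raw_header.strip().lower()  # normalize to lowercase
    if len(key) > MAX_KEY_LENGTH:
        raise ValueError("Idempotency-Key exceeds 128 characters")
    if not KEY_PATTERN.match(key):
        raise ValueError("malformed Idempotency-Key")
    return key
```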
Step 2 — Atomic Lock and Lookup
Use a Redis SET key value NX PX ttl_ms to attempt an atomic insert. If the key already exists, fetch the stored record. A Lua script combines the lookup and conditional insert into one round-trip, eliminating the race between check and set:
- If status is COMPLETE or FAILED: return the stored response immediately.
- If status is IN_FLIGHT: return 202 or 409 depending on policy (retry-after semantics).
- If the fingerprint mismatches: return 409 Conflict with a descriptive body.
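The Step 2 decision logic can be sketched against an in-memory dict standing in for Redis (in production this would run as a single Lua script or SET NX plus a lookup; the record layout and return values here are assumptions):

```python
import time

store = {}  # stands in for Redis; values mimic the idempotency record fields

def acquire_or_replay(key, fingerprint, ttl_ms=86_400_000):
    """Return ("proceed", None) if we won the lock, else a replay/conflict decision."""
    now = time.time()
    record = store.get(key)
    if record is not None and record["expires_at"] > now:
        if record["fingerprint"] != fingerprint:
            return ("conflict", 409)           # same key, different payload
        if record["status"] == "IN_FLIGHT":
            return ("in_flight", 202)          # still executing; retry with backoff
        return ("replay", record["response_status"])  # COMPLETE/FAILED: cached reply
    # Equivalent of SET key value NX PX ttl_ms: insert only if absent or expired.
    store[key] = {
        "fingerprint": fingerprint,
        "status": "IN_FLIGHT",
        "response_status": None,
        "expires_at": now + ttl_ms / 1000,
    }
    return ("proceed", None)
```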
Step 3 — Execution and Response Storage
Execute the underlying business logic. On completion, write status, response_status, and response_body to the store. Update status to COMPLETE or FAILED atomically. Release the in-flight lock.
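A minimal sketch of the completion write, again using a dict in place of Redis (in Redis proper this would be a MULTI/EXEC block or Lua script so the fields change together; the function name is an assumption):

```python
def complete(store, key, ok, status_code, body):
    """Write the outcome and flip status last, releasing the in-flight lock."""
    record = store[key]
    record["response_status"] = status_code
    record["response_body"] = body
    # Flipping status last means replays never observe a half-written record.
    record["status"] = "COMPLETE" if ok else "FAILED"
```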
Step 4 — TTL Expiry
Redis TTL handles expiry automatically. For the relational backing store, run a nightly cleanup job that deletes rows where expires_at < NOW(). Partition the table by expires_at date for efficient range deletes.
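For the relational side, the cleanup job reduces to a range delete on expires_at; a runnable sketch with sqlite3 (table name and schema are assumptions, and SQLite lacks native partitioning, so the partition-by-date optimization is only noted in a comment):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE idempotency_records (
        key_hash   TEXT PRIMARY KEY,
        status     TEXT NOT NULL,
        expires_at REAL NOT NULL  -- epoch seconds; partition by this date at scale
    )
""")
now = time.time()
conn.execute("INSERT INTO idempotency_records VALUES ('a', 'COMPLETE', ?)", (now - 60,))
conn.execute("INSERT INTO idempotency_records VALUES ('b', 'COMPLETE', ?)", (now + 3600,))

# Nightly job: delete rows whose TTL window has passed.
conn.execute("DELETE FROM idempotency_records WHERE expires_at < ?", (now,))
conn.commit()
```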
API Design
- POST /payments with Idempotency-Key: uuid-v4 — first call returns 201 Created; replay returns 201 with identical body.
- 409 Conflict — payload changed for the same key; safe for clients to abort.
- 202 Accepted — operation still in flight; client should retry with backoff.
- Headers returned: X-Idempotency-Replay: true on cached responses, aiding client debugging.
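From the client's side, the contract above amounts to generating one key per logical operation and reusing it across retries; a sketch with Python's uuid module (the send callable is a placeholder for a real HTTP client, not an actual library API):

```python
import uuid

def submit_payment(send, payload, max_attempts=3):
    """Retry with the SAME Idempotency-Key so the server can deduplicate."""
    key = str(uuid.uuid4())  # one key per logical operation, not per attempt
    for _ in range(max_attempts):
        status, body = send(payload, headers={"Idempotency-Key": key})
        if status != 202:   # 201 (first call or replay) or 409 (conflict): done
            return status, body
        # 202: operation still in flight; back off and retry with the same key
    return status, body
```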
Scalability Considerations
Redis Cluster shards keys by hash slot, distributing load across nodes. Keep idempotency records in a dedicated logical database or keyspace prefix to avoid eviction by unrelated data. For global deployments, replicate the idempotency store to read replicas and funnel writes to the primary. If Redis latency is unacceptable at extreme scale, a distributed lock service (etcd, ZooKeeper) can manage the in-flight lock while a local cache absorbs completed lookups. Monitor the hit ratio: a high replay rate may indicate client retry storms that need backoff tuning upstream.
Edge Cases and Failure Modes
- Server crash after execution, before store write: operation succeeds but idempotency record is lost; next retry re-executes. Mitigate by writing the record inside the same database transaction as the business mutation.
- Client sends different key per retry: each retry creates a new record; educate clients via documentation and SDK enforcement.
- Very large response bodies: cap stored payload at 64 KB; for larger responses store a reference (S3 key) and redirect replays.
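The first mitigation above, writing the idempotency record inside the same transaction as the business mutation, can be demonstrated with sqlite3, which provides real transactional semantics in a few lines (table names and schema are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (id TEXT PRIMARY KEY, amount INTEGER);
    CREATE TABLE idempotency_records (key_hash TEXT PRIMARY KEY, status TEXT);
""")

def execute_payment(key_hash, payment_id, amount):
    # One transaction: either both rows commit or neither does, so a crash
    # can never leave the mutation applied without its idempotency record.
    with conn:
        conn.execute("INSERT INTO payments VALUES (?, ?)", (payment_id, amount))
        conn.execute("INSERT INTO idempotency_records VALUES (?, 'COMPLETE')",
                     (key_hash,))

execute_payment("k1", "pay_1", 500)
```

A replay of the same key_hash violates the primary key on idempotency_records and rolls back the duplicate payment row along with it, which is exactly the double-processing guard the mitigation is after.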
Summary
Idempotent API design requires atomic key locking, fingerprint-based conflict detection, durable response caching, and disciplined TTL management. A Redis-backed store with Lua scripting handles the common case efficiently, while a relational backing layer provides durability for high-value mutations. Combined with clear HTTP semantics and client-facing replay headers, this design makes distributed retries safe by default.
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "What is the lifecycle of an idempotency key?",
"acceptedAnswer": {
"@type": "Answer",
"text": "A client generates a unique idempotency key (typically a UUID v4) and attaches it to the request. The server stores the key alongside the response in a durable store on first execution. Subsequent requests with the same key return the cached response without re-executing business logic. The key is retained for a TTL window (commonly 24 hours) after which it expires and the slot can be reused."
}
},
{
"@type": "Question",
"name": "How does Redis NX provide an atomic lock for idempotency?",
"acceptedAnswer": {
"@type": "Answer",
"text": "SET key value NX PX ttl atomically writes the key only if it does not exist. The first request wins the lock; concurrent requests with the same idempotency key get a nil return and must wait or return a 409 Conflict. Because SET NX is a single atomic command, no additional Lua scripting is needed to prevent race conditions on key creation."
}
},
{
"@type": "Question",
"name": "How do you detect and handle fingerprint conflicts in an idempotent API?",
"acceptedAnswer": {
"@type": "Answer",
"text": "A fingerprint is a hash of the request body and critical headers stored alongside the idempotency key. On replay, the server hashes the incoming request and compares it to the stored fingerprint. A mismatch (same key, different payload) is a client error and should return 409 Conflict, preventing silent data corruption from accidental key reuse with a different payload."
}
},
{
"@type": "Question",
"name": "What TTL strategy and cleanup approach should an idempotent API use?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Set the TTL to match the maximum retry window clients are expected to use — 24 hours is a common default. In Redis, use the PX option on SET so expiry is handled automatically. For a relational store, run a periodic background job or a database-level scheduled event to DELETE rows where expires_at < NOW(). Log expirations to detect abnormal retry patterns."
}
}
]
}