Idempotent API Low-Level Design: Idempotency Keys, Request Deduplication, and Expiry

What Is Idempotent API Design?

An idempotent API guarantees that executing the same request multiple times has the same side effects as executing it once. This property is critical for payment processing, order submission, resource creation, and any mutation that must survive network retries without duplicating work. The low-level design centers on three concerns: how clients identify a request, how the server detects and deduplicates repeats of that request, and how long cached responses are retained.

Requirements

Functional Requirements

  • Clients attach a unique idempotency key to every mutating request via an HTTP header (Idempotency-Key).
  • On the first call the server executes the operation and stores the response.
  • On any subsequent call with the same key the server returns the stored response without re-executing.
  • Keys expire after a configurable TTL (default 24 hours) and are then eligible for re-use.
  • Conflicting payloads for the same key within the TTL window return a 409 Conflict.

Non-Functional Requirements

  • Deduplication lookup must add less than 5 ms of latency on cache hit.
  • The idempotency store must survive node restarts; data loss would allow double processing.
  • Throughput must scale horizontally without cross-node coordination on the hot path.

Data Model

The idempotency store holds one record per (client_id, idempotency_key) pair:

  • key_hash — SHA-256 of client_id + raw key; serves as the primary lookup field.
  • request_fingerprint — hash of HTTP method, path, and canonical request body; used for conflict detection.
  • status — ENUM: IN_FLIGHT, COMPLETE, FAILED.
  • response_status — stored HTTP status code (e.g., 201, 400).
  • response_body — serialized response payload, compressed with LZ4 for large responses.
  • created_at, expires_at — timestamps for TTL enforcement.
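
The record and its two hashes can be sketched in Python as follows. This is a minimal illustration, not a prescribed schema: the names IdempotencyRecord, key_hash, and request_fingerprint are hypothetical, sorted-key JSON is only one possible reading of "canonical request body", and the length prefix is an added assumption so that two different (client_id, key) pairs can never concatenate to the same hash input.

```python
import hashlib
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    IN_FLIGHT = "IN_FLIGHT"
    COMPLETE = "COMPLETE"
    FAILED = "FAILED"


@dataclass
class IdempotencyRecord:
    key_hash: str
    request_fingerprint: str
    status: Status
    created_at: float
    expires_at: float
    response_status: Optional[int] = None
    response_body: Optional[bytes] = None


def _sha256_hex(*parts: str) -> str:
    # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
    h = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()


def key_hash(client_id: str, raw_key: str) -> str:
    """Primary lookup field: SHA-256 over client_id and the raw key."""
    return _sha256_hex(client_id, raw_key)


def request_fingerprint(method: str, path: str, body: dict) -> str:
    """Hash of method, path, and a canonical form of the body."""
    # Assumption: canonical means JSON with sorted keys and no extra whitespace.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return _sha256_hex(method.upper(), path, canonical)
```

With this canonical form, two bodies that differ only in key order produce the same fingerprint, so a client that re-serializes its payload on retry does not trigger a false conflict.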

Redis is the natural fit for this store: atomic operations, native TTL support, and sub-millisecond reads. For durable scenarios (payments), back the Redis cache with a relational table and use Redis as the read-through layer.

Core Algorithm: Request Lifecycle

Step 1 — Key Extraction and Validation

Parse the Idempotency-Key header. Reject missing or malformed keys with 400. Normalize to lowercase and enforce a length cap (128 chars) to prevent abuse.
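
A minimal sketch of this validation step is shown below. The function name extract_idempotency_key, the exception type, and the allowed character set are assumptions made for illustration; the text above only fixes the header name, the lowercase normalization, and the 128-character cap.

```python
import re

MAX_KEY_LENGTH = 128
# Assumed alphabet (covers UUIDs); the design itself only requires a length cap.
_KEY_PATTERN = re.compile(r"[a-z0-9_\-]+")


class InvalidIdempotencyKey(ValueError):
    """Raised when the header is missing or malformed; map this to HTTP 400."""


def extract_idempotency_key(headers: dict) -> str:
    # HTTP header names are case-insensitive, so compare them lowercased.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("idempotency-key")
    if raw is None or not raw.strip():
        raise InvalidIdempotencyKey("missing Idempotency-Key header")
    key = raw.strip().lower()
    if len(key) > MAX_KEY_LENGTH:
        raise InvalidIdempotencyKey("key longer than 128 characters")
    if not _KEY_PATTERN.fullmatch(key):
        raise InvalidIdempotencyKey("key contains unsupported characters")
    return key
```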

Step 2 — Atomic Lock and Lookup

Use a Redis SET key value NX PX ttl_ms to attempt an atomic insert of an IN_FLIGHT record. If the insert succeeds, this request owns the key and proceeds to Step 3. If the key already exists, fetch the stored record. A Lua script combines the lookup and conditional insert into one round-trip, eliminating the race between check and set. For an existing record, evaluate these cases in order:

  • If the fingerprint does not match: return 409 Conflict with a descriptive body.
  • If status is COMPLETE or FAILED: return the stored response immediately.
  • If status is IN_FLIGHT: return 202 or 409 depending on policy (retry-after semantics).
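
The decision logic can be sketched with an in-memory stand-in for Redis. This is not the Lua script itself: a threading.Lock plays the role of the script's atomicity, and the class name, the begin method, and the decision strings are hypothetical. The sketch checks the fingerprint before the status so that a changed payload is never answered with a replay, which is one reasonable ordering rather than the only one.

```python
import threading
import time
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Record:
    fingerprint: str
    status: str  # "IN_FLIGHT", "COMPLETE", or "FAILED"
    expires_at: float
    response_status: Optional[int] = None
    response_body: Optional[bytes] = None


class InMemoryIdempotencyStore:
    """Stand-in for Redis; the lock mimics the atomicity of a Lua script."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._records: Dict[str, Record] = {}

    def begin(self, key_hash: str, fingerprint: str, ttl_seconds: float,
              now: Optional[float] = None) -> Tuple[str, Record]:
        """Return (decision, record): EXECUTE, REPLAY, IN_FLIGHT, or CONFLICT."""
        now = time.time() if now is None else now
        with self._lock:
            record = self._records.get(key_hash)
            if record is not None and record.expires_at <= now:
                record = None  # expired keys are eligible for re-use
            if record is None:
                # First sighting: claim the key, like SET ... NX PX succeeding.
                record = Record(fingerprint, "IN_FLIGHT", now + ttl_seconds)
                self._records[key_hash] = record
                return "EXECUTE", record
            if record.fingerprint != fingerprint:
                return "CONFLICT", record
            if record.status == "IN_FLIGHT":
                return "IN_FLIGHT", record
            return "REPLAY", record
```

Only the caller that receives EXECUTE runs the business logic; every other decision maps directly to an HTTP response.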

Step 3 — Execution and Response Storage

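Execute the underlying business logic. On completion, write response_status and response_body to the store and set status to COMPLETE or FAILED in the same atomic update, so that no reader ever sees a terminal status without its response. Moving the record out of IN_FLIGHT is what releases the in-flight lock.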
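
A minimal sketch of this step, using a plain dict in place of the store. The function name execute_and_store is hypothetical, and two policy choices here are assumptions rather than part of the design above: responses below 500 are recorded as COMPLETE, and an unhandled exception is stored as a generic 500 so that replays observe the same failure.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, bytes]  # (HTTP status code, serialized body)


def execute_and_store(records: Dict[str, dict], key_hash: str,
                      handler: Callable[[], Response]) -> Response:
    """Run the business logic for an IN_FLIGHT key and record the outcome."""
    record = records[key_hash]
    try:
        status_code, body = handler()
        final_status = "COMPLETE" if status_code < 500 else "FAILED"
    except Exception:
        # Assumed policy: persist a generic failure instead of leaving IN_FLIGHT.
        status_code, body, final_status = 500, b'{"error":"internal"}', "FAILED"
    # Swap in the finished record with one assignment so status and response
    # fields always change together.
    records[key_hash] = {
        **record,
        "status": final_status,
        "response_status": status_code,
        "response_body": body,
    }
    return status_code, body
```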

Step 4 — TTL Expiry

Redis TTL handles expiry automatically. For the relational backing store, run a nightly cleanup job that deletes rows where expires_at < NOW(). Partition the table by expires_at date for efficient range deletes.
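
The cleanup job for the relational table might look like the sketch below. It uses SQLite from the Python standard library purely so the example runs anywhere; the table name idempotency_keys, the batch size, and the function names are assumptions. Deleting in bounded batches keeps each transaction short, which is a common choice when the table is not partitioned.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table; only the columns the cleanup needs are shown.
    conn.execute(
        "CREATE TABLE idempotency_keys ("
        " key_hash TEXT PRIMARY KEY,"
        " expires_at REAL NOT NULL)"
    )
    # The index keeps the expires_at range scan cheap.
    conn.execute("CREATE INDEX idx_expires_at ON idempotency_keys (expires_at)")


def purge_expired(conn: sqlite3.Connection, now: float, batch_size: int = 1000) -> int:
    """Delete rows with expires_at < now in batches; return the number removed."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM idempotency_keys WHERE key_hash IN ("
            " SELECT key_hash FROM idempotency_keys"
            " WHERE expires_at < ? LIMIT ?)",
            (now, batch_size),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```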

API Design

  • POST /payments with Idempotency-Key: uuid-v4 — first call returns 201 Created; replay returns 201 with identical body.
  • 409 Conflict — payload changed for same key; safe for clients to abort.
  • 202 Accepted — operation still in flight; client should retry with backoff.
  • Headers returned: X-Idempotency-Replay: true on cached responses, aiding client debugging.
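
The status codes and headers listed above can be produced by a small mapping function. This is a sketch under assumptions: build_response and the decision strings (REPLAY, IN_FLIGHT, CONFLICT) are hypothetical names, and attaching a Retry-After header to the 202 is one way to express the backoff hint, not something the design mandates.

```python
from typing import Dict, Optional, Tuple

HttpResponse = Tuple[int, Dict[str, str], bytes]  # (status, headers, body)


def build_response(decision: str, stored_status: Optional[int] = None,
                   stored_body: bytes = b"",
                   retry_after_seconds: int = 1) -> HttpResponse:
    """Map a deduplication decision to an HTTP response."""
    if decision == "REPLAY":
        if stored_status is None:
            raise ValueError("a replay needs the stored status code")
        # Same status and body as the first call, plus the replay marker.
        return stored_status, {"X-Idempotency-Replay": "true"}, stored_body
    if decision == "IN_FLIGHT":
        # Retry-After tells the client how long to back off before retrying.
        return 202, {"Retry-After": str(retry_after_seconds)}, b""
    if decision == "CONFLICT":
        body = b'{"error":"idempotency key reused with a different payload"}'
        return 409, {"Content-Type": "application/json"}, body
    raise ValueError(f"unknown decision: {decision}")
```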

Scalability Considerations

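Redis Cluster shards keys by hash slot, distributing load across nodes. Eviction policy is configured per Redis instance rather than per key prefix, so a prefix alone does not protect idempotency records from memory pressure caused by unrelated data; keep them on a dedicated instance or cluster configured with the noeviction policy. A separate logical database is not an alternative in cluster mode, since Redis Cluster has traditionally supported only database 0. For global deployments, replicate the idempotency store for failover but route the atomic check-and-insert to the primary, because a lagging read replica may not yet show a key that was just inserted. If Redis latency is unacceptable at extreme scale, a distributed lock service (etcd, ZooKeeper) can manage the in-flight lock while a local cache absorbs completed lookups. Monitor the hit ratio: a high replay rate may indicate client retry storms that need backoff tuning upstream.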

Edge Cases and Failure Modes

  • Server crash after execution, before store write: operation succeeds but idempotency record is lost; next retry re-executes. Mitigate by writing the record inside the same database transaction as the business mutation.
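  • Client sends different key per retry: each retry creates a new record and re-executes the operation, defeating deduplication; mitigate with clear documentation and SDKs that generate the key once per logical operation and reuse it on every retry.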
  • Very large response bodies: cap stored payload at 64 KB; for larger responses store a reference (S3 key) and redirect replays.
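
The first mitigation above, writing the idempotency record in the same database transaction as the business mutation, can be sketched with SQLite from the Python standard library. The payments table, its columns, and the function names are hypothetical; a production system would use its own schema and database driver.

```python
import json
import sqlite3
from typing import Tuple


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE payments (id INTEGER PRIMARY KEY, amount_cents INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE idempotency_keys ("
        " key_hash TEXT PRIMARY KEY,"
        " response_status INTEGER NOT NULL,"
        " response_body TEXT NOT NULL)"
    )


def create_payment_once(conn: sqlite3.Connection, key_hash: str,
                        amount_cents: int) -> Tuple[int, str, bool]:
    """Return (status, body, replayed); payment and key row commit together."""
    row = conn.execute(
        "SELECT response_status, response_body FROM idempotency_keys WHERE key_hash = ?",
        (key_hash,),
    ).fetchone()
    if row is not None:
        return row[0], row[1], True
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute(
            "INSERT INTO payments (amount_cents) VALUES (?)", (amount_cents,)
        )
        body = json.dumps({"payment_id": cur.lastrowid, "amount_cents": amount_cents})
        # The primary key on key_hash makes a concurrent duplicate fail here,
        # which rolls back its payment row as well.
        conn.execute(
            "INSERT INTO idempotency_keys (key_hash, response_status, response_body)"
            " VALUES (?, 201, ?)",
            (key_hash, body),
        )
    return 201, body, False
```

Because both rows commit or roll back together, a crash can no longer leave a payment without its idempotency record.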

Summary

Idempotent API design requires atomic key locking, fingerprint-based conflict detection, durable response caching, and disciplined TTL management. A Redis-backed store with Lua scripting handles the common case efficiently, while a relational backing layer provides durability for high-value mutations. Combined with clear HTTP semantics and client-facing replay headers, this design makes distributed retries safe by default.
