Idempotency Key Service Low-Level Design: Deduplication, Atomic Claims, and Concurrent Retries

An idempotency key service ensures that retried API requests do not cause duplicate side effects — a critical requirement for payment APIs, order placement, and any operation that mutates state. Calling POST /charge twice with the same idempotency key must produce exactly one charge. Core challenges: storing request-response pairs efficiently, handling concurrent duplicate requests, expiring old keys to reclaim storage, and returning the original response even when the underlying state has changed.

Core Data Model

CREATE TABLE IdempotencyRecord (
    idempotency_key  TEXT NOT NULL,
    endpoint         TEXT NOT NULL,          -- '/v1/charges'
    request_hash     CHAR(64) NOT NULL,       -- SHA-256 of canonical request body
    user_id          UUID,                    -- scope keys per user/tenant
    status           TEXT NOT NULL DEFAULT 'processing',  -- 'processing','done','failed'
    response_code    SMALLINT,               -- HTTP status code
    response_body    JSONB,                  -- original response payload
    created_at       TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    completed_at     TIMESTAMPTZ,
    expires_at       TIMESTAMPTZ NOT NULL DEFAULT NOW() + interval '24 hours',
    PRIMARY KEY (idempotency_key, endpoint, user_id)
);
CREATE INDEX idx_idem_expires ON IdempotencyRecord (expires_at);
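
The request_hash column holds a SHA-256 digest of the canonical request body. As a minimal sketch, assuming JSON bodies and the same json.dumps(..., sort_keys=True) canonicalization the middleware below uses, the helper canonical_request_hash (a hypothetical name) shows that key order does not affect the digest while any change in value does:

```python
import hashlib
import json


def canonical_request_hash(request_body: dict) -> str:
    """Return the 64-character hex SHA-256 digest of a JSON-serializable body."""
    # sort_keys=True makes the serialization independent of dict key order.
    canonical = json.dumps(request_body, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


same_a = canonical_request_hash({"amount": 500, "currency": "usd"})
same_b = canonical_request_hash({"currency": "usd", "amount": 500})
different = canonical_request_hash({"amount": 501, "currency": "usd"})

print(same_a == same_b)     # True: key order is normalized
print(same_a == different)  # False: a changed amount changes the hash
print(len(same_a))          # 64, which fits the CHAR(64) column
```

Every code path that writes or compares request_hash has to use the same canonicalization, otherwise legitimate retries would be rejected as mismatches.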

Idempotency Middleware

import hashlib, json, time
import psycopg2

def idempotency_middleware(conn, idempotency_key: str, endpoint: str, user_id: str,
                            request_body: dict, handler_fn):
    """
    Wraps any mutating API handler with idempotency.
    Returns cached response if key already processed.
    Raises on request body mismatch (different request, same key — client error).
    """
    request_hash = hashlib.sha256(
        json.dumps(request_body, sort_keys=True).encode()
    ).hexdigest()

    # Step 1: Try to insert a 'processing' row (atomic claim).
    # RETURNING yields a row only if this request performed the insert.
    with conn.cursor() as cur:
        cur.execute("""
            INSERT INTO IdempotencyRecord
            (idempotency_key, endpoint, request_hash, user_id)
            VALUES (%s, %s, %s, %s)
            ON CONFLICT (idempotency_key, endpoint, user_id) DO NOTHING
            RETURNING status
        """, (idempotency_key, endpoint, request_hash, user_id))
        claimed = cur.fetchone() is not None
    conn.commit()  # persist the 'processing' claim so concurrent requests see it

    if not claimed:
        # Conflict: key already exists, so fetch the existing record
        with conn.cursor() as cur:
            cur.execute("""
                SELECT status, request_hash, response_code, response_body
                FROM IdempotencyRecord
                WHERE idempotency_key=%s AND endpoint=%s AND user_id=%s
            """, (idempotency_key, endpoint, user_id))
            result = cur.fetchone()
        conn.commit()
        if result is None:
            # Row expired and was deleted between the INSERT and the SELECT
            raise TimeoutError("Idempotency record changed concurrently; retry later")
        status, stored_hash, resp_code, resp_body = result

        # Mismatch: same key, different request body is not allowed
        if stored_hash != request_hash:
            raise ValueError(
                "Idempotency key already used with a different request body. "
                "Use a new idempotency key for a different request."
            )

        if status == 'done':
            # Return cached response
            return {"from_cache": True, "status_code": resp_code, "body": resp_body}

        if status == 'processing':
            # In-flight duplicate: poll briefly for the first request to finish
            for _ in range(10):
                time.sleep(0.5)
                with conn.cursor() as cur:
                    cur.execute(
                        "SELECT status, response_code, response_body FROM IdempotencyRecord "
                        "WHERE idempotency_key=%s AND endpoint=%s AND user_id=%s",
                        (idempotency_key, endpoint, user_id)
                    )
                    r = cur.fetchone()
                conn.commit()  # end the read transaction before the next poll
                if r and r[0] == 'done':
                    return {"from_cache": True, "status_code": r[1], "body": r[2]}
            raise TimeoutError("Concurrent request still processing; retry later")

        # status == 'failed': re-claim atomically so only one retry re-executes
        with conn.cursor() as cur:
            cur.execute("""
                UPDATE IdempotencyRecord SET status='processing'
                WHERE idempotency_key=%s AND endpoint=%s AND user_id=%s
                  AND status='failed'
                RETURNING 1
            """, (idempotency_key, endpoint, user_id))
            reclaimed = cur.fetchone() is not None
        conn.commit()
        if not reclaimed:
            raise TimeoutError("Concurrent request still processing; retry later")

    # Step 2: Execute the actual handler
    try:
        response_code, response_body = handler_fn(conn, request_body)
        _complete_idempotency_record(conn, idempotency_key, endpoint, user_id,
                                      'done', response_code, response_body)
        return {"from_cache": False, "status_code": response_code, "body": response_body}
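    except Exception as e:
        # Roll back first: a handler error can leave the transaction aborted,
        # in which case the status update below would itself fail.
        conn.rollback()
        _complete_idempotency_record(conn, idempotency_key, endpoint, user_id,
                                      'failed', 500, {"error": str(e)})
        raise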

def _complete_idempotency_record(conn, key, endpoint, user_id, status, code, body):
    with conn.cursor() as cur:
        cur.execute("""
            UPDATE IdempotencyRecord
            SET status=%s, response_code=%s, response_body=%s, completed_at=NOW()
            WHERE idempotency_key=%s AND endpoint=%s AND user_id=%s
        """, (status, code, json.dumps(body), key, endpoint, user_id))
    conn.commit()
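
The claim step can be exercised without a database. The sketch below is a hypothetical in-memory stand-in, not part of the middleware above: InMemoryIdempotencyStore, try_claim, and simulate_concurrent_retries are illustrative names, and a lock plays the role that the primary-key constraint plays in PostgreSQL. It releases several threads at once with the same key and counts how many of them get to run the side effect.

```python
import threading


class InMemoryIdempotencyStore:
    """Hypothetical stand-in for the IdempotencyRecord table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}

    def try_claim(self, key) -> bool:
        """Mimic INSERT ... ON CONFLICT DO NOTHING: True only for the first caller."""
        with self._lock:
            if key in self._records:
                return False
            self._records[key] = {"status": "processing", "response": None}
            return True

    def complete(self, key, response) -> None:
        """Mark the record done and store the response for later replays."""
        with self._lock:
            self._records[key] = {"status": "done", "response": response}


def simulate_concurrent_retries(num_threads: int = 8) -> int:
    """Fire num_threads identical requests at once; return how many executed."""
    store = InMemoryIdempotencyStore()
    key = ("key-123", "/v1/charges", "user-1")
    executions = []
    barrier = threading.Barrier(num_threads)

    def request():
        barrier.wait()  # release all threads together to maximize contention
        if store.try_claim(key):
            executions.append("charged")  # side effect runs only for the winner
            store.complete(key, {"status_code": 200})

    threads = [threading.Thread(target=request) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(executions)


print(simulate_concurrent_retries())  # 1
```

If try_claim were split into a separate "check" and "insert" without the lock, more than one thread could pass the check, which is the read-then-write race the insert-first pattern avoids.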

Key Generation Guidance for API Clients

"""
Idempotency key design principles (for documentation):

Good keys (unique per intent):
  - UUID v4 generated once per logical operation and reused on every retry: "550e8400-e29b-41d4-a716-446655440000"
  - Content-based: SHA-256 of (user_id + order_id + amount): stable across retries

Bad keys (cause problems):
  - Timestamp-based: different on each retry → different keys → duplicate charges
  - Sequential integers: predictable → attackers can guess and replay
  - Session ID alone: multiple different charges per session → collisions

Stripe's convention: clients generate a UUID per logical operation,
store it alongside the pending request, and reuse it on all retries.
The key is tied to the operation, not the attempt.
"""

import uuid

def generate_idempotency_key(user_id: str, operation: str, resource_id: str) -> str:
    """
    Generate a deterministic idempotency key for a specific operation.
    Same inputs → same key → safe for retries.
    """
    seed = f"{user_id}:{operation}:{resource_id}"
    # UUID v5 is deterministic (namespace + seed → fixed UUID)
    return str(uuid.uuid5(uuid.NAMESPACE_URL, seed))
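
The content-based variant from the guidance above (a SHA-256 over user_id, order_id, and amount) can be sketched the same way. The function name content_based_key and the amount_cents parameter are assumptions for illustration. The fields are serialized as a JSON array before hashing so that field boundaries stay unambiguous; plain concatenation would make ("ab", "c") and ("a", "bc") hash identically.

```python
import hashlib
import json


def content_based_key(user_id: str, order_id: str, amount_cents: int) -> str:
    """Derive a deterministic idempotency key from the operation's content."""
    # A JSON array keeps field boundaries explicit, unlike raw concatenation.
    material = json.dumps([user_id, order_id, amount_cents])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


first_attempt = content_based_key("user-1", "order-42", 1999)
retry = content_based_key("user-1", "order-42", 1999)
other_amount = content_based_key("user-1", "order-42", 2999)

print(first_attempt == retry)         # True: retries reuse the same key
print(first_attempt == other_amount)  # False: a different charge gets a new key
```

A content-based key treats two intentionally identical operations (the same user buying the same order for the same amount twice) as one, so it fits best where the resource ID already identifies a single intent.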

Expiry and Cleanup

def cleanup_expired_records(conn) -> int:
    """
    Nightly job: delete IdempotencyRecord rows past their expires_at.
    Bounded batch to avoid long lock times.
    """
    deleted = 0
    while True:
        with conn.cursor() as cur:
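            # Match on the full primary key: the same idempotency_key can exist
            # for other users or endpoints whose rows have not expired yet.
            cur.execute("""
                DELETE FROM IdempotencyRecord
                WHERE (idempotency_key, endpoint, user_id) IN (
                    SELECT idempotency_key, endpoint, user_id
                    FROM IdempotencyRecord
                    WHERE expires_at < NOW()
                    LIMIT 1000
                )
            """)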
            batch = cur.rowcount
        conn.commit()
        deleted += batch
        if batch < 1000:
            break
    return deleted
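
Because expired rows are deleted, the table size is bounded by request rate, record size, and retention window. A back-of-the-envelope sketch follows; estimate_storage_bytes is a hypothetical helper, the figures (1M requests per day, 1 KB per record, one day of retention) are illustrative, and index and row overhead are ignored.

```python
def estimate_storage_bytes(requests_per_day: int,
                           bytes_per_record: int,
                           retention_days: float) -> float:
    """Steady-state table size: rows arriving per day times days retained."""
    return requests_per_day * bytes_per_record * retention_days


one_day = estimate_storage_bytes(1_000_000, 1_000, 1)
one_week = estimate_storage_bytes(1_000_000, 1_000, 7)

print(f"{one_day / 1e9:.1f} GB")   # 1.0 GB
print(f"{one_week / 1e9:.1f} GB")  # 7.0 GB
```

Extending expires_at for long-running operations scales the estimate linearly, which is worth checking before raising the default window.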

Key Interview Points

  • Insert-first atomic claim: The INSERT … ON CONFLICT DO NOTHING pattern atomically claims the idempotency key. Only one concurrent request succeeds in inserting — others get the conflict signal and read the existing record. This prevents duplicate execution even under concurrent retries. Without this atomic claim, two threads can both read “no record found” and both execute the handler.
  • Request body hash check: The same idempotency key with a different request body is a client bug — reject with 422. This prevents accidental key reuse across different operations. The hash is SHA-256 of the canonicalized request body (sorted keys for JSON determinism). Never allow different operations to share a key — a key is tied to one logical operation.
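  • Failed request idempotency: If the handler throws an exception, store status='failed' in the idempotency record. On retry, the service must choose between replaying the stored failure and re-executing the handler. Stripe replays the stored result, failures included, so behavior stays deterministic. A custom policy can re-execute when the failure was transient (network timeout) and replay when it was permanent (invalid card); a failure_retryable boolean column, which the schema above does not include, would record the distinction. The middleware above takes the simplest route and re-executes the handler for any key whose status is 'failed'.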
  • Scope by user: Include user_id in the primary key. Without scoping, user A could use user B’s idempotency key to see or interfere with their responses. Primary key: (idempotency_key, endpoint, user_id) ensures keys are isolated per user. API gateway should inject user_id from the auth token before the middleware runs.
  • Idempotency key expiry: Stripe uses a 24-hour window — after 24 hours, a key can be reused. For payments, 24 hours is sufficient — retries beyond a day indicate a bug, not a network retry. For longer operations (multi-day provisioning), extend expires_at. The expiry also bounds storage growth: at 1M requests/day × 1KB/record × 1 day = 1GB — manageable with daily cleanup.
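
The replay-or-retry choice for failed keys can be written down as a small decision function. This is a sketch of one possible policy, not the behavior of any particular provider: Action, decide_action, and the failure_retryable flag are hypothetical names, and the flag would need its own column in IdempotencyRecord.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    EXECUTE = "execute"  # run the handler (after an atomic claim)
    REPLAY = "replay"    # return the stored response unchanged
    WAIT = "wait"        # another request holds the key; poll or retry later


def decide_action(status: Optional[str], failure_retryable: bool = False) -> Action:
    """Map the stored record state to what the middleware should do next."""
    if status is None:
        return Action.EXECUTE  # no record yet: first request for this key
    if status == "processing":
        return Action.WAIT
    if status == "done":
        return Action.REPLAY
    if status == "failed":
        # Transient failures may run again; permanent ones are replayed.
        return Action.EXECUTE if failure_retryable else Action.REPLAY
    raise ValueError(f"unknown status: {status!r}")


print(decide_action("failed", failure_retryable=True))   # Action.EXECUTE
print(decide_action("failed", failure_retryable=False))  # Action.REPLAY
```

An EXECUTE decision on a failed record still has to go through an atomic re-claim (for example an UPDATE guarded by status='failed'), otherwise two concurrent retries could both re-run the handler.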

