Inbox Pattern Low-Level Design: Exactly-Once Message Processing

The inbox pattern is the receiving counterpart to the outbox pattern. Where the outbox ensures a service reliably publishes events it produces, the inbox ensures a service reliably processes events it receives — exactly once, even when the message broker delivers the same message multiple times (at-least-once delivery). Without an inbox, a consumer that crashes mid-processing will re-process the message on restart, potentially duplicating side effects like charging a credit card or sending an email.

Core Data Model

CREATE TABLE InboxMessage (
    message_id      VARCHAR(255) PRIMARY KEY,  -- unique ID from the message broker
    source          VARCHAR(100) NOT NULL,      -- which service/topic produced this
    event_type      VARCHAR(100) NOT NULL,      -- 'order.created', 'payment.completed'
    payload         JSONB NOT NULL,
    status          VARCHAR(20) NOT NULL DEFAULT 'pending',
    -- 'pending' -> 'processing' -> 'processed' | 'failed'
    received_at     TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    processed_at    TIMESTAMPTZ,
    attempt_count   INT NOT NULL DEFAULT 0,
    error_message   TEXT
);

CREATE INDEX idx_inbox_status ON InboxMessage(status, received_at)
    WHERE status IN ('pending', 'failed');

Idempotent Message Processing

import json

# `db` is the application's database helper, assumed here: `transaction()`
# yields a transactional context, `execute()` returns the affected row count.

def handle_message(message_id: str, source: str, event_type: str, payload: dict):
    """
    Called by the Kafka/RabbitMQ consumer for each incoming message.
    Guarantees exactly-once processing via the inbox table.
    """
    with db.transaction():
        # Attempt to insert the message ID into the inbox
        # ON CONFLICT DO NOTHING: if already processed, skip silently
        affected = db.execute("""
            INSERT INTO InboxMessage (message_id, source, event_type, payload, status)
            VALUES (%s, %s, %s, %s, 'pending')
            ON CONFLICT (message_id) DO NOTHING
        """, [message_id, source, event_type, json.dumps(payload)])

        if affected == 0:
            # Duplicate delivery -- already processed or in progress.
            # (Rows stuck in 'processing' after a crash need a separate
            # reaper job that resets them to 'pending' after a timeout.)
            return

        # Mark as processing (within same transaction)
        db.execute("""
            UPDATE InboxMessage
            SET status = 'processing', attempt_count = attempt_count + 1
            WHERE message_id = %s
        """, [message_id])

    # Business logic runs outside the inbox transaction
    # so the inbox row is committed before we start processing
    try:
        process_event(event_type, payload)
        db.execute("""
            UPDATE InboxMessage
            SET status = 'processed', processed_at = NOW()
            WHERE message_id = %s
        """, [message_id])
    except Exception as e:
        db.execute("""
            UPDATE InboxMessage
            SET status = 'failed', error_message = %s
            WHERE message_id = %s
        """, [str(e), message_id])
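Kafka does not attach a globally unique message ID to each record, so consumers commonly derive one from the (topic, partition, offset) triple, which identifies a record uniquely within a cluster and stays stable across redeliveries. A minimal sketch (the helper name is illustrative, not from the article):

```python
def broker_message_id(topic: str, partition: int, offset: int) -> str:
    # (topic, partition, offset) uniquely identifies a Kafka record;
    # a redelivered record carries the same triple, so this ID is
    # stable enough to key the inbox table on.
    return f"{topic}:{partition}:{offset}"

msg_id = broker_message_id("orders", 3, 1042)  # "orders:3:1042"
# handle_message(msg_id, "order-service", "order.created", payload)
```

For brokers that do assign an ID (e.g. SQS MessageId), use that directly instead.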

Transactional Inbox (Atomic with Business Data)

For maximum reliability, wrap both the inbox INSERT and the business operation in a single transaction:

def handle_order_created(message_id: str, payload: dict):
    with db.transaction():
        # Idempotency guard
        affected = db.execute("""
            INSERT INTO InboxMessage (message_id, source, event_type, payload, status)
            VALUES (%s, 'order-service', 'order.created', %s, 'processed')
            ON CONFLICT (message_id) DO NOTHING
        """, [message_id, json.dumps(payload)])

        if affected == 0:
            return  # duplicate

        # Business operation in the same transaction -- atomic with idempotency guard
        db.execute("""
            INSERT INTO Fulfillment (order_id, status, created_at)
            VALUES (%s, 'pending', NOW())
            ON CONFLICT (order_id) DO NOTHING
        """, [payload['order_id']])
        # If this transaction commits, both the inbox record and fulfillment row exist.
        # If it rolls back, neither exists. Broker will redeliver; we process again safely.
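The atomicity argument above can be exercised end-to-end with an in-memory SQLite database. This is a pared-down sketch, not the article's Postgres schema: `INSERT OR IGNORE` stands in for `ON CONFLICT DO NOTHING`, and the tables carry only the columns the demo needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE InboxMessage (message_id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE Fulfillment  (order_id   TEXT PRIMARY KEY, status TEXT);
""")

def handle_order_created(message_id: str, order_id: str) -> str:
    with conn:  # one transaction: idempotency guard + business row together
        cur = conn.execute(
            "INSERT OR IGNORE INTO InboxMessage VALUES (?, 'processed')",
            (message_id,))
        if cur.rowcount == 0:
            return "duplicate"  # guard row already exists: skip silently
        conn.execute(
            "INSERT OR IGNORE INTO Fulfillment VALUES (?, 'pending')",
            (order_id,))
        return "processed"

print(handle_order_created("msg-1", "order-42"))  # processed
print(handle_order_created("msg-1", "order-42"))  # duplicate (redelivery)
print(conn.execute("SELECT COUNT(*) FROM Fulfillment").fetchone()[0])  # 1
```

The second call is a no-op because the first call committed the guard row and the fulfillment row atomically.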

Retry and Dead-Letter Queue

def retry_failed_messages(max_attempts: int = 3):
    """Background job: retry failed inbox messages with exponential backoff."""
    with db.transaction():  # FOR UPDATE locks must be held inside a transaction
        failed = db.fetchall("""
            SELECT * FROM InboxMessage
            WHERE status = 'failed'
              AND attempt_count < %s
              AND received_at > NOW() - INTERVAL '24 hours'
            ORDER BY received_at ASC
            LIMIT 50
            FOR UPDATE SKIP LOCKED
        """, [max_attempts])

        for msg in failed:
            # 30s, 2m, 8m, ... capped at one hour
            backoff = min(30 * (4 ** msg['attempt_count']), 3600)
            # Only retry once the backoff window has elapsed. The schema has
            # no last-attempt timestamp, so received_at is the approximation.
            db.execute("""
                UPDATE InboxMessage SET status = 'pending'
                WHERE message_id = %s
                  AND received_at < NOW() - make_interval(secs => %s)
                  AND processed_at IS NULL  -- not already succeeded in parallel
            """, [msg['message_id'], backoff])
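The backoff formula above yields the schedule 30s, 2m, 8m, 32m, then caps at one hour:

```python
def retry_backoff_seconds(attempt_count: int) -> int:
    # Same formula as retry_failed_messages: base 30s, quadrupling
    # per attempt, capped at 3600s (one hour).
    return min(30 * (4 ** attempt_count), 3600)

print([retry_backoff_seconds(n) for n in range(5)])
# [30, 120, 480, 1920, 3600]
```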

def move_to_dlq():
    """Move exhausted messages to dead-letter table for manual inspection."""
    with db.transaction():
        # DELETE ... RETURNING moves rows atomically: they cannot be picked
        # up again by the retry job, and re-runs don't insert duplicates.
        db.execute("""
            WITH moved AS (
                DELETE FROM InboxMessage
                WHERE status = 'failed' AND attempt_count >= 3
                RETURNING *
            )
            INSERT INTO InboxDeadLetter SELECT * FROM moved
        """)
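move_to_dlq references an InboxDeadLetter table that is not defined above. A minimal assumption is a structural copy of InboxMessage, so that the `SELECT *` insert lines up column-for-column:

```sql
-- Assumed dead-letter table: same columns as InboxMessage, so
-- INSERT INTO InboxDeadLetter SELECT * FROM InboxMessage works as-is.
CREATE TABLE InboxDeadLetter (LIKE InboxMessage INCLUDING ALL);
```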

Key Interview Points

  • The inbox PRIMARY KEY on message_id plus ON CONFLICT DO NOTHING is the entire idempotency mechanism — it’s a deduplication table keyed by the broker’s message ID.
  • Message brokers guarantee at-least-once delivery, not exactly-once. The inbox pattern converts at-least-once delivery into exactly-once business effect.
  • Inbox vs outbox: outbox is about reliable publishing (you wrote to DB, you must publish to broker). Inbox is about reliable consuming (broker delivered to you, you must process exactly once). Use both for end-to-end reliability.
  • The transactional inbox (business operation in the same DB transaction as the inbox INSERT) is the strongest form — no window where the inbox record exists but the business effect hasn’t happened yet.
  • Prune processed inbox messages after a retention period (e.g., 7 days) to prevent unbounded table growth. Keep enough history to deduplicate late-arriving duplicates.
  • Alert on inbox dead-letter queue depth — failed messages represent lost business events that need investigation.
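The pruning job mentioned above can be a nightly DELETE; the 7-day retention is an assumption to tune against your broker's worst-case redelivery window:

```sql
-- Nightly retention job: drop processed rows older than the dedup window.
DELETE FROM InboxMessage
WHERE status = 'processed'
  AND processed_at < NOW() - INTERVAL '7 days';
```

At very high throughput, partitioning InboxMessage by received_at and dropping old partitions avoids row-by-row deletes.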


