Payment System Low-Level Design

Payment State Machine

A payment must move through well-defined states. Never allow arbitrary transitions or skipped states.

INITIATED -> PROCESSING -> COMPLETED
                        -> FAILED
                        -> CANCELLED

INITIATED:   payment record created, idempotency key stored
PROCESSING:  submitted to payment processor, awaiting confirmation
COMPLETED:   processor confirmed success, funds moved
FAILED:      processor declined or timeout after retries
CANCELLED:   user or system cancelled before processing completed

Each state transition is a separate database write with a timestamp. The payment_events audit table records every transition. You can reconstruct the full lifecycle of any payment from this log. Never update a payment record in place without appending an event.

Idempotency Keys

The client generates a UUID before initiating a payment and sends it in a header (e.g., Idempotency-Key: uuid). The server stores the key alongside the response.

# Server logic on payment request
def create_payment(request):
    key = request.headers['Idempotency-Key']

    # Check Redis first (hot path)
    cached = redis.get(f"idem:{key}")
    if cached:
        return deserialize(cached)  # return exact same response

    # Check DB (slower, handles Redis eviction)
    existing = db.query("SELECT * FROM payments WHERE idempotency_key = %s", key)
    if existing:
        response = build_response(existing)
        redis.setex(f"idem:{key}", 86400, serialize(response))
        return response

    # New payment - proceed
    payment = process_new_payment(request, key)
    response = build_response(payment)
    redis.setex(f"idem:{key}", 86400, serialize(response))
    return response

Idempotency keys prevent double-charging on network retries. TTL is 24 hours in Redis; also persisted in the DB permanently for audit purposes. The key must be unique per payment attempt, not per user or session.

Database Schema

CREATE TABLE payments (
    id               BIGINT PRIMARY KEY AUTO_INCREMENT,
    idempotency_key  VARCHAR(64) UNIQUE NOT NULL,
    user_id          BIGINT NOT NULL,
    amount           BIGINT NOT NULL,      -- in minor units (cents)
    currency         CHAR(3) NOT NULL,     -- ISO 4217
    status           VARCHAR(20) NOT NULL,
    processor_ref    VARCHAR(100),         -- payment processor transaction ID
    created_at       TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at       TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    INDEX idx_user_id (user_id),
    INDEX idx_status_created (status, created_at)
);

CREATE TABLE payment_events (
    id           BIGINT PRIMARY KEY AUTO_INCREMENT,
    payment_id   BIGINT NOT NULL REFERENCES payments(id),
    from_status  VARCHAR(20),
    to_status    VARCHAR(20) NOT NULL,
    occurred_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    metadata     JSON,
    INDEX idx_payment_id (payment_id)
);

Double-Entry Accounting

Every financial movement has exactly two entries: a debit (money leaves an account) and a credit (money enters an account). The sum of all debits always equals the sum of all credits – books never go out of balance.

CREATE TABLE accounts (
    id           BIGINT PRIMARY KEY,
    name         VARCHAR(100) NOT NULL,
    balance      BIGINT NOT NULL DEFAULT 0,  -- in cents
    account_type VARCHAR(20) NOT NULL         -- asset, liability, revenue
);

CREATE TABLE ledger_entries (
    id              BIGINT PRIMARY KEY AUTO_INCREMENT,
    payment_id      BIGINT NOT NULL,
    debit_account   BIGINT NOT NULL REFERENCES accounts(id),
    credit_account  BIGINT NOT NULL REFERENCES accounts(id),
    amount          BIGINT NOT NULL,
    created_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Example: user pays $10. Debit user’s wallet account (balance -1000 cents), credit merchant’s account (balance +1000 cents). To verify integrity: SELECT SUM(amount) FROM ledger_entries WHERE debit_account = X must equal account balance change.

Distributed Transaction Problem

The fundamental challenge: you need to call the payment processor API (external, over the network) AND update your database. What happens if your server crashes between these two steps?

  • Crash after calling processor but before DB update: processor charged the user, your DB shows no payment. User is charged, you have no record.
  • Crash after DB update but before calling processor: your DB shows completed payment, processor never ran it. User not charged, you think they were.

Two-phase commit across systems is impractical. The solution is the outbox pattern.

Outbox Pattern

Write to your DB and an outbox table in a single local transaction. A background worker then calls the external processor.

CREATE TABLE outbox (
    id           BIGINT PRIMARY KEY AUTO_INCREMENT,
    payment_id   BIGINT NOT NULL,
    payload      JSON NOT NULL,
    status       VARCHAR(20) DEFAULT 'PENDING',
    created_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    processed_at TIMESTAMP NULL
);

# Application code - single DB transaction
def initiate_payment(payment_data):
    with db.transaction():
        payment = db.insert("INSERT INTO payments ...", payment_data)
        db.insert("INSERT INTO outbox (payment_id, payload) VALUES (%s, %s)",
                  payment.id, serialize(payment_data))
    # Transaction commits - both rows exist or neither does

# Background worker - runs every few seconds
def process_outbox():
    rows = db.query("SELECT * FROM outbox WHERE status = 'PENDING' LIMIT 100")
    for row in rows:
        result = payment_processor.charge(row.payload)  # external call
        db.execute("UPDATE outbox SET status = 'PROCESSED', processed_at = NOW() WHERE id = %s", row.id)
        db.execute("UPDATE payments SET status = %s, processor_ref = %s WHERE id = %s",
                   result.status, result.ref, row.payment_id)

If the worker crashes after calling the processor but before marking the outbox entry processed, it will retry. The processor call must be idempotent (use the payment ID as the processor’s idempotency key). This guarantees at-least-once delivery with idempotent processing = effectively exactly-once.

Webhook Processing

Payment processors send asynchronous webhooks when payment status changes. Handle them defensively.

def handle_webhook(request):
    # 1. Validate HMAC signature
    sig = request.headers['X-Webhook-Signature']
    expected = hmac.new(WEBHOOK_SECRET, request.body, sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return 401

    event = parse(request.body)

    # 2. Idempotent processing using event ID
    if redis.exists(f"webhook:{event.id}"):
        return 200  # already processed

    # 3. Process
    update_payment_status(event.payment_ref, event.status)
    trigger_downstream_events(event)

    # 4. Mark processed
    redis.setex(f"webhook:{event.id}", 86400, "1")
    return 200

Always respond 200 quickly to webhooks. Do heavy processing asynchronously via a queue if needed. Processors will retry on non-200 responses – your handler must be idempotent.

Reconciliation

End-of-day reconciliation catches discrepancies between your system and the payment processor.

  • Payment processor provides a settlement report: list of all transactions, amounts, and statuses for the day
  • Compare against your payments table for the same period
  • Flag discrepancies: payments in processor report not in your DB (you missed a webhook), payments in your DB not in processor report (processor rejected silently), amount mismatches
  • Route flagged records to a manual review queue
  • Run daily, alert on reconciliation failure rate above threshold

Fraud Detection Hooks

Fraud checks run alongside (or just before) payment processing without blocking the happy path.

# Velocity check using Redis sliding window
def check_velocity(user_id, amount):
    key = f"velocity:{user_id}"
    now = time.time()
    window = 3600  # 1 hour

    pipe = redis.pipeline()
    pipe.zremrangebyscore(key, 0, now - window)  # remove old entries
    pipe.zadd(key, {str(now): now})
    pipe.zcard(key)
    pipe.expire(key, window)
    results = pipe.execute()

    tx_count = results[2]
    return tx_count > MAX_TRANSACTIONS_PER_HOUR

# Rule engine checks (synchronous for high-risk, async otherwise)
def fraud_check(payment):
    signals = []
    if payment.amount > HIGH_VALUE_THRESHOLD:
        signals.append('high_value')
    if is_new_country(payment.user_id, payment.ip_country):
        signals.append('new_country')
    if check_velocity(payment.user_id, payment.amount):
        signals.append('velocity_exceeded')

    risk_score = score_signals(signals)

    if risk_score >= BLOCK_THRESHOLD:
        return 'BLOCK'       # synchronous block
    elif risk_score >= REVIEW_THRESHOLD:
        enqueue_for_review(payment)  # async, do not block
        return 'ALLOW'
    return 'ALLOW'

Only block synchronously for the highest-risk signals (velocity exceeded, known fraud patterns). Everything else is async review. Blocking legitimate payments is expensive – false positive rate matters as much as catch rate.

Currency Handling

Never store monetary amounts as floats. Floating-point arithmetic is imprecise and introduces rounding errors that compound over millions of transactions.

  • Store all amounts as integers in the smallest currency unit: $10.99 = 1099 cents, JPY 1000 = 1000 yen (JPY has no subunit)
  • Include a currency code (ISO 4217) alongside every amount
  • Store the number of decimal places per currency in a reference table (USD = 2, JPY = 0, KWD = 3)
  • Only convert to decimal for display, never for storage or calculation
  • Use a Decimal/BigDecimal type if intermediate calculations require decimals, then convert back to integer minor units before storing


{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”How do you prevent double-charging in a payment system?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Use idempotency keys. The client generates a unique UUID for each payment attempt (e.g., UUID v4) and sends it with the request. The server stores the idempotency_key and the response in both Redis (TTL 24h, fast lookup) and the DB (permanent audit trail). On duplicate requests with the same key, return the cached response without processing the payment again. The client must retry with the same key – never generate a new key for a retry of the same payment intent. Idempotency keys are scoped to the user: user A and user B can have the same key without conflict. On the server, use INSERT … ON CONFLICT DO NOTHING for the idempotency record, then check if it was inserted or already existed.”}},{“@type”:”Question”,”name”:”What is the outbox pattern and why is it needed for payments?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”The dual-write problem: a server must update its DB and call the payment processor. If it writes to DB first and then crashes before calling the processor, the payment is not processed but the DB shows it is. If it calls the processor first and then crashes before writing to DB, the payment is processed but the DB has no record. The outbox pattern solves this: write a pending payment record and an outbox entry in a single DB transaction. A background worker reads the outbox, calls the payment processor, and marks the outbox entry as processed (with the processor response). Since the outbox is written atomically with the payment record, the worker can always retry crashed calls. The processor call must be idempotent (pass the payment ID as idempotency key to the processor API).”}},{“@type”:”Question”,”name”:”Why do payment systems use integer amounts instead of decimals?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Floating-point numbers cannot represent most decimal fractions exactly: 0.1 + 0.2 = 0.30000000000000004 in IEEE 754. In financial calculations, these errors accumulate and can result in rounding discrepancies – unacceptable in a system where every cent must balance. Store all monetary amounts as integers in the smallest currency unit: $10.99 = 1099 cents, 1.5 BTC = 150000000 satoshis. All arithmetic (addition, subtraction, comparisons) is then exact integer arithmetic. Apply currency-specific rounding rules only at display time and at the boundary with external systems. Never store money as FLOAT or DOUBLE in the database – use INTEGER or DECIMAL (which represents exact decimal fractions).”}},{“@type”:”Question”,”name”:”How does double-entry accounting work in a payment system?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Every financial transaction has exactly two sides: a debit (money leaving an account) and a credit (money entering an account). For a payment of $100 from user A to merchant B: debit user_A_balance by 100, credit merchant_B_balance by 100. The sum of all debits always equals the sum of all credits – this is the accounting invariant. Benefits: (1) The books always balance, making discrepancies detectable. (2) Full audit trail – every balance change has a corresponding matching entry. (3) Supports complex flows (escrow, refunds, fees) by adding intermediate accounts. Implementation: transactions table with (debit_account_id, credit_account_id, amount, reference). Never UPDATE balances directly – only INSERT new transactions and derive balances from transaction history.”}},{“@type”:”Question”,”name”:”How do you process payment processor webhooks reliably?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Webhooks arrive asynchronously and may be delivered out of order or more than once. Five steps: (1) Immediately return 200 OK to acknowledge receipt – do not process synchronously. (2) Write the raw webhook payload to a webhook_events table with the event_id and source. Use INSERT … ON CONFLICT DO NOTHING on event_id for idempotency. (3) Validate the HMAC signature (provider sends a signature over the payload using a shared secret). Reject if invalid. (4) Enqueue the event_id for async processing. (5) Worker processes: look up the event_id, check if already processed (status=done), update payment status, trigger downstream actions (send receipt email, release escrow). Mark event as processed. This handles duplicate deliveries, crashes during processing, and out-of-order events.”}}]}

Stripe system design interviews deeply cover payment architecture. See design patterns for Stripe interview: payment system design.

Coinbase system design covers transactional payment systems. See design patterns for Coinbase interview: payment and transaction system design.

Robinhood system design covers trade and payment processing. See design patterns for Robinhood interview: financial transaction system design.

Scroll to Top