Payment System Low-Level Design

Payment State Machine

A payment must move through well-defined states. Never allow arbitrary transitions or skipped states.

INITIATED -> PROCESSING -> COMPLETED
                        -> FAILED
                        -> CANCELLED

INITIATED:   payment record created, idempotency key stored
PROCESSING:  submitted to payment processor, awaiting confirmation
COMPLETED:   processor confirmed success, funds moved
FAILED:      processor declined or timeout after retries
CANCELLED:   user or system cancelled before processing completed

Each state transition is a separate database write with a timestamp. The payment_events audit table records every transition. You can reconstruct the full lifecycle of any payment from this log. Never update a payment record in place without appending an event.

Idempotency Keys

The client generates a UUID before initiating a payment and sends it in a header (e.g., Idempotency-Key: uuid). The server stores the key alongside the response.

# Server logic on payment request
def create_payment(request):
    key = request.headers['Idempotency-Key']

    # Check Redis first (hot path)
    cached = redis.get(f"idem:{key}")
    if cached:
        return deserialize(cached)  # return exact same response

    # Check DB (slower, handles Redis eviction)
    existing = db.query("SELECT * FROM payments WHERE idempotency_key = %s", key)
    if existing:
        response = build_response(existing)
        redis.setex(f"idem:{key}", 86400, serialize(response))
        return response

    # New payment - proceed
    payment = process_new_payment(request, key)
    response = build_response(payment)
    redis.setex(f"idem:{key}", 86400, serialize(response))
    return response

Idempotency keys prevent double-charging on network retries. TTL is 24 hours in Redis; also persisted in the DB permanently for audit purposes. The key must be unique per payment attempt, not per user or session.

Database Schema

CREATE TABLE payments (
    id               BIGINT PRIMARY KEY AUTO_INCREMENT,
    idempotency_key  VARCHAR(64) UNIQUE NOT NULL,
    user_id          BIGINT NOT NULL,
    amount           BIGINT NOT NULL,      -- in minor units (cents)
    currency         CHAR(3) NOT NULL,     -- ISO 4217
    status           VARCHAR(20) NOT NULL,
    processor_ref    VARCHAR(100),         -- payment processor transaction ID
    created_at       TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at       TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    INDEX idx_user_id (user_id),
    INDEX idx_status_created (status, created_at)
);

CREATE TABLE payment_events (
    id           BIGINT PRIMARY KEY AUTO_INCREMENT,
    payment_id   BIGINT NOT NULL REFERENCES payments(id),
    from_status  VARCHAR(20),
    to_status    VARCHAR(20) NOT NULL,
    occurred_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    metadata     JSON,
    INDEX idx_payment_id (payment_id)
);

Double-Entry Accounting

Every financial movement has exactly two entries: a debit (money leaves an account) and a credit (money enters an account). The sum of all debits always equals the sum of all credits – books never go out of balance.

CREATE TABLE accounts (
    id           BIGINT PRIMARY KEY,
    name         VARCHAR(100) NOT NULL,
    balance      BIGINT NOT NULL DEFAULT 0,  -- in cents
    account_type VARCHAR(20) NOT NULL         -- asset, liability, revenue
);

CREATE TABLE ledger_entries (
    id              BIGINT PRIMARY KEY AUTO_INCREMENT,
    payment_id      BIGINT NOT NULL,
    debit_account   BIGINT NOT NULL REFERENCES accounts(id),
    credit_account  BIGINT NOT NULL REFERENCES accounts(id),
    amount          BIGINT NOT NULL,
    created_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Example: user pays $10. Debit user’s wallet account (balance -1000 cents), credit merchant’s account (balance +1000 cents). To verify integrity: SELECT SUM(amount) FROM ledger_entries WHERE debit_account = X must equal account balance change.

Distributed Transaction Problem

The fundamental challenge: you need to call the payment processor API (external, over the network) AND update your database. What happens if your server crashes between these two steps?

  • Crash after calling processor but before DB update: processor charged the user, your DB shows no payment. User is charged, you have no record.
  • Crash after DB update but before calling processor: your DB shows completed payment, processor never ran it. User not charged, you think they were.

Two-phase commit across systems is impractical. The solution is the outbox pattern.

Outbox Pattern

Write to your DB and an outbox table in a single local transaction. A background worker then calls the external processor.

CREATE TABLE outbox (
    id           BIGINT PRIMARY KEY AUTO_INCREMENT,
    payment_id   BIGINT NOT NULL,
    payload      JSON NOT NULL,
    status       VARCHAR(20) DEFAULT 'PENDING',
    created_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    processed_at TIMESTAMP NULL
);

# Application code - single DB transaction
def initiate_payment(payment_data):
    with db.transaction():
        payment = db.insert("INSERT INTO payments ...", payment_data)
        db.insert("INSERT INTO outbox (payment_id, payload) VALUES (%s, %s)",
                  payment.id, serialize(payment_data))
    # Transaction commits - both rows exist or neither does

# Background worker - runs every few seconds
def process_outbox():
    rows = db.query("SELECT * FROM outbox WHERE status = 'PENDING' LIMIT 100")
    for row in rows:
        result = payment_processor.charge(row.payload)  # external call
        db.execute("UPDATE outbox SET status = 'PROCESSED', processed_at = NOW() WHERE id = %s", row.id)
        db.execute("UPDATE payments SET status = %s, processor_ref = %s WHERE id = %s",
                   result.status, result.ref, row.payment_id)

If the worker crashes after calling the processor but before marking the outbox entry processed, it will retry. The processor call must be idempotent (use the payment ID as the processor’s idempotency key). This guarantees at-least-once delivery with idempotent processing = effectively exactly-once.

Webhook Processing

Payment processors send asynchronous webhooks when payment status changes. Handle them defensively.

def handle_webhook(request):
    # 1. Validate HMAC signature
    sig = request.headers['X-Webhook-Signature']
    expected = hmac.new(WEBHOOK_SECRET, request.body, sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return 401

    event = parse(request.body)

    # 2. Idempotent processing using event ID
    if redis.exists(f"webhook:{event.id}"):
        return 200  # already processed

    # 3. Process
    update_payment_status(event.payment_ref, event.status)
    trigger_downstream_events(event)

    # 4. Mark processed
    redis.setex(f"webhook:{event.id}", 86400, "1")
    return 200

Always respond 200 quickly to webhooks. Do heavy processing asynchronously via a queue if needed. Processors will retry on non-200 responses – your handler must be idempotent.

Reconciliation

End-of-day reconciliation catches discrepancies between your system and the payment processor.

  • Payment processor provides a settlement report: list of all transactions, amounts, and statuses for the day
  • Compare against your payments table for the same period
  • Flag discrepancies: payments in processor report not in your DB (you missed a webhook), payments in your DB not in processor report (processor rejected silently), amount mismatches
  • Route flagged records to a manual review queue
  • Run daily, alert on reconciliation failure rate above threshold

Fraud Detection Hooks

Fraud checks run alongside (or just before) payment processing without blocking the happy path.

# Velocity check using Redis sliding window
def check_velocity(user_id, amount):
    key = f"velocity:{user_id}"
    now = time.time()
    window = 3600  # 1 hour

    pipe = redis.pipeline()
    pipe.zremrangebyscore(key, 0, now - window)  # remove old entries
    pipe.zadd(key, {str(now): now})
    pipe.zcard(key)
    pipe.expire(key, window)
    results = pipe.execute()

    tx_count = results[2]
    return tx_count > MAX_TRANSACTIONS_PER_HOUR

# Rule engine checks (synchronous for high-risk, async otherwise)
def fraud_check(payment):
    signals = []
    if payment.amount > HIGH_VALUE_THRESHOLD:
        signals.append('high_value')
    if is_new_country(payment.user_id, payment.ip_country):
        signals.append('new_country')
    if check_velocity(payment.user_id, payment.amount):
        signals.append('velocity_exceeded')

    risk_score = score_signals(signals)

    if risk_score >= BLOCK_THRESHOLD:
        return 'BLOCK'       # synchronous block
    elif risk_score >= REVIEW_THRESHOLD:
        enqueue_for_review(payment)  # async, do not block
        return 'ALLOW'
    return 'ALLOW'

Only block synchronously for the highest-risk signals (velocity exceeded, known fraud patterns). Everything else is async review. Blocking legitimate payments is expensive – false positive rate matters as much as catch rate.

Currency Handling

Never store monetary amounts as floats. Floating-point arithmetic is imprecise and introduces rounding errors that compound over millions of transactions.

  • Store all amounts as integers in the smallest currency unit: $10.99 = 1099 cents, JPY 1000 = 1000 yen (JPY has no subunit)
  • Include a currency code (ISO 4217) alongside every amount
  • Store the number of decimal places per currency in a reference table (USD = 2, JPY = 0, KWD = 3)
  • Only convert to decimal for display, never for storage or calculation
  • Use a Decimal/BigDecimal type if intermediate calculations require decimals, then convert back to integer minor units before storing

Stripe system design interviews deeply cover payment architecture. See design patterns for Stripe interview: payment system design.

Coinbase system design covers transactional payment systems. See design patterns for Coinbase interview: payment and transaction system design.

Robinhood system design covers trade and payment processing. See design patterns for Robinhood interview: financial transaction system design.

Scroll to Top