Design a Payment System

Design a payment system like Stripe, PayPal, or an internal payments platform at a company like Uber or Amazon. This problem is notable for what it demands beyond normal system design: exactly-once semantics, financial-grade consistency, regulatory compliance, and reconciliation. Get any of these wrong and you lose money — literally.

Requirements Clarification

  • Operations: Charge a customer, refund a payment, transfer funds between accounts, pay out to vendors/drivers.
  • Scale: 1M transactions/day (~12/sec avg, 100/sec peak). Each transaction: $0.01–$50,000.
  • Correctness guarantees: A payment must be processed exactly once. No double charges. No lost payments.
  • Consistency: Strong — financial data cannot be eventually consistent. A refund must see the original charge.
  • Compliance: PCI DSS (card data), SOX (financial reporting), regional regulations (PSD2 in EU, NACHA for ACH).
  • Third-party PSPs: Integrate with Stripe, Braintree, Adyen for card processing. Internal for bank transfers.
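As a quick back-of-the-envelope check on the scale figures above, the following sketch derives the average rate from the daily volume and compares it with the stated peak. The helper name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_tps(transactions_per_day: int) -> float:
    """Average transactions per second for a given daily volume."""
    return transactions_per_day / SECONDS_PER_DAY


avg_tps = average_tps(1_000_000)
peak_to_avg = 100 / avg_tps  # stated peak of 100/sec vs. the average

print(f"average: {avg_tps:.1f} tx/sec")           # average: 11.6 tx/sec
print(f"peak/average ratio: {peak_to_avg:.1f}x")  # peak/average ratio: 8.6x
```

The throughput is modest by web standards; the difficulty of this problem lies in correctness rather than raw scale.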

The Core Problem: Exactly-Once Semantics

Network calls can fail in ambiguous ways. You POST a charge to Stripe, the network dies, and you don’t know whether Stripe processed it. If you retry, you might double-charge the customer. If you don’t retry, you might drop the payment.

Idempotency keys solve this. Every payment request includes a unique idempotency key (UUID generated by the client). The payment service:

  1. Records the idempotency_key in a DB (marked as in progress) before processing, then stores the result next to it once processing finishes
  2. If a request with the same key arrives, return the stored result — don’t process again
  3. Key expires after 24 hours (or whatever your retry window is)

Stripe uses this pattern: its API accepts an Idempotency-Key header on POST requests. Your internal payment service must provide the same guarantee both for incoming requests and when calling external PSPs.
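The idempotency flow described above can be sketched as follows. This is a minimal illustration, not a production design: it uses an in-memory SQLite table, and the names charge_once, fake_psp_charge, and the idempotency table layout are assumptions made for the example.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idempotency ("
    "key TEXT PRIMARY KEY, status TEXT NOT NULL, result TEXT)"
)


def charge_once(key: str, amount_cents: int, do_charge) -> dict:
    """Run do_charge at most once per key and replay the stored result after."""
    try:
        # Claim the key first. The PRIMARY KEY constraint guarantees that
        # only one caller can insert a given key.
        with conn:
            conn.execute(
                "INSERT INTO idempotency (key, status) VALUES (?, 'IN_PROGRESS')",
                (key,),
            )
    except sqlite3.IntegrityError:
        status, result = conn.execute(
            "SELECT status, result FROM idempotency WHERE key = ?", (key,)
        ).fetchone()
        if status == "DONE":
            return json.loads(result)  # replay, do not charge again
        return {"status": "in_progress"}  # original request still running

    result = do_charge(amount_cents)
    with conn:
        conn.execute(
            "UPDATE idempotency SET status = 'DONE', result = ? WHERE key = ?",
            (json.dumps(result), key),
        )
    return result


# Usage: a hypothetical PSP call that records how often it was invoked.
calls = []


def fake_psp_charge(amount_cents: int) -> dict:
    calls.append(amount_cents)
    return {"status": "succeeded", "amount_cents": amount_cents}


key = str(uuid.uuid4())
first = charge_once(key, 10_000, fake_psp_charge)
retry = charge_once(key, 10_000, fake_psp_charge)
print(first == retry, len(calls))  # True 1
```

If the process crashes between claiming the key and storing the result, the key stays in progress; resolving that case is what the PSP status polling under Handling Failures is for. Key expiry is omitted here.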

Architecture

Client → API Gateway → Payment Service → PSP (Stripe/Adyen)
                              ↓
                    Payment DB (PostgreSQL)
                              ↓
                       Ledger Service
                              ↓
                    Ledger DB (append-only)
                              ↓
                    Wallet Service (account balances)

Double-Entry Bookkeeping

Financial systems almost universally use double-entry bookkeeping. Every transaction creates at least two entries: a debit on one account and a credit on another. The sum of all debits always equals the sum of all credits. If they ever diverge, you have a bug (or fraud).

-- Customer pays $100 for an order
INSERT INTO ledger (account, type, amount, txn_id) VALUES
  ('customer:123', 'debit',  100.00, 'txn_abc'),  -- customer balance decreases
  ('revenue',      'credit', 100.00, 'txn_abc');   -- revenue increases

-- Stripe takes 2.9% + $0.30 fee
INSERT INTO ledger (account, type, amount, txn_id) VALUES
  ('revenue',        'debit',  3.20, 'fee_abc'),
  ('stripe_fees',    'credit', 3.20, 'fee_abc');

The ledger is append-only. Never update or delete ledger entries — only append corrections (a reversal entry). This creates a complete audit trail required by financial regulations.
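A minimal in-memory sketch of these two rules (balanced postings and append-only corrections) might look like the following. The helpers post, reverse, and net_by_account are hypothetical names for this example; a real ledger would enforce the same invariants inside a database transaction.

```python
from collections import defaultdict
from decimal import Decimal

ledger = []  # append-only: entries are never updated or deleted


def post(txn_id, entries):
    """Append entries [(account, type, amount)] only if debits equal credits."""
    debits = sum(amt for _, kind, amt in entries if kind == "debit")
    credits = sum(amt for _, kind, amt in entries if kind == "credit")
    if debits != credits:
        raise ValueError(f"unbalanced transaction {txn_id}: {debits} != {credits}")
    for account, kind, amt in entries:
        ledger.append(
            {"txn_id": txn_id, "account": account, "type": kind, "amount": amt}
        )


def reverse(txn_id, reversal_id):
    """Correct a transaction by appending mirror-image entries."""
    flipped = [
        (e["account"], "credit" if e["type"] == "debit" else "debit", e["amount"])
        for e in ledger
        if e["txn_id"] == txn_id
    ]
    post(reversal_id, flipped)


def net_by_account():
    """Net position per account, counting credits as + and debits as -."""
    totals = defaultdict(Decimal)
    for e in ledger:
        sign = 1 if e["type"] == "credit" else -1
        totals[e["account"]] += sign * e["amount"]
    return dict(totals)


post("txn_abc", [("customer:123", "debit", Decimal("100.00")),
                 ("revenue", "credit", Decimal("100.00"))])
post("fee_abc", [("revenue", "debit", Decimal("3.20")),
                 ("stripe_fees", "credit", Decimal("3.20"))])

print(net_by_account()["revenue"])   # 96.80
print(sum(net_by_account().values()) == 0)  # True: the books balance
```

Calling reverse("fee_abc", "fee_abc_reversal") would append two more entries that cancel the fee while leaving the original entries in place for the audit trail.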

Payment Service State Machine

A payment goes through defined states:

PENDING → PROCESSING → COMPLETED → REFUNDED
                     → FAILED

State transitions are stored in the payments table with timestamps. Invalid transitions (e.g., FAILED → COMPLETED) are rejected at the application layer, and each transition is applied as a conditional update on the current status, so when two concurrent requests try to process the same payment only one of them succeeds.

-- Atomic state transition with optimistic locking
UPDATE payments
SET status = 'PROCESSING', updated_at = NOW()
WHERE payment_id = ? AND status = 'PENDING';
-- If rows_affected = 0, someone else grabbed it — bail out
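The same compare-and-set transition can be sketched end to end in Python. The example below uses an in-memory SQLite table in place of PostgreSQL; the ALLOWED table and the transition function are illustrative names, not part of any library.

```python
import sqlite3

# Legal transitions, mirroring the state diagram above.
ALLOWED = {
    "PENDING": {"PROCESSING"},
    "PROCESSING": {"COMPLETED", "FAILED"},
    "COMPLETED": {"REFUNDED"},
    "FAILED": set(),
    "REFUNDED": set(),
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (payment_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
)
conn.execute("INSERT INTO payments VALUES ('pay_1', 'PENDING')")
conn.commit()


def transition(payment_id: str, from_status: str, to_status: str) -> bool:
    """Move a payment between states; return False if another worker won."""
    if to_status not in ALLOWED.get(from_status, set()):
        raise ValueError(f"illegal transition {from_status} -> {to_status}")
    with conn:
        cur = conn.execute(
            "UPDATE payments SET status = ? WHERE payment_id = ? AND status = ?",
            (to_status, payment_id, from_status),
        )
    # rowcount is 0 when the row was no longer in from_status.
    return cur.rowcount == 1


won = transition("pay_1", "PENDING", "PROCESSING")
lost = transition("pay_1", "PENDING", "PROCESSING")
print(won, lost)  # True False
```

The second caller sees zero affected rows and backs off, which is the behavior the SQL comment above describes.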

PSP Integration

Never store raw card numbers — ever. PCI DSS scope becomes enormous and penalties are severe. Instead:

  1. Card data is entered directly in a PSP-hosted iframe or SDK (Stripe Elements). Raw card data never touches your servers.
  2. PSP returns a token representing the card. You store the token.
  3. For future charges, submit the token to the PSP — they look up the card details.

For bank transfers (ACH/SEPA), you store tokenized bank account details from Plaid or similar open banking APIs.

Reconciliation

Every night (or continuously), compare your internal ledger against PSP settlement reports:

  1. Download settlement file from Stripe/Adyen (CSV with all transactions and fees for the day)
  2. Match each PSP transaction against your ledger by PSP transaction ID
  3. Flag discrepancies: missing transactions, amount mismatches, duplicate entries
  4. Generate exception reports for the finance team

Reconciliation must be automated. At 1M transactions/day, manual reconciliation is impossible. The reconciliation service is typically a batch job running at end-of-day or incrementally every hour.
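The matching step can be sketched as below. The CSV layout (columns psp_transaction_id and amount) is an assumption for illustration; real settlement files from Stripe or Adyen have their own formats and many more columns.

```python
import csv
import io
from decimal import Decimal


def reconcile(internal: dict, settlement_csv: str) -> dict:
    """Compare internal records {psp_transaction_id: amount} with a PSP file."""
    report = {
        "missing_internal": [],  # PSP has it, we do not
        "missing_psp": [],       # we have it, PSP does not
        "amount_mismatch": [],   # (id, our amount, PSP amount)
        "duplicates": [],        # id appears more than once in the PSP file
    }
    seen = set()
    for row in csv.DictReader(io.StringIO(settlement_csv)):
        psp_id = row["psp_transaction_id"]
        amount = Decimal(row["amount"])
        if psp_id in seen:
            report["duplicates"].append(psp_id)
            continue
        seen.add(psp_id)
        if psp_id not in internal:
            report["missing_internal"].append(psp_id)
        elif internal[psp_id] != amount:
            report["amount_mismatch"].append((psp_id, internal[psp_id], amount))
    report["missing_psp"] = sorted(set(internal) - seen)
    return report


# Hypothetical sample data.
SETTLEMENT = """psp_transaction_id,amount
ch_1,100.00
ch_2,25.50
ch_2,25.50
ch_4,10.00
"""
INTERNAL = {
    "ch_1": Decimal("100.00"),
    "ch_2": Decimal("25.00"),
    "ch_3": Decimal("7.00"),
}

report = reconcile(INTERNAL, SETTLEMENT)
for category, items in report.items():
    print(category, items)
```

Each non-empty category in the report corresponds to a kind of exception the finance team would review.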

Database Design

payments:
  payment_id       UUID    PK
  idempotency_key  VARCHAR UNIQUE
  user_id          UUID
  amount           DECIMAL(18,2)   -- never use FLOAT for money
  currency         CHAR(3)         -- ISO 4217: USD, EUR
  status           ENUM('PENDING','PROCESSING','COMPLETED','FAILED','REFUNDED')
  psp_transaction_id VARCHAR        -- Stripe charge ID
  created_at       TIMESTAMP
  updated_at       TIMESTAMP

ledger:
  entry_id         UUID    PK
  txn_id           UUID             -- groups debit+credit entries
  account          VARCHAR          -- 'customer:123', 'revenue', 'stripe_fees'
  type             ENUM('debit','credit')
  amount           DECIMAL(18,2)
  created_at       TIMESTAMP        -- immutable, no updated_at

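Never use FLOAT for money. Binary floating point cannot represent most decimal fractions exactly (0.1 + 0.2 does not equal 0.3), so rounding errors creep into sums and comparisons. Use DECIMAL(18,2) in SQL or store amounts in the smallest currency unit (cents as integers) and convert for display. Not every currency has two decimal places (JPY has none), so the number of minor-unit digits should follow ISO 4217.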
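A short demonstration of the difference, plus the integer-cents approach applied to the 2.9% + $0.30 fee from the ledger example. The helper names and the round-half-up rule are assumptions for this sketch; the actual rounding rule is set by the PSP.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent 0.1 or 0.2 exactly.
print(0.1 + 0.2 == 0.3)                                      # False
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True


def to_cents(amount: str) -> int:
    """Parse a decimal string such as '19.99' into integer cents."""
    cents = (Decimal(amount) * 100).to_integral_value(rounding=ROUND_HALF_UP)
    return int(cents)


def fee_cents(amount_cents: int) -> int:
    """2.9% + 30 cents, rounded half up to a whole cent (assumed rule)."""
    fee = Decimal(amount_cents) * Decimal("0.029") + Decimal(30)
    return int(fee.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def format_cents(cents: int) -> str:
    """Render non-negative integer cents for display."""
    return f"{cents // 100}.{cents % 100:02d}"


charge = to_cents("100.00")
print(charge)                           # 10000
print(format_cents(fee_cents(charge)))  # 3.20
```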

Handling Failures

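PSP timeout: You called Stripe and got no response, so the payment stays in PROCESSING. Do not assume it failed. Mark it for asynchronous reconciliation: poll Stripe’s API for the charge status (or retry with the same idempotency key) and update your payment record once the outcome is confirmed.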
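One way to sketch the polling loop is shown below. The fetch_status callable stands in for a PSP status lookup and is hypothetical, as are the retry limits; the sleep function is injectable so the example runs instantly.

```python
import time


def resolve_processing_payment(fetch_status, max_attempts=5, base_delay=0.5,
                               sleep=time.sleep):
    """Poll until the PSP reports a terminal state, backing off exponentially.

    Returns 'COMPLETED' or 'FAILED', or 'PROCESSING' if the outcome is still
    unknown after max_attempts (leave it for the next reconciliation run).
    """
    for attempt in range(max_attempts):
        try:
            status = fetch_status()
        except TimeoutError:
            status = None  # still ambiguous, keep polling
        if status in ("COMPLETED", "FAILED"):
            return status
        sleep(base_delay * 2 ** attempt)
    return "PROCESSING"


# Usage with a fake PSP: one timeout, one pending answer, then success.
responses = iter([TimeoutError(), "PROCESSING", "COMPLETED"])


def fake_fetch():
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item


delays = []
final = resolve_processing_payment(fake_fetch, sleep=delays.append)
print(final, delays)  # COMPLETED [0.5, 1.0]
```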

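Partial refund: Create a new payment record with type=REFUND that references the original payment_id (this needs a type column and a reference to the parent payment, neither of which appears in the schema above). Post reversal entries to the ledger for the refunded amount. Never modify the original charge record.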

Distributed transaction across services: Charging a user and updating their order status must both succeed or both fail. Use the Saga pattern — a sequence of local transactions with compensating transactions for rollback. If order update fails after a successful charge, run a compensating transaction (refund).
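A minimal orchestration-style sketch of the Saga idea follows. The run_saga helper and the step functions are hypothetical; a real implementation would also persist saga progress so that compensation survives a crash.

```python
def run_saga(steps):
    """Run (name, action, compensate) steps in order.

    If an action raises, run the compensations of the steps that already
    succeeded, in reverse order, and report which step failed.
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            for _, undo in reversed(completed):
                undo()
            return {"ok": False, "failed_step": name, "error": str(exc)}
        completed.append((name, compensate))
    return {"ok": True}


# Usage: the charge succeeds, the order update fails, so the charge is refunded.
log = []


def charge_customer():
    log.append("charged")


def refund_customer():
    log.append("refunded")


def update_order():
    raise RuntimeError("order service unavailable")


def revert_order():
    log.append("order reverted")


outcome = run_saga([
    ("charge", charge_customer, refund_customer),
    ("update_order", update_order, revert_order),
])
print(outcome["ok"], outcome["failed_step"], log)
# False update_order ['charged', 'refunded']
```

Only steps that completed are compensated, so the failed order update itself is not reverted.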

Compliance and Security

  • PCI DSS: Never store CVV (even encrypted). Cardholder data only via tokenization.
  • Fraud detection: ML model runs on every payment: device fingerprint, IP geolocation, transaction velocity, amount anomaly. Suspicious transactions go to manual review queue.
  • Rate limiting: Per-user transaction rate limits help block card-testing attacks, where an attacker enumerates stolen card numbers with many small charges.
  • Audit logging: Every state transition, admin action, and API call is logged with actor + timestamp. Required for SOX compliance.
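The per-user rate limit in the list above could be sketched as a sliding-window counter. The class name and the limits used are assumptions for illustration; a production limiter would typically live in a shared store rather than in process memory.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_events per user within any window_seconds span."""

    def __init__(self, max_events: int, window_seconds: float):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # user_id -> recent timestamps

    def allow(self, user_id: str, now: float) -> bool:
        recent = self._events[user_id]
        # Drop timestamps that have fallen out of the window.
        while recent and recent[0] <= now - self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_events:
            return False
        recent.append(now)
        return True


# Usage: at most 3 payment attempts per user per 60 seconds.
limiter = SlidingWindowLimiter(max_events=3, window_seconds=60)
decisions = [limiter.allow("user_1", t) for t in (0, 1, 2, 3)]
print(decisions)                    # [True, True, True, False]
print(limiter.allow("user_1", 61))  # True: the earliest attempts have expired
```

The timestamp is passed in explicitly to keep the example deterministic; in practice it would come from a monotonic clock.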

Interview Follow-ups

  • How do you handle currency conversion for international payments?
  • Design the payout system for a marketplace like Uber or Etsy — paying sellers daily.
  • How do you detect and prevent payment fraud without adding friction for legitimate users?
  • Your reconciliation finds a $5,000 discrepancy. Walk me through the investigation process.
  • How do you handle strong customer authentication (SCA) requirements under PSD2?

Related System Design Topics

  • SQL vs NoSQL — payment ledgers require ACID transactions; why relational DBs are non-negotiable here
  • Message Queues — async processing of payment events, Saga pattern for distributed transactions
  • API Design — idempotency key patterns, REST vs webhook design for PSP callbacks
  • Database Sharding — sharding the payments table by user_id while keeping ledger globally consistent
  • CAP Theorem — why payment systems choose CP (consistency + partition tolerance) over availability

See also: Design a Hotel / Airbnb Reservation System — the PENDING→CONFIRMED hold pattern with payment integration, and preventing double-charge via idempotency keys.

See also: Design a Stock Trading Platform — trade settlement and portfolio updates apply the same idempotency and double-entry patterns used in payment systems.

See also: ML System Design: Build a Fraud Detection System — the ML layer within payment authorization; three-tier decision architecture with under 100ms latency budget.
