Checkout Service Low-Level Design
The checkout service orchestrates the final steps of a purchase: reserving inventory, creating an order record, and driving payment to completion. It must handle concurrent shoppers racing for limited stock, partial failures in downstream services, and user-visible latency requirements typically under two seconds.
Requirements
Functional
- Lock cart items against concurrent purchase for the duration of checkout.
- Create an order atomically once payment authorization succeeds.
- Orchestrate payment, inventory deduction, and fulfillment trigger as a saga.
- Compensate each completed step if a later step fails (cancel auth, release stock).
- Support guest and authenticated checkout paths.
Non-Functional
- Checkout initiation to order confirmation under 2 seconds at the 95th percentile.
- Zero oversell: stock reservation must be atomic.
- Idempotent retry on any step without double-charging or double-deducting inventory.
Data Model
- cart: id, user_id (nullable for guest), session_token, status (OPEN | LOCKED | CHECKED_OUT | EXPIRED), locked_at, expires_at.
- cart_item: id, cart_id, sku_id, quantity, unit_price_snapshot, currency.
- stock_reservation: id, cart_id, sku_id, quantity, reserved_at, expires_at, released_at.
- checkout_session: id, cart_id, idempotency_key (UNIQUE), saga_state (JSONB), status (PENDING | PAYMENT_PENDING | CONFIRMED | FAILED | COMPENSATING), created_at, updated_at.
- order: id, checkout_session_id, user_id, total_amount, currency, status, created_at.
- order_item: id, order_id, sku_id, quantity, unit_price, line_total.
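As a rough illustration of the checkout_session row, the sketch below declares the table with Python's built-in sqlite3 module standing in for Postgres. The TEXT column for saga_state (JSONB in Postgres), the CHECK constraint for status, and the helper names open_db and create_session are assumptions of this sketch, not part of the design above. It shows the UNIQUE constraint on idempotency_key rejecting a second session for the same key.

```python
import sqlite3

# SQLite stand-in for the Postgres table; column types are approximations.
SCHEMA = """
CREATE TABLE checkout_session (
    id              TEXT PRIMARY KEY,
    cart_id         TEXT NOT NULL,
    idempotency_key TEXT NOT NULL UNIQUE,
    saga_state      TEXT NOT NULL DEFAULT '{}',
    status          TEXT NOT NULL DEFAULT 'PENDING'
        CHECK (status IN ('PENDING', 'PAYMENT_PENDING', 'CONFIRMED',
                          'FAILED', 'COMPENSATING')),
    created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def create_session(conn: sqlite3.Connection, session_id: str,
                   cart_id: str, idempotency_key: str) -> bool:
    """Return True if a row was created, False if the key was already used."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO checkout_session (id, cart_id, idempotency_key) "
                "VALUES (?, ?, ?)",
                (session_id, cart_id, idempotency_key),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A caller that receives False would typically look up and return the existing session for that key rather than report an error, so a retried request sees the same result as the first one.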
Core Algorithms and Flows
Cart Locking
When the buyer initiates checkout, the service acquires a distributed lock (Redis SET NX PX) keyed on cart_id with a 10-minute TTL. It then reserves stock for each cart item by decrementing an atomic counter in Redis, backed by a Postgres write. If any SKU is out of stock, any reservations already taken for the cart are released, the lock is released immediately, and the buyer sees an out-of-stock error before any charge is attempted.
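The locking and reservation flow above might look roughly like the sketch below. To keep it runnable without a Redis server, it uses a hypothetical InMemoryKV class that imitates only the SET NX PX, GET, and DEL behaviour the flow needs; the key format, function names, and the plain dict used for stock counters are likewise assumptions. Against real Redis, the compare-then-delete in release_cart_lock and the check-then-decrement in reserve_stock would each need to run atomically (for example in a server-side script), which this single-process sketch does not model.

```python
import time
import uuid
from typing import Dict, Optional, Tuple

LOCK_TTL_MS = 10 * 60 * 1000  # 10-minute TTL, as in the text


class InMemoryKV:
    """Hypothetical single-process stand-in for Redis SET NX PX / GET / DEL."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] <= time.monotonic():
            del self._data[key]  # lazily drop expired keys
            return None
        return entry

    def set(self, key, value, nx=False, px=None):
        if nx and self._live(key) is not None:
            return None  # key already held: SET NX fails
        expiry = time.monotonic() + px / 1000 if px is not None else float("inf")
        self._data[key] = (value, expiry)
        return True

    def get(self, key):
        entry = self._live(key)
        return entry[0] if entry else None

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def acquire_cart_lock(kv, cart_id: str) -> Optional[str]:
    """Try to take the cart lock; return an owner token, or None if held."""
    token = uuid.uuid4().hex
    ok = kv.set(f"cart-lock:{cart_id}", token, nx=True, px=LOCK_TTL_MS)
    return token if ok else None


def release_cart_lock(kv, cart_id: str, token: str) -> bool:
    """Release the lock only if this caller still owns it."""
    key = f"cart-lock:{cart_id}"
    if kv.get(key) != token:
        return False
    kv.delete(key)
    return True


def reserve_stock(stock: Dict[str, int], items: Dict[str, int]) -> bool:
    """All-or-nothing reservation: decrement every SKU or none of them."""
    if any(stock.get(sku, 0) < qty for sku, qty in items.items()):
        return False
    for sku, qty in items.items():
        stock[sku] -= qty
    return True


def begin_checkout(kv, stock, cart_id, items) -> Tuple[str, Optional[str]]:
    """Lock the cart, then reserve stock; undo the lock if stock is short."""
    token = acquire_cart_lock(kv, cart_id)
    if token is None:
        return ("CART_LOCKED", None)
    if not reserve_stock(stock, items):
        release_cart_lock(kv, cart_id, token)
        return ("OUT_OF_STOCK", None)
    return ("RESERVED", token)
```

The random owner token is a common precaution: without it, a caller whose lock had already expired could delete a lock that now belongs to someone else.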
Checkout Saga
The saga is an orchestrator-based workflow stored in checkout_session.saga_state. Steps execute in order with each step writing its outcome before proceeding:
- Step 1 — Reserve Stock: decrement inventory; compensation releases the reservation.
- Step 2 — Authorize Payment: call payment gateway with checkout_session_id as idempotency key; compensation voids the auth.
- Step 3 — Create Order: insert order and order_item rows inside a single database transaction.
- Step 4 — Trigger Fulfillment: publish an ORDER_CONFIRMED event to the fulfillment topic; compensation publishes ORDER_CANCELLED.
A background sweeper detects stalled sessions older than 15 minutes and drives them to the COMPENSATING state, executing compensations in reverse order.
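The orchestration pattern described above can be sketched as follows. This is a minimal in-process illustration, not the service's actual orchestrator: the Step dataclass, the run_saga and compensate names, the "completed" and "error" keys in the state dict, and the save callback (which stands in for writing saga_state to Postgres) are all assumptions. For simplicity the sketch sets CONFIRMED after the last step, whereas the design marks the session CONFIRMED inside the order-creation transaction.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    action: Callable[[dict], None]
    compensate: Optional[Callable[[dict], None]] = None


def compensate(steps: List[Step], state: dict,
               save: Callable[[dict], None]) -> None:
    """Undo completed steps in reverse order, then mark the session FAILED."""
    by_name = {step.name: step for step in steps}
    for name in reversed(list(state.get("completed", []))):
        step = by_name[name]
        if step.compensate is not None:
            step.compensate(state)
        state["completed"].remove(name)
        save(state)  # persist progress so compensation can also be resumed
    state["status"] = "FAILED"
    save(state)


def run_saga(steps: List[Step], state: dict,
             save: Callable[[dict], None]) -> str:
    """Run steps in order, persisting after each; compensate on failure."""
    completed = state.setdefault("completed", [])
    try:
        for step in steps:
            if step.name in completed:
                continue  # finished on an earlier attempt; skip on resume
            step.action(state)
            completed.append(step.name)
            save(state)  # write the outcome before proceeding
    except Exception as exc:
        state["status"] = "COMPENSATING"
        state["error"] = str(exc)
        save(state)
        compensate(steps, state, save)
        return state["status"]
    state["status"] = "CONFIRMED"
    save(state)
    return state["status"]
```

Because progress lives in the state dict rather than in the orchestrator, a sweeper could load a stalled session's state and call compensate directly, and a restarted worker could call run_saga again to pick up at the first incomplete step.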
Atomic Order Creation
Order creation uses a Postgres transaction that also marks the checkout_session as CONFIRMED. Because payment authorization has already succeeded, the failure mode to plan for here is a crash of the database or the orchestrator mid-step. On restart, the saga re-reads saga_state, detects that Step 3 is incomplete, and retries the insert using the same order id (generated at saga start), which makes the insert idempotent via ON CONFLICT DO NOTHING.
API Design
- POST /v1/checkout/sessions: initiates checkout for a cart; returns session_id and payment_client_secret.
- POST /v1/checkout/sessions/{id}/confirm: buyer submits payment method; triggers saga execution.
- GET /v1/checkout/sessions/{id}: polls current saga status; used by the front-end to detect completion or failure.
- DELETE /v1/checkout/sessions/{id}: buyer abandons checkout; releases stock reservations and the cart lock.
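A client polling the GET endpoint might be structured roughly as below. The sketch takes a fetch_status callable in place of a real HTTP call, and the timeout, interval, and choice of CONFIRMED and FAILED as the only terminal statuses (so COMPENSATING keeps polling) are assumptions rather than documented API behaviour.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"CONFIRMED", "FAILED"}  # assumed terminal states


def wait_for_checkout(fetch_status: Callable[[], str],
                      timeout_s: float = 10.0,
                      interval_s: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the session reaches a terminal status or the timeout passes.

    fetch_status is expected to wrap GET /v1/checkout/sessions/{id} and
    return the session's status string.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"checkout still {status} after {timeout_s}s")
        sleep(interval_s)
```

Injecting sleep and clock keeps the helper testable without real delays; a production client would likely also cap total polls or switch to a push channel.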
Scalability Considerations
- Stock reservation counters in Redis provide sub-millisecond contention resolution; Postgres is the source of truth and is reconciled asynchronously.
- Saga orchestrator is stateless; any instance can resume a session by reading saga_state from Postgres, enabling horizontal scaling without sticky routing.
- Cart locks expire by TTL and are not held across payment network round trips: the lock covers only the reservation phase, which keeps the contention window small.
- Downstream services (payment, fulfillment) are called with timeouts and retried with exponential backoff; circuit breakers prevent checkout from amplifying load during outages.
- Read replicas serve cart reads and order confirmation pages, keeping writes on the primary small in volume.
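The retry behaviour mentioned for downstream calls could be sketched as below. This is one common variant (exponential backoff with full jitter); the function name, attempt count, and delay values are illustrative assumptions, and circuit breaking is left out of the sketch.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(fn: Callable[[], T],
                      attempts: int = 4,
                      base_delay_s: float = 0.1,
                      max_delay_s: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> T:
    """Call fn, retrying on any exception with exponentially growing delays.

    The delay before retry k is a random fraction of
    min(max_delay_s, base_delay_s * 2**k). The last failure is re-raised.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the saga
            cap = min(max_delay_s, base_delay_s * 2 ** attempt)
            sleep(cap * rng())  # full jitter spreads retries apart
    raise RuntimeError("attempts must be at least 1")
```

Retrying is only safe here because each downstream call carries an idempotency key, so a request that succeeded but timed out on the response is not applied twice.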