Amazon reportedly processes 100K+ orders per minute on Prime Day. Designing an e-commerce platform covers product catalog management, inventory at scale, transactional checkout, search and discovery, and handling flash sales without overselling. This question appears at Amazon, Shopify, eBay, and any company with marketplace or e-commerce infrastructure.
Requirements
Functional: Browse and search products. View product details (title, price, images, reviews). Add items to cart. Place orders with payment. Inventory decrements on purchase. Order tracking. Flash sales (limited inventory, high traffic spikes).
Non-functional: Product search returns results in <200ms. Checkout completes in <2 seconds. No overselling — inventory must be consistent. Scale: 1M concurrent users, 100K orders/minute on peak. Flash sale: 1M requests/minute for a product with 1000 units.
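The peak figures above are easier to reason about per second. The following is a back-of-envelope sketch that uses only the numbers stated in the requirements; the variable names are illustrative.

```python
# Back-of-envelope rates derived from the stated requirements.
PEAK_ORDERS_PER_MIN = 100_000
FLASH_REQUESTS_PER_MIN = 1_000_000
FLASH_UNITS = 1_000

peak_orders_per_sec = PEAK_ORDERS_PER_MIN / 60              # about 1,667
flash_requests_per_sec = FLASH_REQUESTS_PER_MIN / 60        # about 16,667
flash_success_ratio = FLASH_UNITS / FLASH_REQUESTS_PER_MIN  # 0.001

print(f"{peak_orders_per_sec:,.0f} orders/s at peak")
print(f"{flash_requests_per_sec:,.0f} requests/s during the flash sale")
print(f"{flash_success_ratio:.1%} of one minute's flash-sale requests can succeed")
```

At most one request in a thousand can end in a purchase during the flash sale, so the design goal is to reject the other 99.9% as cheaply as possible.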
Product Catalog
The catalog stores millions of SKUs. Data model: Product (product_id, title, description, brand, category_path, images[], attributes{}, created_at). Price and stock count live in a separate inventory record, so prices and stock can change without editing the product. Storage: the primary store is PostgreSQL (relational, consistent), with Elasticsearch for full-text search, kept in sync from Postgres via Debezium CDC. Images are stored in S3 and served via CDN.
Category hierarchy: electronics → phones → smartphones → iPhones. Stored as a materialized path or nested set in PostgreSQL for efficient subtree queries. Attributes vary by category (phones have RAM/storage, clothing has size/color) — stored as a JSONB column in PostgreSQL.
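A minimal in-memory sketch of the materialized-path idea: each product carries its full category path, and a subtree query becomes a prefix match (the equivalent of a LIKE 'electronics/phones/%' filter in SQL). The "/" separator, the sample rows, and the function names are assumptions for illustration; the attributes dict stands in for the JSONB column.

```python
def in_subtree(path: str, root: str) -> bool:
    """True if `path` is `root` itself or any descendant of it."""
    # Appending the separator avoids matching siblings such as "phonesX".
    return path == root or path.startswith(root + "/")

# Hypothetical sample rows; attributes vary by category, like a JSONB column.
products = [
    {"id": 1, "category_path": "electronics/phones/smartphones/iphones",
     "attributes": {"ram_gb": 8, "storage_gb": 256}},
    {"id": 2, "category_path": "electronics/phones/feature-phones",
     "attributes": {"battery_days": 7}},
    {"id": 3, "category_path": "electronics/laptops",
     "attributes": {"ram_gb": 16}},
    {"id": 4, "category_path": "clothing/shirts",
     "attributes": {"size": "M", "color": "blue"}},
]

def products_under(root: str) -> list[int]:
    """Return the ids of all products in the category subtree rooted at `root`."""
    return [p["id"] for p in products if in_subtree(p["category_path"], root)]

print(products_under("electronics/phones"))
```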
Inventory Management
Inventory is the most critical consistency requirement: selling more units than in stock is a business failure. Two inventory counts per SKU:
- total_stock: actual units in warehouse
- available_stock: total_stock minus reserved (items in active carts or pending orders)
Reservation model: when a user adds to cart, reserve 1 unit for 15 minutes (soft hold). If checkout completes, convert to a committed order. If the reservation expires (user abandoned cart), release back to available. This prevents overselling without permanently decrementing until payment is confirmed.
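-- Atomic reservation with an optimistic lock.
-- Assumes available_stock is a stored column kept in step with reserved.
UPDATE inventory
SET reserved = reserved + 1,
    available_stock = available_stock - 1,
    version = version + 1
WHERE product_id = :pid
  AND available_stock > 0
  AND version = :expected_version;
-- Affects 0 rows if out of stock or on a version mismatch (concurrent update); the caller re-reads and retries.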
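The reservation lifecycle (hold, expire, commit) can be sketched in memory as follows. This is an illustration under stated assumptions, not a production design: the class and method names are hypothetical, time is passed in explicitly to keep the example deterministic, and a real system would keep this state in the database or Redis.

```python
from dataclasses import dataclass, field

@dataclass
class InventoryItem:
    total_stock: int
    # reservation_id -> expiry timestamp in seconds (hypothetical structure)
    holds: dict = field(default_factory=dict)

    def _expire(self, now: float) -> None:
        """Drop holds whose expiry has passed, returning units to available."""
        self.holds = {rid: exp for rid, exp in self.holds.items() if exp > now}

    def available(self, now: float) -> int:
        """available_stock = total_stock minus active holds."""
        self._expire(now)
        return self.total_stock - len(self.holds)

    def reserve(self, reservation_id: str, now: float, ttl: float = 900.0) -> bool:
        """Soft-hold one unit for `ttl` seconds (15 minutes by default)."""
        if self.available(now) <= 0:
            return False
        self.holds[reservation_id] = now + ttl
        return True

    def commit(self, reservation_id: str, now: float) -> bool:
        """Convert a live hold into a permanent decrement after payment."""
        self._expire(now)
        if reservation_id not in self.holds:
            return False  # the hold expired or never existed
        del self.holds[reservation_id]
        self.total_stock -= 1
        return True

item = InventoryItem(total_stock=1)
print(item.reserve("cart-a", now=0))    # first shopper gets the hold
print(item.reserve("cart-b", now=10))   # second shopper is refused
print(item.reserve("cart-b", now=901))  # the first hold has expired by now
```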
Shopping Cart
Cart is user-specific and ephemeral. Two approaches: (1) Database cart — store cart_items in PostgreSQL, reliable and consistent. (2) Redis cart — cart_id → hash of {product_id: quantity}, sub-millisecond reads. Use Redis for active carts (checkout in progress), periodic flush to PostgreSQL for persistence. Cart TTL: 30 days. Anonymous cart: create a cart session cookie; merge with user cart on login.
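Merging an anonymous cart into a user cart on login is a small dictionary operation. The sketch below assumes carts are plain {product_id: quantity} mappings, mirroring the Redis hash layout; the function name and the "sum"/"max" strategy flag are illustrative, and which rule applies is a business decision.

```python
def merge_carts(anonymous: dict[str, int], user: dict[str, int],
                strategy: str = "sum") -> dict[str, int]:
    """Merge an anonymous cart into a user cart without mutating either input."""
    if strategy not in ("sum", "max"):
        raise ValueError(f"unknown merge strategy: {strategy}")
    merged = dict(user)
    for product_id, qty in anonymous.items():
        if product_id not in merged:
            merged[product_id] = qty
        elif strategy == "sum":
            merged[product_id] += qty
        else:  # "max": keep the larger of the two quantities
            merged[product_id] = max(merged[product_id], qty)
    return merged

anonymous_cart = {"sku-1": 2, "sku-2": 1}
user_cart = {"sku-1": 1, "sku-3": 4}
print(merge_carts(anonymous_cart, user_cart))
print(merge_carts(anonymous_cart, user_cart, strategy="max"))
```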
Checkout and Payment
Checkout is a multi-step transaction:
- Validate cart (all reserved items still available)
- Calculate total (price × quantity + tax + shipping)
- Process payment (charge credit card via Stripe/PayPal API)
- Decrement inventory (permanent, not reservation)
- Create order record and dispatch fulfillment event
These steps must succeed or fail as a unit: partial success (payment charged but no order created) is a disaster. Because the steps span separate services, a single database transaction cannot cover them, so use the Saga pattern: each step has a compensating action that undoes it. If inventory decrement fails after payment succeeds, issue a refund. Idempotency key: generate a unique checkout_id before starting and pass it to the payment processor. If the request is retried with the same checkout_id, the processor returns the same charge_id instead of charging again.
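import uuid

class PaymentError(Exception):
    """Raised when the payment processor declines or fails."""

class OrderError(Exception):
    """Raised when the order record cannot be created."""

class CheckoutSaga:
    def __init__(self, inventory_service, payment_service, order_service):
        # Services are injected so each step can be swapped or faked in tests.
        self.inventory = inventory_service
        self.payments = payment_service
        self.orders = order_service

    def execute(self, cart_id, payment_method, checkout_id=None):
        # Reuse the caller's checkout_id on retries so the charge stays idempotent.
        checkout_id = checkout_id or str(uuid.uuid4())
        # Step 1: reserve inventory
        self.inventory.reserve(cart_id)
        try:
            # Step 2: charge payment (idempotent with checkout_id)
            charge = self.payments.charge(checkout_id, payment_method)
        except PaymentError:
            # Nothing was charged, so only the reservation needs undoing.
            self.inventory.release(cart_id)
            raise
        try:
            # Step 3: create order
            order = self.orders.create(cart_id, charge.id)
            # Step 4: commit inventory
            self.inventory.commit(cart_id)
            return order
        except Exception:
            # Order creation or inventory commit failed after payment: refund, then release.
            self.payments.refund(charge.id)
            self.inventory.release(cart_id)
            raise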
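The idempotency-key behaviour on the payment side can be illustrated with an in-memory fake. This is a sketch only: FakePaymentService, Charge, and the amount_cents parameter are hypothetical names, and real processors such as Stripe implement idempotency on their own servers with their own semantics.

```python
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class Charge:
    id: str
    amount_cents: int

class FakePaymentService:
    """In-memory stand-in for a payment processor (illustration only)."""

    def __init__(self):
        self._by_key = {}  # idempotency key -> Charge already created

    def charge(self, idempotency_key: str, amount_cents: int) -> Charge:
        # A retry with the same key returns the original charge, not a new one.
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            return existing
        charge = Charge(id=f"ch_{uuid.uuid4().hex[:12]}", amount_cents=amount_cents)
        self._by_key[idempotency_key] = charge
        return charge

payments = FakePaymentService()
checkout_id = str(uuid.uuid4())
first = payments.charge(checkout_id, 4999)
retry = payments.charge(checkout_id, 4999)  # e.g. the client timed out and retried
print(first.id == retry.id)
```

The key point is that the checkout_id is generated once per checkout attempt and reused on every retry; generating a fresh key per request would defeat the protection.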
Product Search
Elasticsearch powers product search. Documents are indexed with: title (analyzed, full-text), brand, category_path, price, average_rating, review_count, availability. Query flow: user searches “bluetooth headphones under $100” → query parser extracts filters (price < 100, likely category=headphones) → Elasticsearch query with full-text match on title + range filter on price → results re-ranked by a personalization model (click-through rate, purchase rate, personal history).
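The query-parsing step can be sketched as a function that pulls a price cap out of the raw text and builds an Elasticsearch bool query with a full-text match plus a range filter. The regex and function name are assumptions for illustration, only the "under $N" pattern is handled, and category inference and re-ranking are left out.

```python
import re

# Matches phrases such as "under $100" or "under 49.99" (illustrative pattern).
PRICE_CAP = re.compile(r"\bunder \$?(\d+(?:\.\d+)?)\b", re.IGNORECASE)

def build_search_query(raw: str) -> dict:
    """Turn a raw search string into an Elasticsearch query DSL body."""
    filters = []
    text = raw
    match = PRICE_CAP.search(raw)
    if match:
        filters.append({"range": {"price": {"lt": float(match.group(1))}}})
        # Remove the price phrase so it does not pollute the full-text match.
        text = raw[:match.start()] + " " + raw[match.end():]
    text = " ".join(text.split())  # normalise whitespace
    return {
        "query": {
            "bool": {
                "must": [{"match": {"title": text}}],
                "filter": filters,
            }
        }
    }

print(build_search_query("bluetooth headphones under $100"))
```

Putting the price condition in the filter clause rather than in must keeps it out of relevance scoring and lets Elasticsearch cache it.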
Autocomplete: a separate index for suggestions, built on Elasticsearch's search_as_you_type field type or a custom analyzer with an edge n-gram tokenizer. It returns the top 10 suggestions in <50ms as the user types.
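To show what an edge n-gram tokenizer produces, the sketch below indexes every word prefix of each title and answers lookups with a single dictionary read. It is a pure-Python illustration of the idea, not Elasticsearch's implementation; the function names and the min_gram/max_gram defaults are assumptions.

```python
from collections import defaultdict

def edge_ngrams(term: str, min_gram: int = 2, max_gram: int = 10) -> list[str]:
    """Return the prefixes of `term` from min_gram to max_gram characters."""
    term = term.lower()
    return [term[:n] for n in range(min_gram, min(max_gram, len(term)) + 1)]

def build_index(titles: list[str]) -> dict:
    """Map every word prefix to the set of titles containing that word."""
    index = defaultdict(set)
    for title in titles:
        for word in title.split():
            for gram in edge_ngrams(word):
                index[gram].add(title)
    return index

def suggest(index: dict, typed: str, limit: int = 10) -> list[str]:
    """Look up the typed prefix; all the work was done at index time."""
    return sorted(index.get(typed.lower(), set()))[:limit]

index = build_index(["Bluetooth Headphones", "Blue Jeans", "Headphone Stand"])
print(edge_ngrams("phone"))
print(suggest(index, "blu"))
```

The trade-off is a larger index in exchange for lookups that need no prefix scan at query time.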
Flash Sale Design
Flash sale: 1000 units of an iPhone, 1M users attempting to buy simultaneously at noon. This is a thundering herd. Naive approach: 1M concurrent requests hit the inventory database → database falls over → everyone gets an error.
Correct approach:
- Queue-based virtual waiting room: at noon, accept all requests into a Redis queue. Issue users a position token. Drain the queue at a controlled rate (e.g., 10K checkouts/minute). Users see “You are #247 in line” instead of an error.
- Pre-loaded Redis inventory: move the 1000-unit count to Redis (DECR is atomic). Each checkout attempt does DECR and checks the result: if it is < 0, the item is sold out and the request is rejected. As a rough ballpark, Redis handles around 100K DECR/second versus around 10K/second for PostgreSQL.
- CDN caching of the “sold out” state: once sold out, cache the sold-out response at CDN for 10 seconds. Eliminates origin load from users checking availability.
- Async order processing: successful DECR gets a success token; order creation happens asynchronously via a task queue. User sees “Order confirmed” immediately; fulfillment follows.
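The decrement-and-check logic from the list above can be exercised without a Redis server. The sketch below uses a lock-protected counter as an in-process stand-in for a Redis integer key (in Redis itself, DECR and INCR are atomic without any client-side lock) and races many buyer threads against a small stock. All class and function names are hypothetical.

```python
import threading

class AtomicCounter:
    """In-process stand-in for a Redis integer key with DECR/INCR semantics."""

    def __init__(self, value: int):
        self._value = value
        self._lock = threading.Lock()

    def decr(self) -> int:
        with self._lock:
            self._value -= 1
            return self._value

    def incr(self) -> int:
        with self._lock:
            self._value += 1
            return self._value

def try_purchase(stock: AtomicCounter) -> bool:
    """Claim one unit; a negative result means the item was already sold out."""
    if stock.decr() < 0:
        stock.incr()  # restore so the counter does not drift further negative
        return False
    return True

def run_flash_sale(units: int, buyers: int) -> int:
    """Race `buyers` threads for `units` items and return how many succeeded."""
    stock = AtomicCounter(units)
    outcomes = []
    outcomes_lock = threading.Lock()

    def buyer():
        ok = try_purchase(stock)
        with outcomes_lock:
            outcomes.append(ok)

    threads = [threading.Thread(target=buyer) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(outcomes)

print(run_flash_sale(units=100, buyers=500))
```

However the threads interleave, the number of successful purchases never exceeds the stock, because each success corresponds to a distinct non-negative value returned by the atomic decrement.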
Recommendations
“Customers who bought X also bought Y” (collaborative filtering): computed nightly from purchase history. A co-occurrence matrix of (product_A, product_B) pairs is built from orders containing both items. Top-K co-purchased products are stored per product in Redis for instant retrieval. “Frequently bought together” bundles are pre-computed and stored similarly.
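A minimal sketch of the co-occurrence computation: count how often each pair of products appears in the same order, then keep the top K partners per product. The function name and the sample orders are illustrative; a nightly job would run the same logic over the full order history and write each list to Redis.

```python
from collections import Counter, defaultdict
from itertools import combinations

def top_k_copurchases(orders: list[list[str]], k: int = 3) -> dict[str, list[str]]:
    """For each product, return up to k products most often bought with it."""
    pair_counts = Counter()
    for items in orders:
        # Deduplicate and sort so each unordered pair is counted once per order.
        for a, b in combinations(sorted(set(items)), 2):
            pair_counts[(a, b)] += 1

    neighbours = defaultdict(list)
    for (a, b), count in pair_counts.items():
        neighbours[a].append((count, b))
        neighbours[b].append((count, a))

    # Highest count first; ties are broken alphabetically for stable output.
    return {
        product: [other for _, other in
                  sorted(pairs, key=lambda p: (-p[0], p[1]))[:k]]
        for product, pairs in neighbours.items()
    }

orders = [
    ["phone", "case"],
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["case", "screen"],
]
print(top_k_copurchases(orders, k=2))
```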
Frequently Asked Questions
How do you prevent overselling in a flash sale?
Preventing overselling during a flash sale requires atomic inventory operations and traffic shaping. The core technique: move the inventory counter to Redis and use DECR (atomic decrement). When a user attempts to purchase, execute DECR on the inventory key. If the result is >= 0, the purchase is valid. If the result goes negative (race with another buyer), issue INCR to restore and reject the purchase. Redis handles 100,000+ atomic DECR operations per second vs 10,000 for PostgreSQL. To prevent the thundering herd from overwhelming backend services, add a virtual waiting room: at the flash sale start time, accept all requests into a queue and issue position tokens. Drain the queue at a controlled rate (e.g., 10,000 checkouts per minute). Users see their queue position instead of errors. Once inventory reaches zero, serve the "sold out" response from CDN cache for 10-30 seconds to absorb the remaining request flood without hitting origin services.
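The virtual waiting room described in this answer can be sketched as a FIFO queue that hands out monotonically increasing tickets and admits a fixed number of users per tick. This is an in-memory illustration with hypothetical names; a deployment would typically back the queue with Redis and tie the tick to a timer.

```python
from collections import deque

class WaitingRoom:
    def __init__(self, admit_per_tick: int):
        self._queue = deque()
        self._admit_per_tick = admit_per_tick
        self._next_ticket = 1  # tickets increase by one per arrival
        self._served = 0       # how many users have been admitted so far

    def join(self, user_id: str) -> int:
        """Enqueue a user and return their position token."""
        ticket = self._next_ticket
        self._next_ticket += 1
        self._queue.append((ticket, user_id))
        return ticket

    def position(self, ticket: int) -> int:
        """Current place in line for a ticket; 0 means already admitted."""
        return max(0, ticket - self._served)

    def drain(self) -> list[str]:
        """Admit up to admit_per_tick users, in arrival order."""
        admitted = []
        for _ in range(min(self._admit_per_tick, len(self._queue))):
            _, user_id = self._queue.popleft()
            admitted.append(user_id)
        self._served += len(admitted)
        return admitted

room = WaitingRoom(admit_per_tick=2)
tickets = [room.join(f"user-{i}") for i in range(5)]
print(room.position(tickets[-1]))  # place in line before any drain
print(room.drain())
print(room.position(tickets[-1]))  # place in line after one drain
```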
How does the shopping cart work in a large e-commerce system?
Shopping carts require fast reads and writes (every page load may read the cart) combined with eventual consistency with the inventory system. Two-tier storage works well: active carts live in Redis as a hash (cart_id -> {product_id: quantity}), giving sub-millisecond reads. Periodically (every 30 seconds or on significant change), flush to PostgreSQL for durability. Cart TTL is typically 30 days. Anonymous carts use a session cookie as the cart ID; on login, merge the anonymous cart with the user cart (if a product exists in both, take the max quantity or add them, depending on business rules). Inventory reservation: when a cart is created or items added, optionally soft-reserve inventory for 15 minutes (configurable). This prevents showing "add to cart" for items that others are actively checking out, reducing customer frustration from the "out of stock at checkout" experience.
What is the Saga pattern and how does it apply to e-commerce checkout?
The Saga pattern manages distributed transactions across multiple services that each own their own database. In e-commerce checkout, a single purchase requires: validating cart items (inventory service), charging payment (payment service), creating the order (order service), and triggering fulfillment (warehouse service). These span four services with four databases — a traditional two-phase commit (2PC) would require distributed locking across all four, causing high latency and availability risk. The Saga pattern instead executes each step sequentially, and defines a compensating transaction for each step to undo it on failure. If payment succeeds but order creation fails, issue a payment refund. If order creation succeeds but fulfillment fails, mark the order as "fulfillment error" and alert operations. Each step uses an idempotency key (the checkout_id) so retries do not double-charge or double-create. The orchestration variant (a central saga orchestrator) is preferred for checkout because it makes the workflow visible and debuggable — the orchestrator tracks the exact state of each saga instance.
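The orchestration variant can be sketched generically: each step pairs an action with a compensating action, and on failure the orchestrator undoes the completed steps in reverse order while recording state in a log. The Step and SagaOrchestrator names are hypothetical, and a production orchestrator would persist its log so that a crashed saga can be resumed.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Step:
    name: str
    action: Callable[[], None]
    compensate: Callable[[], None]

@dataclass
class SagaOrchestrator:
    steps: list
    log: list = field(default_factory=list)  # visible per-step saga state

    def run(self) -> bool:
        """Run steps in order; on failure, compensate completed steps in reverse."""
        completed = []
        for step in self.steps:
            try:
                step.action()
            except Exception as exc:
                self.log.append(f"{step.name}: failed ({exc})")
                for done in reversed(completed):
                    done.compensate()
                    self.log.append(f"{done.name}: compensated")
                return False
            completed.append(step)
            self.log.append(f"{step.name}: done")
        return True

def failing_order_creation():
    raise RuntimeError("order database unavailable")

events = []
saga = SagaOrchestrator([
    Step("reserve", lambda: events.append("reserve"), lambda: events.append("release")),
    Step("charge", lambda: events.append("charge"), lambda: events.append("refund")),
    Step("create_order", failing_order_creation, lambda: None),
])
succeeded = saga.run()
print(succeeded, events)
```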