System Design Interview: E-Commerce Product Search (Amazon Scale)

Product search at Amazon or eBay is one of the most complex search systems in the world. It must combine keyword relevance, faceted filtering (price, category, brand, rating), personalized ranking, real-time inventory, and autocomplete — all in under 200ms for billions of products and hundreds of millions of users.

Functional Requirements

Full-text search across product title, description, brand, category
Faceted filtering: price range, category, brand, rating, Prime eligible, seller
Ranking: relevance, customer reviews, sales velocity, profit margin, personalization
Autocomplete: real-time query suggestions as user types
Spelling correction: “iphone chargar” → “iphone charger”
Synonym expansion: “sofa” → also searches “couch”, “loveseat”

Scale

Metric	Target
Products indexed	350+ million (Amazon)
Search QPS peak	~100,000/s
End-to-end latency	< 200ms P99
Index update lag	< 5 minutes for new listings

Search Index Architecture

Document Structure

// Product document in Elasticsearch
{
  "product_id": "B08F5V7T9X",
  "title": "Apple AirPods Pro 2nd Generation with MagSafe",
  "brand": "Apple",
  "category_path": ["Electronics", "Headphones", "Earbuds"],
  "price": 249.00,
  "rating": 4.7,
  "review_count": 85234,
  "is_prime": true,
  "in_stock": true,
  "sales_rank_category": 3,    // rank in category by sales
  "description": "...",
  "features": ["Active Noise Cancellation", "Transparency Mode", ...],
  "embeddings": [0.23, -0.11, ...],   // 768-dim vector for semantic search
  "seller_id": "A1EXMPLE",
  "created_at": "2023-09-12T00:00:00Z"
}

Two-Phase Retrieval and Ranking

Phase 1 — RETRIEVAL (fast, broad):
  Input: user query "wireless earbuds noise cancelling"

  BM25 retrieval:
    Inverted index lookup for each query term
    Score = BM25 (TF-IDF-style term weighting with length normalization) across title, features, description
    → top 10,000 candidates in < 20ms

  Semantic retrieval (dense):
    Encode query → 768-dim embedding
    ANN search (FAISS/ScaNN) for semantically similar products
    → top 2,000 candidates in < 15ms

  Merge and deduplicate: up to ~12,000 candidates (fewer when the two sets overlap)

Phase 2 — RANKING (expensive model, small candidate set):
  LambdaMART or neural LTR model
  Features: BM25 score, semantic score, sales rank, rating, price,
            click-through rate history, user profile affinity,
            query-product historical CTR
  → top 48 products ranked in < 30ms
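
The merge-and-deduplicate step between the two phases can be sketched as follows. This is a minimal illustration, not the actual A9 implementation: the `Candidate` class, the `merge_candidates` function, and the `(product_id, score)` tuple shape are assumptions made for the example. Each product keeps whichever retrieval scores it received, so the ranking model can use both as features.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class Candidate:
    product_id: str
    bm25_score: Optional[float] = None      # None if lexical retrieval missed it
    semantic_score: Optional[float] = None  # None if ANN retrieval missed it


def merge_candidates(
    bm25_hits: Iterable[Tuple[str, float]],
    ann_hits: Iterable[Tuple[str, float]],
) -> List[Candidate]:
    """Union two retrieval result lists, keeping one entry per product_id."""
    merged: Dict[str, Candidate] = {}
    for product_id, score in bm25_hits:
        merged[product_id] = Candidate(product_id, bm25_score=score)
    for product_id, score in ann_hits:
        if product_id in merged:
            # Found by both retrievers: attach the second score to the same entry.
            merged[product_id].semantic_score = score
        else:
            merged[product_id] = Candidate(product_id, semantic_score=score)
    return list(merged.values())


# Hypothetical hits: B02 is returned by both retrievers.
bm25_hits = [("B01", 12.4), ("B02", 9.1), ("B03", 7.7)]
ann_hits = [("B02", 0.91), ("B04", 0.88)]
candidates = merge_candidates(bm25_hits, ann_hits)
```

Leaving a missing score as `None` rather than 0 lets the ranking model tell "not retrieved by this path" apart from "retrieved with a low score".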

Elasticsearch Query for Product Search

GET /products/_search
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "query": "wireless earbuds noise cancelling",
          "fields": ["title^3", "brand^2", "features^2", "description"],
          "type": "best_fields",
          "fuzziness": "AUTO"    // handles typos
        }
      },
      "filter": [
        { "term": { "in_stock": true } },
        { "range": { "price": { "gte": 50, "lte": 500 } } },
        { "term": { "is_prime": true } }
      ]
    }
  },
  "aggs": {
    "by_brand":  { "terms": { "field": "brand", "size": 20 } },
    "by_rating": { "range": { "field": "rating",
                   "ranges": [{"from": 4}, {"from": 3, "to": 4}] } },
    "price_stats":{ "stats": { "field": "price" } }
  },
  "sort": [
    { "_score": "desc" },
    { "sales_rank_category": "asc" }
  ],
  "size": 48
}

Faceted Filtering

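The aggregations in the query above compute facet counts: how many products match each filter option. Facets update dynamically as the user applies filters. Each facet's counts should reflect every active filter except the filter on that facet itself, so the user sees how many results each alternative option would yield. Note that clauses in the bool filter of the query above restrict all aggregations equally; excluding a facet's own filter requires post_filter plus per-facet filter aggregations.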
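
One way to get "all filters except my own" facet counts is to move the user-selected filters into `post_filter` and wrap each facet in a `filter` aggregation. The sketch below only builds the request body as a Python dict; the function name `build_faceted_request` and its argument shapes are assumptions made for illustration.

```python
from typing import Any, Dict


def build_faceted_request(
    text_query: Dict[str, Any],
    active_filters: Dict[str, Dict[str, Any]],
    facet_aggs: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Build a search body where each facet ignores its own filter.

    active_filters maps facet name -> Elasticsearch filter clause.
    facet_aggs maps facet name -> aggregation definition for that facet.
    """
    aggs: Dict[str, Any] = {}
    for facet, agg in facet_aggs.items():
        # Apply every active filter except the one belonging to this facet.
        others = [f for name, f in active_filters.items() if name != facet]
        scope = {"bool": {"filter": others}} if others else {"match_all": {}}
        aggs[facet] = {"filter": scope, "aggs": {"values": agg}}
    return {
        "query": text_query,  # text match only, so aggregations see all matches
        # post_filter narrows the returned hits without affecting aggregations.
        "post_filter": {"bool": {"filter": list(active_filters.values())}},
        "aggs": aggs,
    }


request = build_faceted_request(
    text_query={"match": {"title": "wireless earbuds"}},
    active_filters={
        "brand": {"term": {"brand": "Apple"}},
        "price": {"range": {"price": {"gte": 50, "lte": 500}}},
    },
    facet_aggs={
        "brand": {"terms": {"field": "brand", "size": 20}},
        "price": {"stats": {"field": "price"}},
    },
)
```

Here the brand facet is counted against the price filter only, and the price facet against the brand filter only. Filters that are never shown as facets (for example `in_stock`) can stay in the main query's bool filter.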

Autocomplete and Query Suggestions

# Autocomplete powered by:
# 1. Query logs: most frequent queries starting with the typed prefix
# 2. Elasticsearch completion suggester (FST-based, very fast)
# 3. Query-to-category mapping for rich suggestions

# Elasticsearch completion suggester
PUT /search_suggest
{
  "mappings": {
    "properties": {
      "query_text": {
        "type": "completion",
        "analyzer": "simple"
      },
      "weight": { "type": "integer" }  // search frequency score
    }
  }
}

# Suggest endpoint (returns in < 5ms)
GET /search_suggest/_search
{
  "suggest": {
    "query_suggest": {
      "prefix": "wireless ear",
      "completion": {
        "field": "query_text",
        "size": 10,
        "fuzzy": { "fuzziness": 1 }
      }
    }
  }
}
# Returns: ["wireless earbuds", "wireless earphones", "wireless earbud case", ...]
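
The first source listed above, frequent logged queries sharing the typed prefix, can be sketched without Elasticsearch at all. This is a toy in-memory version under stated assumptions: the `PrefixSuggester` class and the sample log are hypothetical, and a production system would precompute top suggestions per prefix rather than scan at request time.

```python
import bisect
from collections import Counter
from typing import List


class PrefixSuggester:
    """Top-k most frequent logged queries that start with a prefix."""

    def __init__(self, query_log: List[str]) -> None:
        self.counts = Counter(q.strip().lower() for q in query_log)
        self.sorted_queries = sorted(self.counts)

    def suggest(self, prefix: str, k: int = 10) -> List[str]:
        prefix = prefix.lower()
        # All queries sharing the prefix are contiguous in sorted order.
        start = bisect.bisect_left(self.sorted_queries, prefix)
        matches = []
        for query in self.sorted_queries[start:]:
            if not query.startswith(prefix):
                break
            matches.append(query)
        # Most frequent first; ties broken alphabetically for stable output.
        matches.sort(key=lambda q: (-self.counts[q], q))
        return matches[:k]


# Hypothetical query log.
log = (["wireless earbuds"] * 5 + ["wireless earphones"] * 3
       + ["wireless mouse"] * 4 + ["wired earbuds"])
suggester = PrefixSuggester(log)
top = suggester.suggest("wireless ear", k=2)
```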

Personalized Ranking (Amazon A9)

The same search query returns different results for different users. Amazon A9 incorporates:

Purchase history affinity: if you buy a lot of Apple products, Apple items rank higher
Click history: products you clicked but did not buy are ranked lower
Price sensitivity: infer preferred price range from purchase history
Prime membership: Prime members see Prime-eligible products first
Geography: products with fast delivery to your location rank up

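import math

def gaussian(x, mean, stddev):
    # Unnormalized Gaussian: 1.0 when x == mean, decaying toward 0 with distance
    if stddev <= 0:
        return 1.0 if x == mean else 0.0
    return math.exp(-((x - mean) ** 2) / (2 * stddev ** 2))

def personalize_ranking(products, user_profile, cf_model):
    # cf_model: collaborative-filtering model exposing predict(user_id, product_id).
    # The weights below assume every component score is scaled to [0, 1].
    for product in products:
        personalization_score = 0.0

        # Brand affinity (0 if the user has no history with this brand)
        brand_affinity = user_profile.brand_affinity.get(product.brand, 0)
        personalization_score += brand_affinity * 0.3

        # Price fit score (Gaussian centered on user avg purchase price)
        price_fit = gaussian(product.price, user_profile.avg_purchase_price,
                             user_profile.price_stddev)
        personalization_score += price_fit * 0.2

        # Collaborative-filtering score: predicted interest based on similar users
        collaborative_score = cf_model.predict(user_profile.id, product.id)
        personalization_score += collaborative_score * 0.5

        # Blend with relevance score (70% relevance, 30% personalization)
        product.final_score = (0.7 * product.relevance_score +
                               0.3 * personalization_score)

    return sorted(products, key=lambda p: p.final_score, reverse=True)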

Real-Time Inventory Integration

Out-of-stock items should be excluded from search results or ranked very low. Inventory changes continuously: millions of items go in and out of stock daily.

Inventory update flow:
  Warehouse system → Kafka "inventory_updates" topic
                         ↓
               [Elasticsearch updater service]
                         ↓
       Partial document update: POST /products/_update/{id}
       { "doc": { "in_stock": false, "inventory_count": 0 } }

  Lag: Kafka → ES update in < 5 seconds
  For flash sales: synchronous check at ranking time against Redis
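
The two halves of this flow can be sketched in plain Python. The event shape (`product_id`, `inventory_count`) and both function names are assumptions made for the example; `live_counts` is a dict standing in for the Redis lookup, and the first function only formats the two NDJSON lines that Elasticsearch's `_bulk` API expects for a partial update.

```python
import json
from typing import Dict, Iterable, List


def to_bulk_update(event: Dict) -> str:
    """Turn one inventory event into two NDJSON lines for the _bulk API."""
    action = {"update": {"_index": "products", "_id": event["product_id"]}}
    doc = {"doc": {"in_stock": event["inventory_count"] > 0,
                   "inventory_count": event["inventory_count"]}}
    return json.dumps(action) + "\n" + json.dumps(doc) + "\n"


def drop_out_of_stock(ranked_ids: Iterable[str],
                      live_counts: Dict[str, int]) -> List[str]:
    """Final-stage check against a fresher store than the search index.

    live_counts stands in for the Redis lookup. IDs missing from it are
    kept, on the assumption that the index value is good enough for them.
    """
    return [pid for pid in ranked_ids if live_counts.get(pid, 1) > 0]


# Hypothetical event consumed from the "inventory_updates" topic.
payload = to_bulk_update({"product_id": "B08F5V7T9X", "inventory_count": 0})
visible = drop_out_of_stock(["a", "b", "c"], {"b": 0, "c": 3})
```

Batching many events into one `_bulk` request keeps indexing overhead low, while the ranking-time check covers the window before the index catches up.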

Spelling Correction and Synonym Expansion

# Elasticsearch fuzziness for typos
# fuzziness: AUTO → 0 edits for 1-2 chars, 1 edit for 3-5, 2 edits for 6+

# Synonym dictionary (loaded into Elasticsearch analyzer)
synonyms:
  - "sofa, couch, loveseat"
  - "tv, television, smart tv"
  - "laptop, notebook, macbook"
  - "iphone charger, lightning cable, usb-c cable"

# Query rewriting for known misspellings
misspelling_map = {
    "airpord": "airpods",
    "blutooth": "bluetooth",
    "headfones": "headphones",
}
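
Applied at query time, the misspelling map and synonym dictionary might be combined roughly like this. It is a simplified sketch: `rewrite_query` is a hypothetical helper, it works on single whitespace-separated tokens only, and multi-word synonyms such as "smart tv" would need phrase-aware handling, which Elasticsearch's synonym token filters provide.

```python
from typing import Dict, List, Set

MISSPELLINGS: Dict[str, str] = {
    "airpord": "airpods",
    "blutooth": "bluetooth",
    "headfones": "headphones",
}

SYNONYM_GROUPS: List[Set[str]] = [
    {"sofa", "couch", "loveseat"},
    {"tv", "television"},
]


def rewrite_query(query: str) -> List[List[str]]:
    """Return one list of alternatives per query token.

    Each token is first corrected via the misspelling map, then expanded
    to every member of its synonym group (if it belongs to one).
    """
    rewritten = []
    for token in query.lower().split():
        token = MISSPELLINGS.get(token, token)
        alternatives = {token}
        for group in SYNONYM_GROUPS:
            if token in group:
                alternatives |= group
        rewritten.append(sorted(alternatives))
    return rewritten


corrected = rewrite_query("blutooth headfones")
expanded = rewrite_query("leather sofa")
```

Each inner list can then be turned into an OR over its alternatives, with the outer list combined as an AND.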

Index Update Pipeline

Seller lists new product or updates existing one
         ↓
[Product Catalog Service] → Kafka "product_updates"
         ↓
[Indexer Service]
  - Validates product data
  - Enriches: normalizes category, generates embeddings
  - Writes to Elasticsearch (near real-time, < 1s)
  - Triggers thumbnail processing pipeline
         ↓
Product searchable within 1-5 minutes of listing

Interview Discussion Points

How do you handle queries with no results? Query relaxation: drop lower-weight terms, expand synonyms, use phonetic search
How do you A/B test ranking changes? Shadow traffic, holdback groups, measure revenue per search, CTR, and purchase conversion
How do you prevent sellers from gaming the ranking? Detect review fraud (velocity, geographic clustering), penalize policy violations, verify purchase for reviews
What is the cold-start problem for new listings? No CTR or sales history — use content features (category, brand, price) + collaborative filtering from similar products
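
The query-relaxation idea from the first discussion point can be sketched as a loop that drops the lowest-weight term until something matches. Everything here is illustrative: `relax_query`, the term weights (which could be IDF values in practice), and the in-memory `fake_search` are assumptions, not part of any real search API.

```python
from typing import Callable, Dict, List


def relax_query(
    terms: List[str],
    weights: Dict[str, float],
    search: Callable[[List[str]], List[str]],
    min_terms: int = 1,
) -> List[str]:
    """Drop the lowest-weight term until the search returns something."""
    remaining = list(terms)
    while True:
        results = search(remaining)
        if results or len(remaining) <= min_terms:
            return results
        weakest = min(remaining, key=lambda t: weights.get(t, 0.0))
        remaining.remove(weakest)


# Toy catalog: product ID -> words it contains.
catalog = {"p1": {"wireless", "earbuds"}, "p2": {"wired", "headphones"}}


def fake_search(terms: List[str]) -> List[str]:
    # A product matches only if it contains every query term.
    return [pid for pid, words in sorted(catalog.items()) if set(terms) <= words]


weights = {"wireless": 2.0, "earbuds": 3.0, "purple": 0.5}
hits = relax_query(["purple", "wireless", "earbuds"], weights, fake_search)
```

In this run the full query matches nothing, "purple" is dropped as the weakest term, and the relaxed query finds `p1`.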

Frequently Asked Questions

How does Amazon product search rank results?

Amazon uses A9, a multi-signal ranking system. The base signals are: keyword relevance (BM25 match score across title, description, features), sales velocity (products that sell more rank higher, an implicit quality signal), customer reviews and ratings, price competitiveness, and Prime eligibility. Personalization adjusts rankings based on the user's purchase history (brand affinity, price range preferences) and click history. Profit margin is also a factor, since Amazon is a business. These signals are combined in a Learning to Rank model (LambdaMART or a neural network) trained on historical click-through and purchase data. A/B testing continuously refines the ranking formula.

What is faceted search and how is it implemented?

Faceted search lets users filter results by multiple dimensions simultaneously (brand, price range, rating, category) and shows how many results match each option. Implementation uses Elasticsearch aggregations: alongside the main search query, you run terms aggregations (count documents per brand) and range aggregations (count documents in each price or rating range). The tricky part is that each facet should show counts for results matching all OTHER active filters. For example, the brand facet shows how many results exist in each brand under the current price and rating filters, ignoring any brand filter the user has selected. In Elasticsearch this is usually done by moving facet filters into post_filter, which narrows the hits without affecting aggregations, and wrapping each facet in a filter aggregation that applies the other facets' filters.

How do you handle real-time inventory in product search?

Out-of-stock products should not appear prominently (or at all) in search results, but inventory changes constantly, with millions of items going in and out of stock every day. The approach: inventory updates flow through a Kafka topic from the warehouse system to an Elasticsearch updater service, which applies partial document updates to set in_stock=false within 5-30 seconds. For flash sales or high-stakes inventory (last unit), synchronous inventory checks against a Redis cache happen at the final ranking stage, not in Elasticsearch, to avoid stale reads. Products with zero inventory are either removed with an Elasticsearch filter clause on in_stock or demoted with a score penalty at ranking time.
