Product search at Amazon or eBay is among the most complex search systems in the world. It must combine keyword relevance, faceted filtering (price, category, brand, rating), personalized ranking, real-time inventory, and autocomplete, all in under 200ms across hundreds of millions of products and hundreds of millions of users.
Functional Requirements
- Full-text search across product title, description, brand, category
- Faceted filtering: price range, category, brand, rating, Prime eligible, seller
- Ranking: relevance, customer reviews, sales velocity, profit margin, personalization
- Autocomplete: real-time query suggestions as user types
- Spelling correction: “iphone chargar” → “iphone charger”
- Synonym expansion: “sofa” → also searches “couch”, “loveseat”
Scale
| Metric | Target |
|---|---|
| Products indexed | 350+ million (Amazon) |
| Search QPS peak | ~100,000/s |
| End-to-end latency | < 200ms P99 |
| Index update lag | < 5 minutes for new listings |
Search Index Architecture
Document Structure
// Product document in Elasticsearch
{
"product_id": "B08F5V7T9X",
"title": "Apple AirPods Pro 2nd Generation with MagSafe",
"brand": "Apple",
"category_path": ["Electronics", "Headphones", "Earbuds"],
"price": 249.00,
"rating": 4.7,
"review_count": 85234,
"is_prime": true,
"in_stock": true,
"sales_rank_category": 3, // rank in category by sales
"description": "...",
"features": ["Active Noise Cancellation", "Transparency Mode", ...],
"embeddings": [0.23, -0.11, ...], // 768-dim vector for semantic search
"seller_id": "A1EXMPLE",
"created_at": "2023-09-12T00:00:00Z"
}
Two-Phase Retrieval and Ranking
Phase 1 — RETRIEVAL (fast, broad):
Input: user query "wireless earbuds noise cancelling"
BM25 retrieval:
Inverted index lookup for each query term
Score = BM25 (TF-IDF-style weighting with term-frequency saturation and length normalization) across title, features, description
→ top 10,000 candidates in < 20ms
Semantic retrieval (dense):
Encode query → 768-dim embedding
ANN search (FAISS/ScaNN) for semantically similar products
→ top 2,000 candidates in < 15ms
Merge and deduplicate: up to ~12,000 candidates (fewer when both retrievers return the same product)
Phase 2 — RANKING (expensive model, small candidate set):
LambdaMART or neural LTR model
Features: BM25 score, semantic score, sales rank, rating, price,
click-through rate history, user profile affinity,
query-product historical CTR
→ top 48 products ranked in < 30ms
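The merge-and-deduplicate step between the two phases can be sketched as follows. This is a minimal illustration, not Amazon's implementation: the `Candidate` class, the `merge_candidates` function, and the `(product_id, score)` input shape are all assumptions made for the example. The point it shows is that a product found by both retrievers becomes one candidate carrying both scores, which the Phase 2 ranker can then use as separate features.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    product_id: str
    bm25_score: float = 0.0      # 0.0 means "not retrieved lexically"
    semantic_score: float = 0.0  # 0.0 means "not retrieved by ANN search"


def merge_candidates(bm25_hits, dense_hits):
    """Union two lists of (product_id, score), keeping both scores per product."""
    merged = {}
    for product_id, score in bm25_hits:
        merged[product_id] = Candidate(product_id, bm25_score=score)
    for product_id, score in dense_hits:
        # Reuse the existing candidate when BM25 already found this product.
        candidate = merged.setdefault(product_id, Candidate(product_id))
        candidate.semantic_score = score
    return list(merged.values())


# Hypothetical hits: B02 is returned by both retrievers.
candidates = merge_candidates(
    bm25_hits=[("B01", 12.4), ("B02", 9.1)],
    dense_hits=[("B02", 0.83), ("B03", 0.79)],
)
```

Keeping the two scores separate, rather than summing them, avoids mixing BM25 and cosine-style scores that live on different scales.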
Elasticsearch Query for Product Search
GET /products/_search
{
"query": {
"bool": {
"must": {
"multi_match": {
"query": "wireless earbuds noise cancelling",
"fields": ["title^3", "brand^2", "features^2", "description"],
"type": "best_fields",
"fuzziness": "AUTO" // handles typos
}
},
"filter": [
{ "term": { "in_stock": true } },
{ "range": { "price": { "gte": 50, "lte": 500 } } },
{ "term": { "is_prime": true } }
]
}
},
"aggs": {
"by_brand": { "terms": { "field": "brand", "size": 20 } },
"by_rating": { "range": { "field": "rating",
"ranges": [{"from": 4}, {"from": 3, "to": 4}] } },
"price_stats":{ "stats": { "field": "price" } }
},
"sort": [
{ "_score": "desc" },
{ "sales_rank_category": "asc" }
],
"size": 48
}
Faceted Filtering
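The aggregations in the query above compute facet counts: how many products match each filter option. Facet counts update as the user applies filters. Each facet should be counted with every active filter applied except its own, so that its options show how many results the user would get by choosing them. In the query above all filters sit in the bool filter clause, so they constrain the hits and every aggregation alike. Excluding a facet's own filter requires moving the user-selected filters into post_filter and wrapping each facet in a filter aggregation that carries only the other filters.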
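One way to build such a request is sketched below in Python. The function name `build_faceted_search`, the choice of facets, and the shape of the `filters` argument are assumptions for illustration. The Elasticsearch pieces it relies on are the `post_filter` search option, which narrows hits after aggregations are computed, and the `filter` aggregation.

```python
def build_faceted_search(query_text, filters):
    """filters maps facet name -> Elasticsearch filter clause chosen by the user."""
    facet_aggs = {
        "brand": {"terms": {"field": "brand", "size": 20}},
        "is_prime": {"terms": {"field": "is_prime"}},
    }
    aggs = {}
    for facet, agg in facet_aggs.items():
        # Count this facet under every active filter except its own.
        others = [clause for name, clause in filters.items() if name != facet]
        aggs[facet] = {
            "filter": {"bool": {"filter": others}},
            "aggs": {"counts": agg},
        }
    return {
        "query": {
            "multi_match": {"query": query_text, "fields": ["title^3", "description"]}
        },
        "aggs": aggs,
        # post_filter narrows the hits only; aggregations are computed before it.
        "post_filter": {"bool": {"filter": list(filters.values())}},
        "size": 48,
    }


body = build_faceted_search(
    "wireless earbuds",
    {
        "brand": {"term": {"brand": "Apple"}},
        "price": {"range": {"price": {"gte": 50, "lte": 500}}},
    },
)
```

With a brand filter active, the brand facet here is still counted under the price filter alone, so the user sees how many results each other brand would yield.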
Autocomplete and Query Suggestions
# Autocomplete powered by:
# 1. Query logs: most frequent queries starting with the typed prefix
# 2. Elasticsearch completion suggester (FST-based, very fast)
# 3. Query-to-category mapping for rich suggestions
# Elasticsearch completion suggester
PUT /search_suggest
{
"mappings": {
"properties": {
"query_text": {
"type": "completion",
"analyzer": "simple"
},
"weight": { "type": "integer" } // search frequency, kept for reference; suggestion ranking uses the weight supplied inside query_text at index time
}
}
}
# Suggest endpoint (returns in < 5ms)
GET /search_suggest/_search
{
"suggest": {
"query_suggest": {
"prefix": "wireless ear",
"completion": {
"field": "query_text",
"size": 10,
"fuzzy": { "fuzziness": 1 }
}
}
}
}
# Returns: ["wireless earbuds", "wireless earphones", "wireless earbud case", ...]
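The first source listed above, frequent queries from the logs that start with the typed prefix, can be sketched without Elasticsearch at all. The example below is a small in-memory illustration; the class name `PrefixSuggester` and the `{query: frequency}` input are assumptions, and a production system would more likely use a trie or the completion suggester shown above.

```python
import bisect


class PrefixSuggester:
    def __init__(self, query_counts):
        # query_counts: {query_text: frequency}, aggregated from query logs.
        self.counts = dict(query_counts)
        self.queries = sorted(self.counts)

    def suggest(self, prefix, size=10):
        # In sorted order, all queries sharing a prefix are contiguous.
        start = bisect.bisect_left(self.queries, prefix)
        end = bisect.bisect_left(self.queries, prefix + chr(0x10FFFF))
        matches = self.queries[start:end]
        # Most frequent queries first.
        return sorted(matches, key=lambda q: -self.counts[q])[:size]


suggester = PrefixSuggester({
    "wireless earbuds": 900,
    "wireless earphones": 400,
    "wireless mouse": 700,
    "wired earbuds": 100,
})
suggestions = suggester.suggest("wireless ear")
```

The two binary searches find the matching range in logarithmic time; only the matches themselves are sorted by frequency.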
Personalized Ranking (Amazon A9)
The same search query returns different results for different users. Amazon A9 incorporates:
- Purchase history affinity: if you buy a lot of Apple products, Apple items rank higher
- Click history: products you clicked but did not buy are ranked lower
- Price sensitivity: infer preferred price range from purchase history
- Prime membership: Prime members see Prime-eligible products first
- Geography: products with fast delivery to your location rank up
import math

def gaussian(x, mean, stddev):
    # Unnormalized bell curve: 1.0 at the mean, approaching 0 far from it
    if stddev <= 0:
        return 1.0 if x == mean else 0.0
    return math.exp(-((x - mean) ** 2) / (2 * stddev ** 2))

def personalize_ranking(products, user_profile, cf_model):
    # cf_model: any collaborative-filtering model exposing predict(user_id, product_id)
    for product in products:
        personalization_score = 0.0
        # Brand affinity
        brand_affinity = user_profile.brand_affinity.get(product.brand, 0)
        personalization_score += brand_affinity * 0.3
        # Price fit score (Gaussian centered on user avg purchase price)
        price_fit = gaussian(product.price, user_profile.avg_purchase_price,
                             user_profile.price_stddev)
        personalization_score += price_fit * 0.2
        # Historical CTR for this product from similar users
        collaborative_score = cf_model.predict(user_profile.id, product.id)
        personalization_score += collaborative_score * 0.5
        # Blend with relevance score (70% relevance, 30% personalization)
        product.final_score = (0.7 * product.relevance_score +
                               0.3 * personalization_score)
    return sorted(products, key=lambda p: p.final_score, reverse=True)
Real-Time Inventory Integration
Out-of-stock items must either be excluded from search results or ranked very low. Inventory changes continuously: millions of items go in and out of stock daily.
Inventory update flow:
Warehouse system → Kafka "inventory_updates" topic
↓
[Elasticsearch updater service]
↓
Partial document update: POST /products/_update/{id}
{ "doc": { "in_stock": false, "inventory_count": 0 } }
Lag: Kafka → ES update in < 5 seconds
For flash sales: synchronous check at ranking time against Redis
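At millions of inventory changes per day, the updater service would likely batch its writes instead of issuing one request per event. The sketch below turns a batch of events into a body for the Elasticsearch `_bulk` API. The function name `to_bulk_update` and the event shape (`product_id`, `inventory_count`) are assumptions for illustration; consuming from Kafka and sending the HTTP request are left out.

```python
import json


def to_bulk_update(events, index="products"):
    """Turn inventory events into an Elasticsearch _bulk request body (NDJSON)."""
    lines = []
    for event in events:
        count = event["inventory_count"]
        # Action line: which document to partially update.
        lines.append(json.dumps({"update": {"_index": index, "_id": event["product_id"]}}))
        # Payload line: only the fields that changed.
        lines.append(json.dumps({"doc": {"in_stock": count > 0, "inventory_count": count}}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = to_bulk_update([
    {"product_id": "B08F5V7T9X", "inventory_count": 0},
    {"product_id": "B000000001", "inventory_count": 12},  # hypothetical product ID
])
```

Deriving `in_stock` from the count in one place keeps the two fields from drifting apart.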
Spelling Correction and Synonym Expansion
# Elasticsearch fuzziness for typos
# fuzziness: AUTO → 0 edits for 1-2 chars, 1 edit for 3-5, 2 edits for 6+
# Synonym dictionary (loaded into Elasticsearch analyzer)
synonyms:
- "sofa, couch, loveseat"
- "tv, television, smart tv"
- "laptop, notebook, macbook"
- "iphone charger, lightning cable, usb-c cable"
# Query rewriting for known misspellings
misspelling_map = {
"airpord": "airpods",
"blutooth": "bluetooth",
"headfones": "headphones",
}
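The two mechanisms above can be made concrete with a short sketch. `auto_fuzziness` mirrors the AUTO thresholds described in the comment (Elasticsearch's defaults of 3 and 6), and `rewrite_query` applies the misspelling map token by token. Both function names are illustrative, and splitting on whitespace is a simplifying assumption; a real pipeline would reuse the index analyzer's tokenization.

```python
MISSPELLINGS = {
    "airpord": "airpods",
    "blutooth": "bluetooth",
    "headfones": "headphones",
}


def auto_fuzziness(term):
    """Edit distance allowed for a term under fuzziness AUTO (thresholds 3 and 6)."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def rewrite_query(query):
    """Replace known misspellings token by token before the query reaches the index."""
    return " ".join(MISSPELLINGS.get(token, token) for token in query.lower().split())


rewritten = rewrite_query("Blutooth headfones")
```

Rewriting known misspellings up front gives an exact match on the corrected term, which tends to be cheaper and more precise than relying on fuzzy matching alone.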
Index Update Pipeline
Seller lists new product or updates existing one
↓
[Product Catalog Service] → Kafka "product_updates"
↓
[Indexer Service]
- Validates product data
- Enriches: normalizes category, generates embeddings
- Writes to Elasticsearch (near real-time, < 1s)
- Triggers thumbnail processing pipeline
↓
Product searchable within 1-5 minutes of listing
Interview Discussion Points
- How do you handle queries with no results? Query relaxation: drop lower-weight terms, expand synonyms, use phonetic search
- How do you A/B test ranking changes? Shadow traffic, holdback groups, measure revenue per search, CTR, and purchase conversion
- How do you prevent sellers from gaming the ranking? Detect review fraud (velocity, geographic clustering), penalize policy violations, verify purchase for reviews
- What is the cold-start problem for new listings? A new listing has no CTR or sales history, so rank it on content features (category, brand, price) and on behavior borrowed from similar products until its own signals accumulate
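The query relaxation idea from the first point above, dropping lower-weight terms when a query returns nothing, can be sketched as a simple loop. Everything here is illustrative: `relax_query` is a hypothetical helper, `weights` stands in for whatever term-importance signal is available (IDF, for example), and `search` is any callable that takes a list of terms and returns hits.

```python
def relax_query(terms, weights, search, min_terms=1):
    """Drop the lowest-weight term until `search` returns hits or min_terms is reached."""
    remaining = list(terms)
    while True:
        hits = search(remaining)
        if hits or len(remaining) <= min_terms:
            return remaining, hits
        weakest = min(remaining, key=lambda term: weights.get(term, 0.0))
        remaining.remove(weakest)


def fake_search(terms):
    # Stand-in for the real search call: only matches when "purple" is absent.
    return ["B01"] if set(terms) <= {"wireless", "earbuds"} else []


final_terms, hits = relax_query(
    ["purple", "wireless", "earbuds"],
    {"purple": 0.2, "wireless": 0.6, "earbuds": 0.9},
    fake_search,
)
```

Each relaxation round costs one extra search, so in practice the number of rounds would be capped to stay inside the latency budget.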
Frequently Asked Questions
How does Amazon product search rank results?
Amazon uses A9, a multi-signal ranking system. The base signals are keyword relevance (BM25 match score across title, description, features), sales velocity (products selling more rank higher, an implicit quality signal), customer reviews and ratings, price competitiveness, and Prime eligibility. Personalization adjusts rankings based on the user's purchase history (brand affinity, price range preferences) and click history. Profit margin is also a factor, since Amazon is a business. These signals are combined in a Learning to Rank model (LambdaMART or a neural network) trained on historical click-through and purchase data. A/B testing continuously refines the ranking formula.
What is faceted search and how is it implemented?
Faceted search lets users filter results by multiple dimensions simultaneously (brand, price range, rating, category) and shows how many results match each option. Implementation uses Elasticsearch aggregations: alongside the main search query, you run terms aggregations (count documents per brand) and range aggregations (count documents in each price range or rating bucket). The tricky part is that each facet should show counts for results matching all OTHER active filters. For example, the brand facet shows how many results exist in each brand under the current price and rating filters, ignoring any brand filter the user has selected. In Elasticsearch this is typically done with post_filter, which narrows the hits after aggregations are computed, combined with filter aggregations that apply the other active filters to each facet.
How do you handle real-time inventory in product search?
Out-of-stock products should not appear prominently (or at all) in search results, but inventory changes constantly: millions of items go in and out of stock every day. The approach: inventory updates flow through a Kafka topic from the warehouse system to an Elasticsearch updater service, which applies partial document updates to set in_stock=false, typically within 5 seconds and at worst about 30 seconds. For flash sales or high-stakes inventory (last unit), synchronous inventory checks against a Redis cache happen at the final ranking stage, not in Elasticsearch, to avoid stale reads. Products with zero inventory are either filtered out with an Elasticsearch filter clause or demoted to the bottom at ranking time.