Faceted Search System — Low-Level Design
A faceted search system lets users narrow results along several filter dimensions at once: price range, brand, rating, color, availability. The same pattern appears in product search on Amazon, Airbnb listing filters, and LinkedIn people search. The hard part is computing accurate counts for each facet while the other active filters still apply.
The Facet Count Problem
User searches for "laptop" with filter: brand=Apple
Facet counts should show:
Price: Under $500 (0), $500-$1000 (12), Over $1000 (24)
RAM: 8GB (18), 16GB (14), 32GB (4)
In Stock: Yes (30), No (6)
The Brand facet's counts EXCLUDE the brand filter (Apple)
but INCLUDE all other active filters. The Price, RAM, and In Stock
counts above still apply brand=Apple.
If the brand facet applied its own filter, it would show "Apple: 36" and
zero for every other brand, which tells the user nothing. Dropping the
filter lets it show what is available if they switch or add a brand.
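The rule can be checked on a handful of in-memory records before any search engine is involved. The sketch below is illustrative only: the `PRODUCTS` sample data and the `matches` and `facet_counts` helpers are made up for this example and are not part of the design above.

```python
from collections import Counter

# Hypothetical sample catalogue, small enough to count by hand.
PRODUCTS = [
    {"brand": "Apple", "ram": "8GB", "in_stock": True},
    {"brand": "Apple", "ram": "16GB", "in_stock": True},
    {"brand": "Apple", "ram": "16GB", "in_stock": False},
    {"brand": "Dell", "ram": "8GB", "in_stock": True},
    {"brand": "Dell", "ram": "16GB", "in_stock": True},
    {"brand": "Lenovo", "ram": "8GB", "in_stock": False},
]


def matches(product, filters):
    """True if the product satisfies every filter (field -> set of allowed values)."""
    return all(product[field] in allowed for field, allowed in filters.items())


def facet_counts(products, filters, facet_fields):
    """Count each facet's values over products matching all filters EXCEPT its own."""
    counts = {}
    for field in facet_fields:
        other_filters = {f: v for f, v in filters.items() if f != field}
        counts[field] = Counter(
            p[field] for p in products if matches(p, other_filters)
        )
    return counts


counts = facet_counts(PRODUCTS, {"brand": {"Apple"}}, ["brand", "ram", "in_stock"])
```

With `brand=Apple` active, `counts["brand"]` still covers all three brands, while `counts["ram"]` and `counts["in_stock"]` only count Apple products.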
Elasticsearch Approach (Most Common)
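# Elasticsearch: aggregations (aggs) compute facet counts in the same request as the hits.
# Assumes `es` is an Elasticsearch client and that build_filter_clauses() and
# is_brand_filter() are helpers defined elsewhere.
def search_with_facets(query, filters, page=1, page_size=20):
    active_filters = build_filter_clauses(filters)
    text_query = [{'match': {'name': query}}] if query else [{'match_all': {}}]
    non_brand_filters = [f for f in active_filters if not is_brand_filter(f)]
    body = {
        'query': {
            'bool': {
                'must': text_query,
                'filter': active_filters,
            }
        },
        'from': (page - 1) * page_size,
        'size': page_size,
        'aggs': {
            # Brand facet counts WITHOUT the brand filter applied. Aggregations only see
            # documents matched by the main query, so a 'global' agg escapes that scope
            # and a nested filter agg re-applies the text query and the non-brand filters.
            'brand_facet': {
                'global': {},
                'aggs': {
                    'other_filters': {
                        'filter': {'bool': {'must': text_query, 'filter': non_brand_filters}},
                        # 'brand' is mapped as keyword (see Indexing Strategy)
                        'aggs': {'brands': {'terms': {'field': 'brand', 'size': 20}}}
                    }
                }
            },
            # Regular aggs: counts WITH all filters applied. If a price filter is
            # active, the price facet needs the same treatment as brand.
            'price_ranges': {
                'range': {
                    'field': 'price_cents',
                    'ranges': [
                        {'to': 50000},
                        {'from': 50000, 'to': 100000},
                        {'from': 100000}
                    ]
                }
            },
            'in_stock': {'terms': {'field': 'in_stock'}},
            'avg_rating': {'histogram': {'field': 'rating', 'interval': 1}},
        }
    }
    return es.search(index='products', body=body)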
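The function above relies on two helpers that the article does not define. A minimal sketch of what they might look like follows. The accepted filter keys (`brand`, `min_price`, `max_price`, `in_stock`) and the `clause_field` helper are assumptions made for illustration. The clause shapes (`terms`, `range`, `term`) are standard Elasticsearch query DSL.

```python
def build_filter_clauses(filters):
    """Translate a filter dict into Elasticsearch filter clauses.

    Assumed input shape (hypothetical):
    {'brand': ['Apple'], 'min_price': 50000, 'max_price': 100000, 'in_stock': True}
    """
    clauses = []
    if filters.get("brand"):
        clauses.append({"terms": {"brand": list(filters["brand"])}})
    price_range = {}
    if "min_price" in filters:
        price_range["gte"] = filters["min_price"]
    if "max_price" in filters:
        price_range["lt"] = filters["max_price"]
    if price_range:
        clauses.append({"range": {"price_cents": price_range}})
    if "in_stock" in filters:
        clauses.append({"term": {"in_stock": filters["in_stock"]}})
    return clauses


def clause_field(clause):
    """Return the field targeted by a single-field term/terms/range clause."""
    _query_type, body = next(iter(clause.items()))
    return next(iter(body))


def is_brand_filter(clause):
    """True if the clause filters on the brand field."""
    return clause_field(clause) == "brand"


clauses = build_filter_clauses(
    {"brand": ["Apple"], "min_price": 50000, "in_stock": True}
)
```

Tagging clauses by field this way extends to other facets: a price facet would drop every clause whose `clause_field` is `price_cents`.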
PostgreSQL Approach (for Smaller Datasets)
def search_with_facets_pg(query, filters):
    # Build WHERE clauses from active filters, keyed by the facet that owns each one
    where_clauses = {'base': 'deleted_at IS NULL'}
    params = {}
    if query:
        where_clauses['text'] = "to_tsvector(name || ' ' || description) @@ plainto_tsquery(%(q)s)"
        params['q'] = query
    if 'min_price' in filters:
        where_clauses['price'] = 'price_cents >= %(min_price)s'
        params['min_price'] = filters['min_price']
    if 'brand' in filters:
        where_clauses['brand'] = 'brand = ANY(%(brands)s)'
        params['brands'] = filters['brand']  # a list of brand names
    base_where = ' AND '.join(where_clauses.values())
    # Facet counts: each facet query drops its own filter from the WHERE clause.
    # Excluding by key avoids fragile string surgery on the generated SQL.
    brand_where = ' AND '.join(
        clause for facet, clause in where_clauses.items() if facet != 'brand'
    )
    brand_counts = db.execute(f"""
        SELECT brand, COUNT(*) AS cnt
        FROM Product
        WHERE {brand_where}
        GROUP BY brand ORDER BY cnt DESC LIMIT 20
    """, params)
    # 'relevance' is assumed to be a precomputed ranking column;
    # ts_rank() is the usual choice for ranking by text match.
    results = db.execute(f"""
        SELECT * FROM Product WHERE {base_where}
        ORDER BY relevance DESC LIMIT 20
    """, params)
    return {'results': results, 'facets': {'brand': brand_counts}}
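The drop-your-own-filter rule generalizes to any number of facets. The following is a minimal sketch, assuming the WHERE fragments are held in a dict keyed by facet name and that each facet name is also its column name. `FACET_COLUMNS`, `facet_where`, and `facet_count_sql` are hypothetical names introduced here, not part of the function above.

```python
# Allow-list of facet columns: the column name is interpolated into SQL,
# so it must never come straight from user input.
FACET_COLUMNS = {"brand", "in_stock"}


def facet_where(clauses_by_facet, facet):
    """Join every WHERE fragment except the one owned by `facet`."""
    kept = [sql for name, sql in clauses_by_facet.items() if name != facet]
    return " AND ".join(kept) if kept else "TRUE"


def facet_count_sql(facet, clauses_by_facet, limit=20):
    """Build the count query for one facet, skipping that facet's own filter."""
    if facet not in FACET_COLUMNS:
        raise ValueError(f"unknown facet column: {facet!r}")
    where = facet_where(clauses_by_facet, facet)
    return (
        f"SELECT {facet} AS value, COUNT(*) AS cnt FROM Product "
        f"WHERE {where} GROUP BY {facet} ORDER BY cnt DESC LIMIT {int(limit)}"
    )


clauses = {
    "base": "deleted_at IS NULL",
    "brand": "brand = ANY(%(brands)s)",
    "in_stock": "in_stock = %(in_stock)s",
}
brand_sql = facet_count_sql("brand", clauses)
stock_sql = facet_count_sql("in_stock", clauses)
```

Each facet costs one extra query, which is acceptable for a handful of facets on a small table but is part of why this approach stops scaling.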
Indexing Strategy
# Elasticsearch mapping: facet fields should be 'keyword' type (not 'text')
# 'text' fields are analyzed (tokenized); terms aggregations on them are disabled by default
# 'keyword' fields are exact-match and support terms aggregations
mapping = {
    'mappings': {
        'properties': {
            'name': {'type': 'text', 'analyzer': 'english'},
            'description': {'type': 'text'},
            'brand': {'type': 'keyword'},        # facet
            'category': {'type': 'keyword'},     # facet
            'price_cents': {'type': 'integer'},  # range facet
            'rating': {'type': 'float'},         # histogram facet
            'in_stock': {'type': 'boolean'},     # term facet
            'tags': {'type': 'keyword'},         # multi-value facet
            'created_at': {'type': 'date'},
        }
    }
}
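When a facet field also needs full-text search, a multi-field mapping covers both uses: the parent field is `text` and a `keyword` sub-field holds the exact value. The sketch below shows that shape. The `facet_field_name` helper is hypothetical and simply picks the path to aggregate on for a given mapping.

```python
# Multi-field variant of the brand mapping: searchable as text,
# aggregatable through the brand.keyword sub-field.
brand_multi_field = {
    "brand": {
        "type": "text",
        "fields": {
            "keyword": {"type": "keyword", "ignore_above": 256},
        },
    }
}


def facet_field_name(properties, field):
    """Return the field path to use in a terms aggregation.

    Non-text fields aggregate directly; text fields need a keyword sub-field.
    """
    spec = properties[field]
    if spec["type"] != "text":
        return field
    if spec.get("fields", {}).get("keyword", {}).get("type") == "keyword":
        return f"{field}.keyword"
    raise ValueError(f"{field!r} is text with no keyword sub-field")
```

With the plain `keyword` mapping the aggregation targets `brand`. With the multi-field variant it targets `brand.keyword`.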
Keeping Search Index in Sync
def on_product_updated(product_id):
    """Called after any product write: price, stock, or name changes."""
    product = db.get(Product, product_id)
    # es.index() replaces the whole document, so any mapped field left out
    # here (description, category) will be missing from the indexed copy.
    es.index(
        index='products',
        id=product_id,
        body={
            'name': product.name,
            'brand': product.brand,
            'price_cents': product.price_cents,
            'in_stock': product.inventory_count > 0,
            'rating': product.avg_rating,
            'tags': product.tags,
            'updated_at': now().isoformat(),
        }
    )

# For bulk sync (initial index or reindex):
# Read from the DB in batches and send them through the Elasticsearch bulk API.
# Around 1000 documents per request is a common starting point; this is
# often 10-50x faster than indexing documents one at a time.
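The bulk sync described above could be sketched as follows. This is an illustration under assumptions rather than the article's implementation: `fetch_page` is a hypothetical callable that returns rows with `id > last_id` in ascending order (keyset pagination, i.e. `WHERE id > last_id ORDER BY id LIMIT n`), and rows are assumed to be dicts with the keys shown. The action dicts use the `_index` / `_id` / `_source` shape accepted by `elasticsearch.helpers.bulk`.

```python
def iter_products(fetch_page, page_size=1000):
    """Yield every product by paging on the primary key (keyset pagination)."""
    last_id = 0
    while True:
        rows = fetch_page(last_id, page_size)
        if not rows:
            return
        yield from rows
        last_id = rows[-1]["id"]  # resume after the last row seen


def to_bulk_action(product, index="products"):
    """Convert one DB row into a bulk index action."""
    return {
        "_index": index,
        "_id": product["id"],
        "_source": {
            "name": product["name"],
            "brand": product["brand"],
            "price_cents": product["price_cents"],
            "in_stock": product["inventory_count"] > 0,
        },
    }


def bulk_actions(fetch_page, page_size=1000):
    """Lazily produce bulk actions for the whole catalogue."""
    return (to_bulk_action(p) for p in iter_products(fetch_page, page_size))


# With a real client, the generator can be passed to the bulk helper, e.g.:
#   from elasticsearch.helpers import bulk
#   bulk(es, bulk_actions(fetch_page), chunk_size=1000)
```

Because the actions are generated lazily, memory use stays bounded by the page size rather than the catalogue size.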
Key Interview Points
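- Facet counts exclude their own filter: Brand facet counts must be computed without the brand filter applied. Otherwise every unselected brand drops to zero once the user selects one. In Elasticsearch, aggregations only see documents matched by the main query, so either move the facet filters into post_filter and give each facet a filter aggregation carrying the other facets' filters, or wrap the facet in a global aggregation that re-applies everything except its own filter.
- keyword vs text type in Elasticsearch: Text fields are tokenized and analyzed, so terms aggregations on them are disabled by default (enabling fielddata is memory-hungry and buckets by token, not by whole value). Facet fields (brand, category, tags) should be keyword type or have a .keyword sub-field.
- Elasticsearch for scale, PostgreSQL for small datasets: As a rough rule of thumb, PostgreSQL full-text search plus aggregation queries work well up to around 1M products. Beyond that, Elasticsearch's inverted index and parallel aggregation across shards are usually significantly faster. The exact crossover depends on hardware, query mix, and the number of facets.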
- Event-driven index updates: Write to Postgres first, then publish a Kafka event that triggers ES indexing. Never write directly to ES from the API — if ES is slow, it shouldn’t make your API slow.
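
The event-driven flow in the last point can be sketched with an in-memory queue standing in for the Kafka topic. Everything here is hypothetical scaffolding: `events`, `update_product`, and `run_indexer` are illustrative names, the `db` argument is a plain dict playing the role of Postgres, and `index_document` is whatever callable performs the actual Elasticsearch write.

```python
import queue

events = queue.Queue()  # stand-in for a Kafka topic


def update_product(db, product_id, changes):
    """API write path: commit to the primary store, then enqueue an event.

    No Elasticsearch call happens here, so a slow index cannot slow the API.
    """
    db[product_id] = {**db.get(product_id, {}), **changes}
    events.put({"type": "product_updated", "product_id": product_id})


def run_indexer(db, index_document):
    """Consumer: drain pending events and index the current DB state."""
    processed = 0
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return processed
        product_id = event["product_id"]
        # Re-reading the DB (rather than trusting the event payload) makes
        # replays and out-of-order events converge on the latest state.
        index_document(product_id, db[product_id])
        processed += 1
```

The event carries only the product id, so replaying or reordering events is harmless: the consumer always indexes whatever the database currently holds.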