Requirements
- Return top-k (k=10) query suggestions as a user types, within 100ms
- Suggestions ranked by search frequency or relevance score
- Handle 10M daily active users, 1B queries/day
- Update suggestions as queries trend (near-real-time, not necessarily instant)
- Personalized suggestions (recent searches, user-specific boosting)
Trie-Based Architecture
The core data structure is a prefix trie where each node stores the top-k suggestions for that prefix. On each query, traverse from root using the typed prefix — the suggestions are pre-computed at each node, so lookup is O(L) where L is the prefix length.
class TrieNode:
def __init__(self):
self.children = {}
self.top_k = [] # pre-sorted list of (score, query) tuples
class SearchTrie:
def search(self, prefix):
node = self.root
for c in prefix:
if c not in node.children:
return []
node = node.children[c]
return [q for score, q in node.top_k]
Pre-computing top-k at every node means suggestion retrieval is O(L) — just traverse the prefix. The tradeoff: updates require recomputing top-k at every node along the path from root to the updated term (O(L * k) per update).
Data Flow
User types → Suggestion API → Redis (prefix → top-k list, TTL=10min)
→ Trie Service (if Redis miss)
Search query submitted → Kafka (query log)
→ Frequency Aggregator (Spark/Flink, hourly batch)
→ Trie Builder (rebuilds trie from aggregated frequencies)
→ Trie stored in Redis + replicated to Suggestion API servers
Redis Caching Layer
Store precomputed suggestions in Redis: key=suggest:{prefix}, value=JSON array of top-10 queries, TTL=10 minutes. The suggestion API checks Redis first (O(1) lookup). On miss: query the trie service. Cache hit rate is very high for common prefixes (short prefixes cover 80%+ of queries). For a prefix of length 1-3 characters, cache permanently (high frequency, low count). For longer prefixes, TTL=10min.
Trie Update Strategy
Option A — Real-time updates: on each completed search, update the query frequency and recompute top-k at each trie node along the path. O(L * k log k) per query. At 10K QPS, this is too expensive for a single trie instance. Option B — Periodic batch rebuild (recommended for scale): aggregate query frequencies over 1-hour windows using Spark. Rebuild the full trie from the aggregated data. Swap the trie (blue-green deploy): all suggestion API servers hot-reload the new trie from shared storage (S3). Suggestions lag by up to 1 hour but the trie is always consistent. For trending queries (last 10 minutes), apply a small real-time boost using a separate counter (Redis ZINCRBY on a sorted set of recent queries).
Personalization
Two layers: (1) Global suggestions from the trie (population-level frequency). (2) Personal layer: store each user’s recent searches in Redis (key=user_recents:{user_id}, sorted set by timestamp, max 20 entries). Merge: start with personal recent searches that match the prefix, then fill remaining slots from global suggestions. Personal results are fetched from Redis in O(log n + k), merged client-side or in the suggestion API.
Query Normalization
Before inserting into the trie: lowercase, trim whitespace, strip punctuation, handle spelling variants. Spell correction: maintain a dictionary of common misspellings → canonical forms. On suggestion miss for the prefix, try normalized/corrected prefix. Phonetic matching (Soundex, Metaphone) for fuzzy suggestions is expensive at scale — prefer exact prefix match + spell correction for speed.
Key Design Decisions
- Pre-compute top-k at each trie node — O(L) lookup, not O(L + n) search
- Redis prefix cache — eliminates trie traversal for 90%+ of requests
- Batch trie rebuild hourly — consistent data without per-query update overhead
- Real-time trending boost via Redis sorted set — catches viral queries within minutes
- Personal layer merged at serving time — no per-user trie, shared infrastructure
Google system design interviews frequently cover search autocomplete. See common questions for Google interview: search autocomplete and typeahead system design.
Amazon system design covers product search autocomplete. Review patterns for Amazon interview: search suggestions and product autocomplete design.
LinkedIn system design covers people search and autocomplete. See design patterns for LinkedIn interview: search autocomplete and people suggestion design.