Search suggestion (autocomplete) returns candidate completions for a partial query in under 100ms. The system must handle millions of prefix queries per second, rank results by relevance or popularity, and update suggestions as new content is indexed. Three architectures cover the design space: a trie in Redis for exact-prefix matching, Elasticsearch’s completion suggester for full-text matching with fuzzy support, and a precomputed prefix table for simple but scalable implementations.
Core Data Model (Prefix Table Approach)
-- Precompute all prefixes for each suggestion at index time
CREATE TABLE SearchSuggestion (
    prefix VARCHAR(100) NOT NULL,
    suggestion VARCHAR(255) NOT NULL,
    score INT NOT NULL DEFAULT 0,  -- query frequency, popularity
    category VARCHAR(50),          -- 'product', 'user', 'location'
    entity_id BIGINT,              -- reference to the actual entity
    PRIMARY KEY (prefix, suggestion)
);
CREATE INDEX idx_suggestion_prefix_score ON SearchSuggestion(prefix, score DESC);
-- Populate: for "apple iphone", insert rows for:
-- prefix="a" -> "apple iphone"
-- prefix="ap" -> "apple iphone"
-- prefix="app" -> "apple iphone"
-- ... up to the full query or max prefix length
# Example population function (Python, psycopg2-style placeholders):
def index_suggestion(text: str, score: int, category: str, entity_id: int):
    words = text.lower().split()
    prefixes = set()
    for i in range(len(words)):
        # Generate prefixes starting at each word position, so e.g.
        # "iphone" also surfaces "apple iphone"
        prefix = ""
        for char in " ".join(words[i:]):
            prefix += char
            if len(prefix) >= 2:  # minimum prefix length
                prefixes.add(prefix)
    for prefix in prefixes:
        db.execute("""
            INSERT INTO SearchSuggestion (prefix, suggestion, score, category, entity_id)
            VALUES (%s, %s, %s, %s, %s)
            ON CONFLICT (prefix, suggestion) DO UPDATE
                SET score = EXCLUDED.score
        """, [prefix, text, score, category, entity_id])
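The read path against this table is then a single indexed lookup. A self-contained sketch of both paths, using an in-memory SQLite database in place of the production store (the schema is simplified to three columns; names and sample data are illustrative):

```python
import sqlite3

# In-memory stand-in for the production database (illustrative only)
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE SearchSuggestion (
        prefix TEXT NOT NULL,
        suggestion TEXT NOT NULL,
        score INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (prefix, suggestion)
    )
""")
db.execute("CREATE INDEX idx_prefix_score ON SearchSuggestion(prefix, score DESC)")

def index_suggestion(text: str, score: int) -> None:
    """Insert one row per (prefix, suggestion) pair, prefixes from each word start."""
    words = text.lower().split()
    prefixes = set()
    for i in range(len(words)):
        prefix = ""
        for char in " ".join(words[i:]):
            prefix += char
            if len(prefix) >= 2:
                prefixes.add(prefix)
    db.executemany(
        "INSERT OR REPLACE INTO SearchSuggestion VALUES (?, ?, ?)",
        [(p, text.lower(), score) for p in prefixes],
    )

def get_suggestions(prefix: str, limit: int = 8) -> list[str]:
    """Single indexed lookup: exact prefix match, highest score first."""
    rows = db.execute(
        "SELECT suggestion FROM SearchSuggestion "
        "WHERE prefix = ? ORDER BY score DESC LIMIT ?",
        (prefix.lower(), limit),
    )
    return [r[0] for r in rows]

index_suggestion("apple iphone", 9850)
index_suggestion("apple watch", 4200)
index_suggestion("iphone case", 3100)

print(get_suggestions("ap"))  # ['apple iphone', 'apple watch']
print(get_suggestions("ip"))  # ['apple iphone', 'iphone case']
```

Note that "ip" matches "apple iphone" because prefixes are generated from every word start, not just the beginning of the string.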
Redis Sorted Set Approach (Production Standard)
def index_suggestion_redis(text: str, score: int):
    """Store all prefixes in Redis sorted sets, scored by popularity."""
    text_lower = text.lower()
    pipe = redis.pipeline()
    prefix = ""
    for char in text_lower:
        prefix += char
        if len(prefix) >= 2:
            # Sorted set key per prefix, score = popularity
            pipe.zadd(f"suggest:{prefix}", {text_lower: score})
            # Cap each prefix set at 20 entries to bound memory
            pipe.zremrangebyrank(f"suggest:{prefix}", 0, -21)
            pipe.expire(f"suggest:{prefix}", 86400)  # 24h TTL
    pipe.execute()
def get_suggestions(prefix: str, limit: int = 8) -> list[str]:
    key = f"suggest:{prefix.lower()}"
    # ZREVRANGE: highest score first
    results = redis.zrevrange(key, 0, limit - 1)
    return [r.decode() for r in results]
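The sorted-set mechanics can be exercised without a Redis server. A minimal in-memory sketch, where a plain dict stands in for Redis and trimming to the top 20 plays the role of ZREMRANGEBYRANK 0 -21:

```python
# In-memory stand-in for the Redis sorted sets (illustrative only)
store: dict[str, dict[str, float]] = {}

def index_suggestion(text: str, score: float, cap: int = 20) -> None:
    text_lower = text.lower()
    prefix = ""
    for char in text_lower:
        prefix += char
        if len(prefix) >= 2:
            members = store.setdefault(f"suggest:{prefix}", {})
            members[text_lower] = score   # ZADD: upsert member with score
            if len(members) > cap:        # ZREMRANGEBYRANK 0 -(cap+1): evict lowest
                lowest = min(members, key=members.get)
                del members[lowest]

def get_suggestions(prefix: str, limit: int = 8) -> list[str]:
    members = store.get(f"suggest:{prefix.lower()}", {})
    # ZREVRANGE: highest score first
    return sorted(members, key=members.get, reverse=True)[:limit]

index_suggestion("apple iphone", 9850)
index_suggestion("apple watch", 4200)
index_suggestion("application form", 880)
print(get_suggestions("app"))  # ['apple iphone', 'apple watch', 'application form']
```

The real system differs only in where the sorted sets live; the per-keystroke data flow is the same.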
# Memory estimate:
# 1M unique suggestions, avg 20 chars, avg 10 prefixes each
# = 10M sorted set members at ~50 bytes each = 500MB Redis
# Acceptable for a dedicated suggestion Redis instance
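The estimate above as explicit arithmetic (the ~50-byte per-member figure is an assumption; measure with Redis's MEMORY USAGE command on a real instance):

```python
suggestions = 1_000_000          # unique suggestion strings
prefixes_per_suggestion = 10     # avg prefix lengths indexed per suggestion
bytes_per_member = 50            # member string + score + overhead (assumed)

total = suggestions * prefixes_per_suggestion * bytes_per_member
print(f"{total / 1e9:.1f} GB")   # 0.5 GB
```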
Elasticsearch Completion Suggester
# Best for full-text search with fuzzy matching and multi-word suggestions
# Index mapping:
PUT /products
{
  "mappings": {
    "properties": {
      "suggest": {
        "type": "completion",
        "analyzer": "simple",
        "preserve_separators": true,
        "preserve_position_increments": true,
        "max_input_length": 50
      },
      "popularity": {"type": "integer"}
    }
  }
}
# Index a document (weight = popularity score; JSON does not allow comments):
POST /products/_doc/1
{
  "name": "Apple iPhone 15",
  "suggest": {
    "input": ["apple iphone 15", "iphone 15", "iphone"],
    "weight": 9850
  }
}
# Query for suggestions (Python):
def es_suggest(prefix: str) -> list[str]:
    result = es.search(index='products', body={
        "suggest": {
            "product_suggest": {
                "prefix": prefix,
                "completion": {
                    "field": "suggest",
                    "size": 8,
                    "fuzzy": {"fuzziness": 1}  # tolerate one edit
                }
            }
        }
    })
    options = result['suggest']['product_suggest'][0]['options']
    return [o['text'] for o in options]
Updating Scores from Search Analytics
def update_suggestion_scores():
    """Nightly job: recompute scores from last 7 days of search events."""
    popular = db.fetchall("""
        SELECT query, COUNT(*) AS freq
        FROM SearchEvent
        WHERE occurred_at > NOW() - INTERVAL '7 days'
          AND result_count > 0  -- only queries that returned results
        GROUP BY query
        ORDER BY freq DESC
        LIMIT 100000
    """)
    pipe = redis.pipeline()
    for row in popular:
        query = row['query'].lower().strip()
        score = row['freq']
        prefix = ""
        for char in query:
            prefix += char
            if len(prefix) >= 2:
                pipe.zadd(f"suggest:{prefix}", {query: score})
                pipe.zremrangebyrank(f"suggest:{prefix}", 0, -21)
    pipe.execute()
Key Interview Points
- Response time target is <100ms end-to-end. Redis sorted sets deliver ~1ms lookups; database prefix queries (LIKE 'prefix%') with a B-tree index take ~10-20ms on small tables and degrade unacceptably at scale.
- Cap the number of candidates per prefix (top 20 by score) — unbounded sorted sets waste memory and slow down ZREVRANGE as the set grows.
- Deduplication: within a single prefix set, no dedup is needed, since sorted-set members are unique. Dedup only matters when merging results from multiple sources (e.g., global and per-user sets) at query time.
- Personalization: blend global popularity scores with per-user search history. Store user-specific boosts separately and merge at query time: final_score = 0.7 * global_score + 0.3 * personal_score.
- Cold start: populate suggestions from product/content catalog at launch, scored by number of views or purchases, before any user search data exists.
- Minimum prefix length of 2 characters prevents explosion of single-character prefix sets (every word shares the same one-character prefix).
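The personalization blend from the bullets above can be sketched as a query-time merge. A minimal sketch, assuming both candidate lists have already been fetched (function name, weights, and sample data are illustrative):

```python
def blend_suggestions(global_hits: dict[str, float],
                      personal_hits: dict[str, float],
                      limit: int = 8) -> list[str]:
    """Merge global and per-user candidates:
    final_score = 0.7 * global_score + 0.3 * personal_score."""
    final: dict[str, float] = {}
    for text in set(global_hits) | set(personal_hits):
        final[text] = (0.7 * global_hits.get(text, 0.0)
                       + 0.3 * personal_hits.get(text, 0.0))
    return sorted(final, key=final.get, reverse=True)[:limit]

global_hits = {"running shoes": 9000, "running shorts": 7000}
personal_hits = {"running shorts": 8000, "running belt": 5000}
print(blend_suggestions(global_hits, personal_hits))
# ['running shorts', 'running shoes', 'running belt']
```

In production the two score distributions should be normalized to the same scale before blending, otherwise one source dominates regardless of the weights.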
Frequently Asked Questions

Why is a database LIKE query insufficient for real-time autocomplete?
SELECT * FROM Product WHERE name LIKE 'app%' ORDER BY popularity DESC LIMIT 8 does use a B-tree index for the prefix scan, but it has two problems at scale: (1) Latency: even with an index, a 10M-row table yields millions of candidates for "a%" before the LIMIT applies, and at 1,000 autocomplete requests/second the database is overwhelmed. (2) Flexibility: LIKE '%app%' (infix match, e.g. finding "apple" when the user types "pple") requires a full table scan because no index applies. Redis sorted sets solve both: O(log N) lookup regardless of dataset size, sub-millisecond response, and the prefix set pre-limits candidates to the top 20 by score. The database is only queried to enrich the top results with current metadata.

How do you keep autocomplete suggestions fresh as new content is created?
Three update paths: (1) Real-time: when a new product, article, or user is created, immediately index it into the Redis sorted sets. The background write takes ~5ms and does not block the creation flow; set the initial score to a default (e.g., the median popularity). (2) Batch score updates: nightly, query SearchEvent for the most popular queries and update Redis scores. High-frequency queries float to the top of each prefix set; stale queries sink and are eventually evicted by zremrangebyrank when the set exceeds the top-20 cap. (3) Deletion: when content is removed, delete all of its prefix entries. This is O(max_prefix_length) Redis writes, acceptable for occasional deletes.

How do you add fuzzy matching to catch typos like "iphone" typed as "iphoen"?
The Redis sorted-set approach only handles exact prefixes; "iphoen" will not match "iphone" entries. Two strategies: (1) Client-side deferral: after ~300ms with no exact match, fall back to Elasticsearch's completion suggester with fuzziness of 1 (one-edit distance). ES queries take ~50-200ms. This two-tier approach keeps common queries fast (Redis) while covering typos (ES). (2) Phonetic indexing: compute a phonetic code (Metaphone, Soundex) for each word and index by phonetic prefix in addition to the literal prefix, so "iphoen" and "iphone" share the same phonetic code and match the same entries. Phonetic indexing is complex to implement; ES fuzzy matching is the pragmatic solution.

How do you personalize suggestions based on a user's search history?
Global suggestions use popularity scores. Personalized suggestions blend global with personal: at query time, fetch both global suggestions (from Redis) and user-specific suggestions (from a user-scoped sorted set, suggest:{user_id}:{prefix}), then merge the two lists with final_score = 0.7 * global_score + 0.3 * user_history_boost. Surface user-boosted entries first, then global entries not already shown. The user-specific sorted sets are populated by recording each completed search: when a user submits "running shoes", increment their personal score for every prefix of that query. Personal sets need a shorter TTL (7 days) since user interests evolve, and should be capped at the top 50 entries per prefix. This adds one extra Redis lookup per keystroke, typically ~0.5ms, which is acceptable.

How much memory does a Redis autocomplete index require?
Estimate: 1 million unique suggestions, average 20 characters each, average 10 prefix lengths per suggestion (from the 2-character minimum up to ~12 characters). Each sorted-set entry costs roughly the member string (avg 20 bytes) + score (8 bytes) + Redis overhead (~70 bytes) ≈ 100 bytes. 1M suggestions × 10 prefixes × 100 bytes = 1GB of Redis memory as an upper bound. In practice, capping each prefix set at 20 entries (zremrangebyrank) reduces this significantly, since most prefixes share the same top-20 entries; real-world systems at this scale use ~200-500MB. A dedicated Redis instance with 1-2GB of RAM is sufficient. Lowercase and deduplicate member strings before storing to compress further.