Search History System Low-Level Design

What is a Search History System?

A search history system records a user’s past searches, enables autocomplete from personal history, and supports search history management (view, delete individual entries, clear all). Google Search, Spotify, Amazon, and YouTube all surface personal search history for faster re-search and personalization. The system must balance fast writes (every search is recorded), fast reads (history is shown instantly in the search box), privacy (users can delete their history), and storage efficiency (years of history per user).

Requirements

  • Record every search query per user (query text, timestamp, result clicked)
  • Autocomplete from personal history: as the user types, show matching past searches
  • Show recent search history (last 20 searches) in the search dropdown
  • Delete individual history entries or clear all history
  • Privacy: never show one user’s history to another
  • Retention: keep last 1000 searches per user; older entries expire

Data Model

CREATE TABLE SearchHistoryEntry (
    entry_id       UUID PRIMARY KEY,
    user_id        UUID NOT NULL,
    query          VARCHAR(500) NOT NULL,
    searched_at    TIMESTAMPTZ NOT NULL,
    result_clicked VARCHAR,     -- URL or item_id of clicked result (nullable)
    deleted        BOOL NOT NULL DEFAULT false
);
-- Serves "latest N entries for this user" without a sort
CREATE INDEX idx_history_user_time ON SearchHistoryEntry (user_id, searched_at DESC);
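
To try the data model locally, the following is a minimal sketch that maps it onto SQLite through Python's standard `sqlite3` module. The snake_case table name, the `open_db` and `insert_entry` helpers, and the choice to store UUIDs and timestamps as text are assumptions made for illustration; a production deployment would more likely use the PostgreSQL types shown above.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# SQLite has no UUID or TIMESTAMPTZ types, so both are stored as TEXT here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS search_history_entry (
    entry_id       TEXT PRIMARY KEY,
    user_id        TEXT NOT NULL,
    query          TEXT NOT NULL CHECK (length(query) <= 500),
    searched_at    TEXT NOT NULL,            -- ISO-8601 UTC timestamp
    result_clicked TEXT,                     -- nullable
    deleted        INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX IF NOT EXISTS idx_history_user_time
    ON search_history_entry (user_id, searched_at DESC);
"""

def open_db(path=":memory:"):
    """Open a database (in-memory by default) and create the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def insert_entry(conn, user_id, query, searched_at=None):
    """Insert one history row and return its generated entry_id."""
    entry_id = str(uuid.uuid4())
    ts = (searched_at or datetime.now(timezone.utc)).isoformat()
    conn.execute(
        "INSERT INTO search_history_entry (entry_id, user_id, query, searched_at) "
        "VALUES (?, ?, ?, ?)",
        (entry_id, user_id, query, ts),
    )
    conn.commit()
    return entry_id
```

The composite index on `(user_id, searched_at DESC)` is what keeps the "most recent N for one user" query cheap as the table grows.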

Write Path

Every search triggers an asynchronous history write; the search response must not wait for the history to persist:

def search(user_id, query):
    # 1. Execute search (primary operation)
    results = search_engine.query(query)

    # 2. Record history asynchronously (fire and forget)
    task_queue.enqueue(record_search_history, user_id=user_id, query=query,
                       delay=0)  # async, non-blocking

    return results

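def record_search_history(user_id, query):
    # Deduplicate: if the user ran the same query in the last 7 days,
    # bump its timestamp instead of inserting a new row
    existing = db.query('''
        SELECT entry_id FROM SearchHistoryEntry
        WHERE user_id=:uid AND query=:q AND deleted=false
        AND searched_at > NOW() - INTERVAL '7 days'
        LIMIT 1
    ''', uid=user_id, q=query)

    if existing:
        db.execute('UPDATE SearchHistoryEntry SET searched_at=NOW() WHERE entry_id=:eid',
                   eid=existing[0].entry_id)
    else:
        # gen_random_uuid() is built in from PostgreSQL 13; using NOW() on both
        # branches keeps every timestamp on the database clock
        db.execute('''
            INSERT INTO SearchHistoryEntry (entry_id, user_id, query, searched_at)
            VALUES (gen_random_uuid(), :uid, :q, NOW())
        ''', uid=user_id, q=query)

    # Enforce retention limit: delete visible entries beyond the newest 1000.
    # The outer deleted=false filter leaves soft-deleted rows for the cleanup job;
    # without it, every write would hard-delete all of the user's soft-deleted rows.
    db.execute('''
        DELETE FROM SearchHistoryEntry
        WHERE user_id=:uid AND deleted=false AND entry_id NOT IN (
            SELECT entry_id FROM SearchHistoryEntry
            WHERE user_id=:uid AND deleted=false
            ORDER BY searched_at DESC LIMIT 1000
        )
    ''', uid=user_id)

    # Invalidate the cached recent list so the new search shows up immediately
    redis.delete(f'search_history:{user_id}')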
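
The deduplicate-then-trim logic above can be exercised end to end with a small self-contained sketch. It assumes SQLite as a stand-in database, a simplified table, and a hypothetical `record_search` function whose `max_entries` and `dedup_days` parameters exist only to make the behavior easy to test with small numbers.

```python
import sqlite3
import uuid
from datetime import datetime, timedelta, timezone

def make_db():
    """In-memory stand-in for the history table (simplified columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE search_history_entry (
            entry_id    TEXT PRIMARY KEY,
            user_id     TEXT NOT NULL,
            query       TEXT NOT NULL,
            searched_at TEXT NOT NULL,
            deleted     INTEGER NOT NULL DEFAULT 0
        )""")
    return conn

def record_search(conn, user_id, query, now=None, max_entries=1000, dedup_days=7):
    """Record one search: bump a recent duplicate or insert, then trim."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=dedup_days)).isoformat()
    with conn:  # single transaction: commits on success, rolls back on error
        row = conn.execute(
            "SELECT entry_id FROM search_history_entry "
            "WHERE user_id = ? AND query = ? AND deleted = 0 AND searched_at > ? "
            "LIMIT 1",
            (user_id, query, cutoff),
        ).fetchone()
        if row:
            # Same query seen within the dedup window: only move it to the top
            conn.execute(
                "UPDATE search_history_entry SET searched_at = ? WHERE entry_id = ?",
                (now.isoformat(), row[0]),
            )
        else:
            conn.execute(
                "INSERT INTO search_history_entry "
                "(entry_id, user_id, query, searched_at) VALUES (?, ?, ?, ?)",
                (str(uuid.uuid4()), user_id, query, now.isoformat()),
            )
        # Trim visible rows beyond the newest max_entries; soft-deleted rows
        # are left alone for a separate cleanup job
        conn.execute(
            "DELETE FROM search_history_entry "
            "WHERE user_id = ? AND deleted = 0 AND entry_id NOT IN ("
            "  SELECT entry_id FROM search_history_entry "
            "  WHERE user_id = ? AND deleted = 0 "
            "  ORDER BY searched_at DESC LIMIT ?)",
            (user_id, user_id, max_entries),
        )
```

Running the lookup, the write, and the trim in one transaction means a crash halfway through cannot leave a user above the retention limit or with a half-applied update.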

Read Path: Recent History and Autocomplete

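import json

CACHE_SIZE = 100  # rows cached per user: covers the dropdown (20) and autocomplete (100)
CACHE_TTL = 300   # seconds

def get_recent_history(user_id, limit=20):
    # Cache in Redis: list of JSON-encoded {'query', 'ts'} dicts, newest first
    key = f'search_history:{user_id}'
    cached = redis.lrange(key, 0, limit-1)
    if cached:
        return [json.loads(e) for e in cached]

    # Cache miss: always load at least CACHE_SIZE rows, so a later call with a
    # larger limit is not served a list truncated to an earlier, smaller limit
    rows = db.query('''
        SELECT query, searched_at FROM SearchHistoryEntry
        WHERE user_id=:uid AND deleted=false
        ORDER BY searched_at DESC LIMIT :limit
    ''', uid=user_id, limit=max(limit, CACHE_SIZE))

    # Return the same dict shape on a miss as on a hit
    entries = [{'query': r.query, 'ts': r.searched_at.isoformat()} for r in rows]

    # Cache for 5 minutes. An empty history is not cached (RPUSH needs at least
    # one value), so users with no history fall through to the DB on each call
    if entries:
        pipe = redis.pipeline()
        pipe.delete(key)
        pipe.rpush(key, *[json.dumps(e) for e in entries])
        pipe.expire(key, CACHE_TTL)
        pipe.execute()
    return entries[:limit]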

def autocomplete_from_history(user_id, prefix):
    # Simple prefix match from recent history
    history = get_recent_history(user_id, limit=100)
    return [e['query'] for e in history
            if e['query'].lower().startswith(prefix.lower())][:5]
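
The prefix filter can be isolated from Redis and the database as a pure function, which makes it easy to unit test. The sketch below is one possible variant, not the only reasonable one: the `autocomplete` name, the case-insensitive de-duplication of suggestions, and returning nothing for an empty prefix (on the assumption that the recent-history dropdown covers that case) are all illustrative choices.

```python
def autocomplete(history, prefix, max_suggestions=5):
    """Return up to max_suggestions past queries that start with prefix.

    history is a list of dicts with a 'query' key, newest first, i.e. the
    shape stored in the cache. Matching ignores case.
    """
    needle = prefix.strip().casefold()
    if not needle:
        return []  # assumption: an empty box shows recent history instead

    seen = set()
    suggestions = []
    for entry in history:
        query = entry["query"]
        key = query.casefold()
        # Skip non-matches and repeats that differ only by letter case
        if key.startswith(needle) and key not in seen:
            seen.add(key)
            suggestions.append(query)
            if len(suggestions) == max_suggestions:
                break
    return suggestions
```

Because the input is already ordered newest first, the first match for each distinct query is also its most recent occurrence, so no extra sorting is needed.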

Delete Operations

def delete_entry(user_id, entry_id):
    # Soft delete — keep for analytics, hide from user
    db.execute('''
        UPDATE SearchHistoryEntry SET deleted=true
        WHERE entry_id=:eid AND user_id=:uid
    ''', eid=entry_id, uid=user_id)
    redis.delete(f'search_history:{user_id}')  # invalidate cache

def clear_all_history(user_id):
    db.execute('''
        UPDATE SearchHistoryEntry SET deleted=true
        WHERE user_id=:uid AND deleted=false
    ''', uid=user_id)
    redis.delete(f'search_history:{user_id}')

Key Design Decisions

  • Async history writes: recording history must not add to search latency, so the write is fire and forget
  • Query deduplication: a search repeated within 7 days updates the existing entry's timestamp instead of inserting a duplicate
  • Soft delete: keeps an audit trail and allows undoing an accidental deletion; a periodic cleanup job performs the hard delete
  • Redis list cache: recent history is read on every keystroke, so the DB is queried only on a cache miss (after the 5-minute TTL expires or the key is invalidated)
  • Retention enforced at write time: entries beyond the newest 1000 are deleted on each write, which bounds storage per user
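
The periodic cleanup job mentioned in the soft-delete decision is not specified in this design. One way it might look is sketched below, assuming SQLite as a stand-in database and a hypothetical `deleted_at` column (not part of the data model above) that records when a row was soft-deleted. The `purge_soft_deleted` name, the 30-day grace period, and the batch size are illustrative as well.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def make_db():
    """Stand-in table; deleted_at is a hypothetical addition to the model."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE search_history_entry (
            entry_id   TEXT PRIMARY KEY,
            user_id    TEXT NOT NULL,
            query      TEXT NOT NULL,
            deleted    INTEGER NOT NULL DEFAULT 0,
            deleted_at TEXT                      -- ISO-8601 UTC, NULL if visible
        )""")
    return conn

def purge_soft_deleted(conn, now=None, grace_days=30, batch_size=500):
    """Hard-delete rows soft-deleted more than grace_days ago.

    Deletes in batches so no single transaction holds locks for long.
    Returns the total number of rows removed.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=grace_days)).isoformat()
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                "DELETE FROM search_history_entry WHERE entry_id IN ("
                "  SELECT entry_id FROM search_history_entry "
                "  WHERE deleted = 1 AND deleted_at < ? LIMIT ?)",
                (cutoff, batch_size),
            )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

The grace period is what makes an undo possible; if privacy requirements call for immediate erasure, the same job could run with `grace_days=0` or the delete path could remove rows directly.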
