What is a Search History System?
A search history system records a user’s past searches, enables autocomplete from personal history, and supports search history management (view, delete individual entries, clear all). Google Search, Spotify, Amazon, and YouTube all surface personal search history for faster re-search and personalization. The system must balance fast writes (every search is recorded), fast reads (history is shown instantly in the search box), privacy (users can delete their history), and storage efficiency (years of history per user).
Requirements
- Record every search query per user (query text, timestamp, result clicked)
- Autocomplete from personal history: as the user types, show matching past searches
- Show recent search history (last 20 searches) in the search dropdown
- Delete individual history entries or clear all history
- Privacy: never show one user’s history to another
- Retention: keep last 1000 searches per user; older entries expire
Data Model
SearchHistoryEntry(
    entry_id UUID PRIMARY KEY,
    user_id UUID NOT NULL,
    query VARCHAR(500) NOT NULL,
    searched_at TIMESTAMPTZ NOT NULL,
    result_clicked VARCHAR,  -- URL or item_id of clicked result (nullable)
    deleted BOOL DEFAULT false,
    INDEX (user_id, searched_at DESC)
)
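To make the model concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for the production database purely for illustration, so the column types are adapted (TEXT for the UUIDs and the timestamp, INTEGER for the boolean). The table name, index name, and the helper functions open_db and insert_entry are assumptions, not part of the original design.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Assumed SQLite rendering of SearchHistoryEntry (illustration only).
SCHEMA = """
CREATE TABLE search_history_entry (
    entry_id       TEXT PRIMARY KEY,           -- UUID stored as text
    user_id        TEXT NOT NULL,
    query          TEXT NOT NULL CHECK (length(query) <= 500),
    searched_at    TEXT NOT NULL,              -- ISO-8601 UTC timestamp
    result_clicked TEXT,                       -- nullable URL or item_id
    deleted        INTEGER NOT NULL DEFAULT 0  -- 0 = visible, 1 = soft-deleted
);
CREATE INDEX idx_history_user_time
    ON search_history_entry (user_id, searched_at DESC);
"""

def open_db():
    """Create an in-memory database with the history table and its index."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def insert_entry(conn, user_id, query, searched_at=None):
    """Insert one history row and return its generated entry_id."""
    entry_id = str(uuid.uuid4())
    ts = (searched_at or datetime.now(timezone.utc)).isoformat()
    conn.execute(
        "INSERT INTO search_history_entry (entry_id, user_id, query, searched_at) "
        "VALUES (?, ?, ?, ?)",
        (entry_id, user_id, query, ts),
    )
    return entry_id
```

The composite index on (user_id, searched_at DESC) matches the dominant read pattern: fetch one user's newest entries in order.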
Write Path
Every search triggers an asynchronous history write; the search response should never wait for history to persist:
def search(user_id, query):
    # 1. Execute search (primary operation)
    results = search_engine.query(query)
    # 2. Record history asynchronously (fire and forget)
    task_queue.enqueue(record_search_history, user_id=user_id, query=query,
                       delay=0)  # async, non-blocking
    return results
def record_search_history(user_id, query):
    # Deduplicate: if the user ran the same query recently, just refresh its timestamp
    existing = db.query('''
        SELECT entry_id FROM SearchHistoryEntry
        WHERE user_id=:uid AND query=:q AND deleted=false
          AND searched_at > NOW() - INTERVAL '7 days'
        LIMIT 1
    ''', uid=user_id, q=query)
    if existing:
        db.execute('UPDATE SearchHistoryEntry SET searched_at=NOW() WHERE entry_id=:eid',
                   eid=existing[0].entry_id)
    else:
        db.insert(SearchHistoryEntry(user_id=user_id, query=query, searched_at=now()))
    # Enforce retention limit: delete visible entries beyond the newest 1000.
    # The outer deleted=false filter leaves soft-deleted rows to the cleanup job.
    db.execute('''
        DELETE FROM SearchHistoryEntry
        WHERE user_id=:uid AND deleted=false AND entry_id NOT IN (
            SELECT entry_id FROM SearchHistoryEntry
            WHERE user_id=:uid AND deleted=false
            ORDER BY searched_at DESC LIMIT 1000
        )
    ''', uid=user_id)
    # Drop the cached list so the new search shows up in recent history right away
    redis.delete(f'search_history:{user_id}')
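The deduplicate-then-trim logic above can be exercised end to end with a small sketch. It assumes SQLite (via the standard sqlite3 module) in place of the production database, a simplified table named history, and timestamps stored as Unix epoch seconds passed in by the caller; all of these are illustrative choices rather than part of the original design. Unlike the version above, it issues the UPDATE first and inserts only when no row was touched, which removes the separate SELECT.

```python
import sqlite3
import uuid

RETENTION_LIMIT = 1000                 # from the requirements
DEDUP_WINDOW_SECONDS = 7 * 24 * 3600   # 7-day dedup window

def make_db():
    """In-memory table with only the columns this sketch needs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE history (
            entry_id    TEXT PRIMARY KEY,
            user_id     TEXT NOT NULL,
            query       TEXT NOT NULL,
            searched_at REAL NOT NULL,             -- Unix epoch seconds
            deleted     INTEGER NOT NULL DEFAULT 0
        )""")
    return conn

def record_search(conn, user_id, query, now, limit=RETENTION_LIMIT):
    with conn:  # one transaction: commit on success, roll back on error
        # Refresh the timestamp if the same query was seen inside the window.
        cur = conn.execute(
            """UPDATE history SET searched_at = ?
               WHERE user_id = ? AND query = ? AND deleted = 0
                 AND searched_at > ?""",
            (now, user_id, query, now - DEDUP_WINDOW_SECONDS))
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO history (entry_id, user_id, query, searched_at) "
                "VALUES (?, ?, ?, ?)",
                (str(uuid.uuid4()), user_id, query, now))
        # Trim only visible rows; soft-deleted rows are left untouched.
        conn.execute(
            """DELETE FROM history
               WHERE user_id = ? AND deleted = 0 AND entry_id NOT IN (
                   SELECT entry_id FROM history
                   WHERE user_id = ? AND deleted = 0
                   ORDER BY searched_at DESC LIMIT ?)""",
            (user_id, user_id, limit))
```

Passing now and limit as parameters keeps the function deterministic, so the dedup window and the retention cutoff can be tested with small values.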
Read Path: Recent History and Autocomplete
def get_recent_history(user_id, limit=20):
    # Cache in Redis: list of JSON-encoded {'query', 'ts'} objects, newest first
    key = f'search_history:{user_id}'
    cached = redis.lrange(key, 0, limit-1)
    if cached:
        return [json.loads(e) for e in cached]
    # Miss: load the largest window any caller needs (autocomplete asks for 100)
    # so a limit=20 call does not leave a truncated list in the cache
    rows = db.query('''
        SELECT query, searched_at FROM SearchHistoryEntry
        WHERE user_id=:uid AND deleted=false
        ORDER BY searched_at DESC LIMIT :limit
    ''', uid=user_id, limit=max(limit, 100))
    entries = [{'query': r.query, 'ts': r.searched_at.isoformat()} for r in rows]
    # Cache for 5 minutes
    pipe = redis.pipeline()
    pipe.delete(key)
    for e in entries:
        pipe.rpush(key, json.dumps(e))
    pipe.expire(key, 300)
    pipe.execute()
    # Return the same dict shape on a hit and on a miss
    return entries[:limit]
def autocomplete_from_history(user_id, prefix):
    # Simple prefix match from recent history
    history = get_recent_history(user_id, limit=100)
    return [e['query'] for e in history
            if e['query'].lower().startswith(prefix.lower())][:5]
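The cache-aside read and the prefix match can be tried without a Redis server. The sketch below is one possible arrangement under stated assumptions: TTLCache is a hypothetical in-process stand-in for Redis, load_from_db is a caller-supplied function standing in for the SQL query, and CACHE_WINDOW is an assumed constant. It caches one fixed-size window per user so that the dropdown (20 entries) and autocomplete (100 entries) share the same cached list, and it drops case-insensitive duplicates from the suggestions.

```python
import time

CACHE_TTL_SECONDS = 300   # 5 minutes, as in the text
CACHE_WINDOW = 100        # largest number of entries any caller asks for

class TTLCache:
    """Tiny in-process stand-in for Redis (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        item = self._data.get(key)
        if item is None or item[0] <= self._clock():
            self._data.pop(key, None)   # expired or absent
            return None
        return item[1]

    def set(self, key, value, ttl):
        self._data[key] = (self._clock() + ttl, value)

    def delete(self, key):
        self._data.pop(key, None)

def get_recent_history(cache, load_from_db, user_id, limit=20):
    key = f"search_history:{user_id}"
    entries = cache.get(key)
    if entries is None:   # miss: load once and cache the full window
        entries = load_from_db(user_id, CACHE_WINDOW)
        cache.set(key, entries, CACHE_TTL_SECONDS)
    return entries[:limit]

def autocomplete_from_history(cache, load_from_db, user_id, prefix, max_results=5):
    needle = prefix.casefold()
    seen, matches = set(), []
    for entry in get_recent_history(cache, load_from_db, user_id, CACHE_WINDOW):
        if len(matches) >= max_results:
            break
        folded = entry["query"].casefold()
        if folded.startswith(needle) and folded not in seen:
            seen.add(folded)            # skip repeats that differ only in case
            matches.append(entry["query"])
    return matches
```

Because the miss check is `entries is None`, an empty history is cached too, so users with no searches do not hit the database on every keystroke.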
Delete Operations
def delete_entry(user_id, entry_id):
    # Soft delete: keep for analytics, hide from user
    db.execute('''
        UPDATE SearchHistoryEntry SET deleted=true
        WHERE entry_id=:eid AND user_id=:uid
    ''', eid=entry_id, uid=user_id)
    redis.delete(f'search_history:{user_id}')  # invalidate cache
def clear_all_history(user_id):
    db.execute('''
        UPDATE SearchHistoryEntry SET deleted=true
        WHERE user_id=:uid AND deleted=false
    ''', uid=user_id)
    redis.delete(f'search_history:{user_id}')
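Because both operations only flag rows, the queries themselves stay in storage until something removes them. A periodic purge job is one way to do that, sketched below with sqlite3 as a stand-in database. Several pieces are assumptions introduced for illustration: the deleted_at column (the data model above has no such column), the 30-day grace period, the table name history, and the batch size.

```python
import sqlite3

GRACE_PERIOD_SECONDS = 30 * 24 * 3600   # assumed undo window; not from the text

# Assumed table: the data model plus a hypothetical deleted_at column.
SCHEMA = """
CREATE TABLE history (
    entry_id   TEXT PRIMARY KEY,
    user_id    TEXT NOT NULL,
    query      TEXT NOT NULL,
    deleted    INTEGER NOT NULL DEFAULT 0,
    deleted_at REAL                      -- epoch seconds, set when soft-deleted
)
"""

def purge_soft_deleted(conn, now, batch_size=1000):
    """Hard-delete rows that were soft-deleted more than the grace period ago.

    Works in batches so no single transaction grows very large.
    Returns the total number of rows removed.
    """
    cutoff = now - GRACE_PERIOD_SECONDS
    total = 0
    while True:
        with conn:  # commit each batch separately
            cur = conn.execute(
                """DELETE FROM history WHERE entry_id IN (
                       SELECT entry_id FROM history
                       WHERE deleted = 1 AND deleted_at <= ?
                       LIMIT ?)""",
                (cutoff, batch_size))
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

A scheduler (cron or a task queue) would call purge_soft_deleted at a fixed interval. Where privacy rules require prompt erasure, the grace period can be shortened or set to zero.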
Key Design Decisions
- Async history writes: recording history must not add latency to the search response, so the write is fire and forget
- Query deduplication: a search repeated within 7 days updates the existing entry's timestamp instead of inserting a duplicate
- Soft delete: keeps an audit trail and allows undoing an accidental deletion; a periodic cleanup job performs the hard delete
- Redis list cache: recent history is read on every keystroke, so it is served from Redis and the DB is queried only on a cache miss
- Retention enforced at write time: entries beyond the newest 1000 are deleted on each write, which avoids unbounded growth per user
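The first decision, fire-and-forget history writes, can be illustrated with only the standard library. This is a minimal sketch, assuming an in-process queue and a single worker thread in place of a real task queue; HistoryRecorder, its sink callback, and the max_pending bound are hypothetical names introduced here. An in-process queue loses pending events if the process crashes, so a production system would more likely use a durable queue.

```python
import queue
import threading

class HistoryRecorder:
    """Records searches on a background thread so callers never wait on storage."""

    def __init__(self, sink, max_pending=10_000):
        self._sink = sink   # callable(user_id, query); hypothetical storage hook
        self._queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, user_id, query):
        """Non-blocking enqueue. Returns False if the event was dropped."""
        try:
            self._queue.put_nowait((user_id, query))
            return True
        except queue.Full:
            return False    # shed load rather than slow the search

    def _run(self):
        while True:
            user_id, query = self._queue.get()
            try:
                self._sink(user_id, query)
            except Exception:
                pass        # a real system would log and possibly retry
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until everything queued so far is processed (tests, shutdown)."""
        self._queue.join()

def search(recorder, search_engine_query, user_id, query):
    results = search_engine_query(query)   # primary operation
    recorder.record(user_id, query)        # fire and forget
    return results
```

The bounded queue is a deliberate trade-off: when storage falls behind, history events are dropped instead of adding latency to search, which matches the priority stated in the list above.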