Query Cache Low-Level Design
A query cache stores database query results in fast storage (typically Redis) so repeated identical queries can be served without hitting the database. Done correctly it reduces database load significantly for read-heavy workloads. Done incorrectly it serves stale data or makes invalidation impossible to reason about.
Cache Key Generation
The cache key must uniquely identify a query and its parameters. The generation process:
- Normalize the query string: lowercase, collapse whitespace, sort parameter names alphabetically
- Serialize bound parameter values deterministically
- If results are user-scoped or tenant-scoped: include
user_idortenant_idin the key material - SHA-256 hash the combined string → use as the Redis key
Normalization is critical: two queries that differ only in whitespace or parameter ordering should produce the same cache key and share a cache entry.
Cache Storage
Store results in Redis. The value is the serialized query result (list of rows). Use SET key value EX ttl_seconds. Choose serialization format based on access patterns: MessagePack or Protocol Buffers are more compact than JSON and faster to deserialize. For large result sets, compress before storing — gzip typically reduces JSON result size by 60–80%.
TTL Selection
TTL should match data freshness requirements, not be a uniform global setting:
- Reference/config data (countries, categories, settings): 1 hour
- Product catalog: 5 minutes
- Real-time inventory or pricing: 30 seconds
- Per-second metrics or live counts: no cache
Per-query TTL configuration is stored alongside the query definition or set by the calling code via a cache hint.
Cache Lookup Flow
- Compute cache key from query + parameters
Redis GET key- Hit: decompress and deserialize → return result; record cache hit metric
- Miss: execute query against database → serialize and compress result →
Redis SET key value EX ttl NX(NX prevents overwriting a value another thread just set) → return result; record cache miss metric
Event-Based Invalidation
TTL-based expiry alone causes stale reads for the TTL duration. Event-based invalidation removes cache entries immediately on write:
- Maintain a mapping from table name to the set of cache keys that include data from that table
- On any write to table
orders: delete all cache keys in thetable:ordersset - Implement with Redis sets:
SADD table:orders {key1}on cache write;SMEMBERS table:orders+DELon table write
Tag-Based Invalidation
For entity-level granularity, tag cache entries with entity identifiers. When order #123 is updated, invalidate only entries tagged order:123, not all order queries:
- On cache write:
SADD tag:order:123 {cache_key} - On update of order 123:
SMEMBERS tag:order:123→DELeach key →DEL tag:order:123
More targeted than table-level invalidation; preserves cache entries for unrelated orders.
Cache Bypass Cases
Never cache queries that contain non-deterministic functions:
NOW(),CURRENT_TIMESTAMP,RAND(),RANDOM()— different result on each call- Queries that are user-specific but the user_id was not included in the cache key
- Queries in a write transaction — the transaction may not be committed yet
Negative Caching
An empty result set is still a valid result. Cache [] (empty result) with a short TTL (30 seconds). Without negative caching, repeated requests for a non-existent entity hit the database on every request. A short TTL ensures freshness if the entity is later created.
Thundering Herd Prevention
When a popular cache key expires, many concurrent requests will all miss and query the database simultaneously. Prevent this with a distributed lock:
- On cache miss: attempt to acquire a Redis lock for the cache key (
SET lock:{key} 1 EX 5 NX) - If lock acquired: execute query, populate cache, release lock
- If lock not acquired: wait briefly and retry the cache GET — the first holder will have populated it
Cache Warming and Metrics
- Cache warming: on deploy, pre-populate cache with results for the most frequently requested queries (identified from query analytics). Avoids cold-start latency where the first minutes after deploy have low hit rates.
- Metrics: hit rate, miss rate, eviction rate (Redis memory pressure), cache latency by key pattern, invalidation count per table/tag — alert if hit rate drops significantly (may indicate an invalidation bug or query pattern change)
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Anthropic Interview Guide 2026: Process, Questions, and AI Safety