What Is a Read-Through Cache?
In a read-through cache, the cache layer itself is responsible for loading data from the database on a miss. The application calls cache.get(key) and receives the value either way: the cache returns a cached entry on a hit, or transparently fetches the value from the DB, populates the cache, and returns it on a miss. The application has no explicit cache-miss handling logic.
This is in contrast to a cache-aside (lazy loading) pattern, where the application checks the cache, handles a miss itself by querying the DB, and manually populates the cache. Read-through centralizes that logic in the cache layer.
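The difference is easiest to see at the call site. A minimal sketch, using toy Db and SimpleCache stand-ins (both hypothetical), of what cache-aside miss handling looks like when it lives in application code:

```python
class Db:
    def query_user(self, user_id):
        # Stand-in for a real users-table query.
        return {"id": user_id, "name": "alice"}

class SimpleCache:
    # Plain key-value store with no loader: misses return None.
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value):
        self.store[key] = value

def get_user_cache_aside(cache, db, user_id):
    # Cache-aside: the application handles the miss itself.
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is None:
        value = db.query_user(user_id)  # app queries the DB on a miss
        cache.set(key, value)           # app populates the cache manually
    return value
```

Under read-through, all of the `if value is None` logic above moves inside the cache, and the application is reduced to a single `cache.get(key)` call.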
Cache Loader: Pluggable Population
A cache loader is a function registered per key pattern that the cache calls on a miss. For example:
- Key pattern user:* → loader queries the users table by user ID.
- Key pattern product:* → loader queries the products table.
Loaders are registered at startup. The cache calls loader(key) when the pattern matches a missed key, populates the cache with the result, and returns the value to the caller — all transparently.
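The pattern-to-loader mapping can be sketched with glob matching via Python's fnmatch module; the loader functions here are illustrative placeholders, not real DB queries:

```python
import fnmatch

# Toy loader registry keyed by glob pattern.
loaders = {}

def register_loader(pattern, fn):
    loaders[pattern] = fn

def resolve(key):
    # Return the first registered loader whose pattern matches the missed key.
    for pattern, fn in loaders.items():
        if fnmatch.fnmatch(key, pattern):
            return fn
    return None

# Registered at startup, as described above.
register_loader("user:*", lambda key: {"table": "users", "id": key.split(":")[1]})
register_loader("product:*", lambda key: {"table": "products", "id": key.split(":")[1]})
```

On a miss for `user:42`, `resolve` returns the users-table loader; a key matching no pattern yields `None`, which the cache can surface as a miss without population.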
Stale-While-Revalidate
Stale-while-revalidate (SWR) is a technique to eliminate cache-miss latency at the cost of brief staleness. Instead of blocking the caller while the cache refreshes an expired entry:
- Serve the stale value immediately.
- Trigger an async background refresh to reload from DB.
- The next request (after refresh completes) gets the fresh value.
SWR requires two TTL values per entry: a fresh TTL (serve without refresh) and a stale TTL (serve stale while refreshing). Once the stale TTL expires, the entry is considered fully expired and the next request blocks on a synchronous load.
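The two-TTL decision can be sketched as a small classifier; FRESH_TTL and STALE_TTL are illustrative values, with the stale window layered additively on top of the fresh window as in the implementation below:

```python
import time

FRESH_TTL = 60    # seconds: serve without refresh
STALE_TTL = 300   # additional seconds: serve stale while refreshing

def classify(entry, now=None):
    """Return 'fresh', 'stale' (serve + async refresh), or 'expired' (blocking load)."""
    now = time.time() if now is None else now
    age = now - entry["loaded_at"]
    if age < FRESH_TTL:
        return "fresh"
    if age < FRESH_TTL + STALE_TTL:
        return "stale"
    return "expired"
```

Only the "expired" outcome ever blocks the caller; "stale" trades a bounded staleness window for zero added latency.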
Versioned Cache Entries
For stronger consistency, entries carry a version_id sourced from the DB row (e.g., a row-level version counter or updated_at timestamp). On a write to the DB:
- The DB increments the version on the row.
- The write path sends an invalidation message (or updates the cache directly) with the new version.
- On next read, the cache compares its stored version against the expected version. A version mismatch triggers a synchronous reload.
This gives fine-grained invalidation without a blanket TTL-based expiry.
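A minimal sketch of the version-compare step, assuming a flat in-memory store; the entry keeps the version_id it was loaded with, and a newer incoming version forces eviction:

```python
store = {}

def put(key, value, version_id):
    # Loader records the DB row's version at load time.
    store[key] = {"value": value, "version_id": version_id}

def invalidate_if_stale(key, new_version):
    # Called by the write path (or a pub/sub listener) with the new row version.
    entry = store.get(key)
    if entry and entry["version_id"] < new_version:
        del store[key]  # next read falls through to the loader
        return True
    return False
```

An invalidation carrying the same version as the cached entry is a no-op, which makes redelivered or reordered invalidation messages harmless.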
Consistent Hashing Across Cache Nodes
In a distributed read-through cache cluster, consistent hashing ensures that a given key always maps to the same cache node. This is critical for read-through correctness: if different nodes could serve the same key, they would each maintain separate loader state and potentially issue redundant DB queries. With consistent hashing, a cache miss is always handled by the same authoritative node for that key.
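The key-to-node mapping can be sketched as a hash ring with virtual nodes (a standard consistent-hashing construction; node names and vnode count are illustrative):

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring: each node gets `vnodes` points on the ring."""
    def __init__(self, nodes, vnodes=100):
        self.ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            for i in range(vnodes):
                h = int(hashlib.md5(f"{node}#{i}".encode()).hexdigest(), 16)
                self.ring.append((h, node))
        self.ring.sort()
        self.keys = [h for h, _ in self.ring]

    def node_for(self, key):
        # Walk clockwise to the first ring point at or after the key's hash.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        idx = bisect.bisect(self.keys, h) % len(self.ring)
        return self.ring[idx][1]
```

Because `node_for` is deterministic, every router agrees on the single authoritative node for a key, and adding or removing a node remaps only the keys adjacent to its ring points rather than reshuffling the whole keyspace.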
Background Pre-Fetch
A pre-fetch job scans the cache for entries approaching their TTL expiry and proactively refreshes them before they expire. This eliminates the latency spike that would occur if many entries expired simultaneously (thundering herd). Pre-fetch candidates are identified by querying access frequency — only frequently accessed entries are worth the pre-fetch cost.
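The candidate scan can be sketched as a filter over entry metadata; PREFETCH_WINDOW and MIN_HITS are illustrative thresholds, and the hits counter is assumed to be tracked elsewhere:

```python
import time

PREFETCH_WINDOW = 30  # seconds before expiry at which to refresh
MIN_HITS = 10         # only entries accessed at least this often qualify

def prefetch_candidates(entries, now=None):
    # Return keys that are both close to expiry and frequently accessed.
    now = time.time() if now is None else now
    return [
        key for key, e in entries.items()
        if e["expires_at"] - now <= PREFETCH_WINDOW and e["hits"] >= MIN_HITS
    ]
```

A background job would run this periodically and push each candidate through the normal loader path before its TTL lapses.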
SQL Schema
CREATE TABLE CacheEntry (
cache_key TEXT PRIMARY KEY,
value JSONB NOT NULL,
version_id BIGINT,
loaded_at TIMESTAMPTZ NOT NULL DEFAULT now(),
expires_at TIMESTAMPTZ NOT NULL
);
CREATE INDEX idx_cache_expiry ON CacheEntry (expires_at);
CREATE TABLE CacheLoader (
key_pattern TEXT PRIMARY KEY,
loader_type TEXT NOT NULL,
ttl_seconds INT NOT NULL,
stale_ttl_seconds INT NOT NULL
);
Python Implementation Sketch
import fnmatch
import threading
import time
from typing import Callable, Optional

class ReadThroughCache:
    def __init__(self, db):
        self.db = db
        self.store: dict[str, dict] = {}
        self.loaders: dict[str, dict] = {}
        self.lock = threading.Lock()

    def register_loader(self, key_pattern: str, loader_fn: Callable, ttl: int, stale_ttl: int):
        self.loaders[key_pattern] = {'fn': loader_fn, 'ttl': ttl, 'stale_ttl': stale_ttl}

    def get(self, key: str) -> Optional[dict]:
        with self.lock:
            entry = self.store.get(key)
        now = time.time()
        if entry:
            if now < entry['expires_at']:
                # Fresh hit: serve directly.
                return entry['value']
            elif now < entry['stale_expires_at']:
                # Stale hit: serve the old value, refresh in the background.
                self.stale_while_revalidate(key)
                return entry['value']
        # Cold miss or fully expired entry: blocking load through the loader.
        return self.load_from_db(key)

    def load_from_db(self, key: str) -> Optional[dict]:
        loader_cfg = self._find_loader(key)
        if not loader_cfg:
            return None
        value = loader_cfg['fn'](key)
        if value is None:
            return None
        now = time.time()
        entry = {
            'value': value,
            'expires_at': now + loader_cfg['ttl'],
            'stale_expires_at': now + loader_cfg['ttl'] + loader_cfg['stale_ttl'],
        }
        with self.lock:
            self.store[key] = entry
        return value

    def stale_while_revalidate(self, key: str):
        # Async refresh so the stale read returns without blocking the caller.
        threading.Thread(target=self.load_from_db, args=(key,), daemon=True).start()

    def invalidate_version(self, key: str, new_version: int):
        with self.lock:
            entry = self.store.get(key)
            if entry and entry.get('version_id', -1) < new_version:
                del self.store[key]

    def _find_loader(self, key: str):
        for pattern, cfg in self.loaders.items():
            if fnmatch.fnmatch(key, pattern):
                return cfg
        return None
Consistency with Write Operations
Read-through caches must be paired with a consistent write strategy. Options:
- Write-through: update cache and DB synchronously on every write — cache always current.
- Write-invalidate: on write, delete the cache entry; next read reloads from DB via the loader.
- Version-based invalidation: as described above, propagate new version to trigger stale detection.
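The write-through option can be sketched as a helper that updates both stores in one synchronous step; StubDb and its update method are toy stand-ins for a real DB interface:

```python
class StubDb:
    # Toy DB used only for this sketch.
    def __init__(self):
        self.rows = {}
    def update(self, key, value):
        self.rows[key] = value

def write_through(db, cache_store, key, value, ttl, now):
    # 1. Persist to the DB first, so the cache never holds unpersisted data.
    db.update(key, value)
    # 2. Then refresh the cache entry synchronously with a fresh TTL.
    cache_store[key] = {"value": value, "expires_at": now + ttl}
```

Write-invalidate would replace step 2 with `cache_store.pop(key, None)`, deferring the reload to the read-through loader on the next read.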
Frequently Asked Questions

What is the staleness window in stale-while-revalidate?
The staleness window is the period between when a cache entry's fresh TTL expires and when its stale TTL expires. During this window, the stale value is served immediately and an async refresh is triggered. The window's duration equals the stale TTL (the additional grace period layered on top of the fresh TTL). After both TTLs expire, the next request blocks on a synchronous DB load.

What happens if the cache loader fails on a miss?
If the loader function throws an exception or returns null, the cache should not populate an entry and should propagate the error to the caller. Optionally, a negative cache entry (a null value with a short TTL) can be stored to prevent a thundering herd of repeated DB queries for a missing key. Retries should use exponential backoff.

How does version-based invalidation work in a read-through cache?
Each cache entry stores the version_id of the DB row at load time. When the DB row is updated, the new version is broadcast (via pub/sub or a direct cache call). The cache checks whether its stored version is stale and evicts the entry if so. The next read triggers the loader to fetch the current row. This avoids TTL-based latency while maintaining consistency.

Why is consistent hashing important for a read-through cache cluster?
Consistent hashing ensures a given key always routes to the same cache node. Without it, multiple nodes might each handle a miss for the same key independently, causing N parallel DB queries and N separate cached copies that can diverge. With consistent hashing, each key has exactly one authoritative node that owns its loader state, preventing redundant loads and ensuring a single cached copy.

How does a read-through cache populate on a miss?
On a cache miss, the cache itself (not the application) transparently fetches the value from the backing store, stores it under the requested key with a configured TTL, and returns it to the caller. This keeps cache-population logic centralized in the cache layer rather than scattered across application code.

How is stale-while-revalidate implemented?
The cache returns the stale (expired) entry immediately to the caller while triggering an asynchronous background refresh against the backing store, so the next request receives a fresh value with no added latency. A per-key lock or CAS flag ensures only one background refresh is in flight at a time, preventing redundant revalidation requests.

How are thundering herds prevented on a cache miss?
A per-key mutex or probabilistic early expiration (PER) pattern ensures that only the first request triggers a backing-store fetch while concurrent requests for the same key block or receive the stale value. Distributed cache systems use a lock token stored atomically (e.g., via Redis SET NX) so only one node performs the fetch across the entire fleet.

How does read-through handle TTL expiry?
When the TTL of a cached entry expires, the next read triggers a synchronous or asynchronous reload from the backing store using the same read-through path as a cold miss. To avoid latency spikes at expiry boundaries, TTL values are often jittered (randomized within a range) so entries for different keys expire at staggered times rather than all at once.