Low Level Design: LRU Cache

Core Data Structure

An LRU cache combines two data structures to achieve O(1) time for every operation. The doubly linked list maintains access order: the most-recently-used node lives at the head, the least-recently-used node lives at the tail. When an item is accessed it gets unlinked from its current position and reinserted at the head. When eviction is needed the tail node is dropped in O(1) time.

The hashmap maps each key directly to its node pointer in the linked list. Without the hashmap, finding a node in the list would require O(N) traversal. With it, every lookup is O(1). The two structures together deliver O(1) get, O(1) put, and O(1) eviction.

Pseudocode for the node structure:

class Node:
    key, value
    prev: Node
    next: Node

class LRUCache:
    capacity: int
    map: HashMap<key, Node>
    head: Node  // dummy head (MRU side)
    tail: Node  // dummy tail (LRU side)

Dummy sentinel nodes at head and tail eliminate edge-case null checks during insert and remove operations.
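
The sentinel trick can be sketched in Python (class and method names here are illustrative, not a fixed API):

```python
class Node:
    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None

class DList:
    """Doubly linked list with dummy sentinels; head side is MRU."""
    def __init__(self):
        self.head, self.tail = Node(), Node()
        self.head.next, self.tail.prev = self.tail, self.head

    def remove(self, node):
        # No null checks needed: every real node sits between two nodes.
        node.prev.next = node.next
        node.next.prev = node.prev

    def insert_at_head(self, node):
        # Splice node between the dummy head and the current MRU entry.
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node
```

Because the sentinels are never removed, `remove` and `insert_at_head` have no special cases for an empty list or for the first/last element.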

Get Operation

The get path is straightforward:

  1. Look up the key in the hashmap. If absent, return -1.
  2. If found, unlink the node from its current position in the list.
  3. Reinsert the node immediately after the dummy head (MRU position).
  4. Return the node’s value.

Unlinking a doubly linked node updates two pointers (node.prev.next and node.next.prev); reinserting it at the head updates four more (node.prev, node.next, head.next, and the old first node's prev). All of this is constant time. Total complexity: O(1).

def get(key):
    if key not in map: return -1
    node = map[key]
    move_to_head(node)
    return node.value

Put Operation

Put covers two cases:

Key already exists: Update the node’s value in place, then move the node to the head (same as the get path). No capacity change.

Key is new:

  1. If the cache is at capacity, evict the tail node: unlink it from the list and delete its entry from the hashmap.
  2. Create a new node, insert it at the head, and add it to the hashmap.

def put(key, value):
    if key in map:
        map[key].value = value
        move_to_head(map[key])
    else:
        if len(map) == capacity:
            lru = tail.prev       # node just before dummy tail
            remove_node(lru)
            del map[lru.key]
        node = Node(key, value)
        insert_at_head(node)
        map[key] = node

Every step is pointer manipulation or a hashmap operation — all O(1).
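
Putting the pieces together, a minimal single-threaded implementation in Python might look like this (a sketch, not production code):

```python
class Node:
    __slots__ = ("key", "value", "prev", "next")
    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}
        # Dummy sentinels: head.next is the MRU entry, tail.prev is the LRU.
        self.head, self.tail = Node(), Node()
        self.head.next, self.tail.prev = self.tail, self.head

    def _remove(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _insert_at_head(self, node):
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.map:
            return -1
        node = self.map[key]
        self._remove(node)           # unlink from current position
        self._insert_at_head(node)   # promote to MRU
        return node.value

    def put(self, key, value):
        if key in self.map:
            node = self.map[key]
            node.value = value
            self._remove(node)
            self._insert_at_head(node)
        else:
            if len(self.map) == self.capacity:
                lru = self.tail.prev         # node just before dummy tail
                self._remove(lru)
                del self.map[lru.key]
            node = Node(key, value)
            self._insert_at_head(node)
            self.map[key] = node
```

A quick trace: with capacity 2, after put(1), put(2), get(1), a put(3) evicts key 2, because key 1 was promoted by the get.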

Eviction Policy

LRU evicts the tail node, which is always the least recently accessed entry. Eviction is triggered the moment a put operation would exceed capacity.

Key design decisions around eviction:

  • Eviction order: Strictly LRU — the node that was accessed least recently is always removed first. No priority weighting by frequency.
  • Eviction callback: A common extension is an onEvict(key, value) callback invoked when a node is dropped. In a cache-aside pattern this callback can asynchronously reload the value from a backing store, keeping the cache warm.
  • Scan pollution: LRU is vulnerable to sequential scans. A single full-scan access pattern evicts all warm entries. This is addressed by LRU-K or segmented LRU variants (see below).
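
The eviction-callback extension can be sketched compactly with collections.OrderedDict standing in for the map-plus-list pair (the on_evict parameter name is illustrative):

```python
from collections import OrderedDict

class LRUCacheWithEvictHook:
    """LRU cache that notifies a callback when an entry is evicted."""
    def __init__(self, capacity, on_evict=None):
        self.capacity = capacity
        self.on_evict = on_evict
        self._data = OrderedDict()   # insertion order; last item = MRU

    def get(self, key):
        if key not in self._data:
            return -1
        self._data.move_to_end(key)  # promote to MRU
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        elif len(self._data) == self.capacity:
            old_key, old_value = self._data.popitem(last=False)  # drop LRU
            if self.on_evict:
                self.on_evict(old_key, old_value)
        self._data[key] = value
```

In a real system the callback body should be cheap or hand off to a queue, since it runs inside the put path.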

Thread Safety

A single-threaded LRU cache is straightforward. Concurrent access requires synchronization because get is not read-only — it mutates the list order.

Coarse-grained mutex: A single ReentrantLock or synchronized block wraps every get and put. Simple and correct, but serializes all access. Suitable for moderate concurrency.

Read-write lock: Allows multiple concurrent readers on a true read path, but since LRU get mutates list order this only helps if you separate a "peek" operation (read without promotion) from a normal get.

Lock-free approaches: Use ConcurrentHashMap for the key lookup layer and synchronize only the linked list manipulation. This reduces contention but complicates correctness significantly.

Java LinkedHashMap shortcut: new LinkedHashMap(capacity, 0.75f, true) with accessOrder=true and an overridden removeEldestEntry gives a single-threaded LRU in ~5 lines. Wrap with Collections.synchronizedMap for thread safety, though this is coarse-grained.
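
Python's standard library offers a comparable shortcut: the functools.lru_cache decorator memoizes a function with a bounded LRU cache and is thread-safe out of the box.

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def expensive_lookup(key):
    # Stand-in for a slow computation or I/O call.
    return key * 2

expensive_lookup(21)   # miss: computed and cached
expensive_lookup(21)   # hit: served from the cache
```

cache_info() exposes hit/miss counters, which is handy for verifying that the cache is actually absorbing traffic.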

For high-throughput systems, consider a segmented cache: divide the keyspace into N independent shards, each with its own lock. Contention drops by a factor of N.
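
A sharded design can be sketched as follows (shard count and the use of Python's built-in hash are illustrative choices):

```python
import threading
from collections import OrderedDict

class ShardedLRUCache:
    """N independent LRU shards, each guarded by its own lock."""
    def __init__(self, capacity, num_shards=16):
        self.num_shards = num_shards
        self.per_shard_capacity = max(1, capacity // num_shards)
        self.shards = [OrderedDict() for _ in range(num_shards)]
        self.locks = [threading.Lock() for _ in range(num_shards)]

    def _shard_index(self, key):
        return hash(key) % self.num_shards

    def get(self, key):
        i = self._shard_index(key)
        with self.locks[i]:
            shard = self.shards[i]
            if key not in shard:
                return -1
            shard.move_to_end(key)   # promote within this shard only
            return shard[key]

    def put(self, key, value):
        i = self._shard_index(key)
        with self.locks[i]:
            shard = self.shards[i]
            if key in shard:
                shard.move_to_end(key)
            elif len(shard) == self.per_shard_capacity:
                shard.popitem(last=False)   # evict this shard's LRU
            shard[key] = value
```

Note that eviction becomes approximate: each shard evicts its own LRU entry, not the globally least recently used key.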

Capacity and Memory Management

Capacity is set at construction and is immutable. Memory footprint per node:

  • Key size (varies by type)
  • Value size (varies by type)
  • Two pointers: prev and next (8 bytes each on a 64-bit JVM without compressed oops; 4 bytes each with them)
  • Hashmap entry overhead (~32 bytes on JVM)

Total cache memory ≈ capacity * (key_size + value_size + pointer_overhead). For a cache of 10,000 entries holding 1 KB values, expect ~10 MB plus overhead.

Soft references (JVM): Wrapping cached values in SoftReference<V> lets the GC evict entries under memory pressure before throwing OutOfMemoryError. Softer than strong references but harder than weak references. Useful for purely in-process caches where memory pressure is unpredictable. The downside: GC-triggered eviction is non-deterministic and may not respect LRU order.
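
Python has no direct soft-reference equivalent, but weakref.WeakValueDictionary illustrates the same idea in a stronger form: entries vanish as soon as no other strong reference to the value exists, rather than only under memory pressure.

```python
import weakref

class Payload:
    """Values must be weakly referenceable; plain class instances are."""
    def __init__(self, data):
        self.data = data

cache = weakref.WeakValueDictionary()
obj = Payload("large blob")
cache["k"] = obj
assert cache.get("k") is obj   # present while a strong reference exists
del obj                        # drop the only strong reference
# In CPython, refcounting collects the object at once and the
# cache entry disappears; other interpreters may collect later.
```

As with SoftReference, eviction here is GC-driven and does not respect LRU order.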

Distributed LRU Cache

Single-node LRU does not survive process restart and cannot scale beyond one machine’s memory. Distributed caching introduces new concerns.

Redis as distributed LRU: Set maxmemory to cap memory usage and maxmemory-policy allkeys-lru to evict the least recently used key across all keys when the limit is reached. Redis approximates LRU by sampling a configurable number of random keys (maxmemory-samples, default 5) and evicting the least recently used among the sample — a probabilistic but highly effective approximation.
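
In redis.conf this looks like the following (the 2gb cap and sample size of 10 are illustrative values):

```conf
# Cap memory and evict approximately-LRU across all keys
maxmemory 2gb
maxmemory-policy allkeys-lru
# Larger samples approximate true LRU more closely, at some CPU cost
maxmemory-samples 10
```

The same settings can be applied at runtime with CONFIG SET, without a restart.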

Consistent hashing: Distribute cache keys across multiple Redis nodes using consistent hashing. Each key maps to a specific node; adding or removing nodes only remaps a fraction of keys. Libraries like Jedis with JedisCluster or ioredis handle this transparently.

Two-tier L1/L2 architecture: Each application node keeps a small in-process LRU cache (L1) for the hottest keys. L1 misses fall through to a shared Redis cluster (L2). This dramatically reduces Redis round-trips for the top percentile of traffic while keeping the distributed cache as the source of truth. Invalidation of L1 entries is the main complexity — typically handled via Redis pub/sub or TTL expiry.

Variants

LRU-K: Instead of tracking only the most recent access time, LRU-K tracks the K-th most recent access time and evicts the entry with the oldest K-th access. LRU-2 is the most common variant. This makes the policy more resistant to scan pollution: a one-time sequential scan does not evict entries that have been accessed K or more times. Used in some database buffer pool implementations.
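
A toy LRU-2 can be sketched with a per-key deque of logical access timestamps (this linear-scan eviction is O(N); a production version would keep victims in a priority queue):

```python
from collections import deque

class LRUKCache:
    """Illustrative LRU-K sketch (K=2 by default), not production code."""
    def __init__(self, capacity, k=2):
        self.capacity = capacity
        self.k = k
        self.data = {}
        self.history = {}   # key -> deque of the last K access times
        self.clock = 0      # logical clock; a real impl might use wall time

    def _touch(self, key):
        self.clock += 1
        self.history.setdefault(key, deque(maxlen=self.k)).append(self.clock)

    def _kth_access(self, key):
        h = self.history[key]
        # Oldest retained timestamp is the K-th most recent access;
        # keys with fewer than K accesses rank oldest and go first.
        return h[0] if len(h) == self.k else float("-inf")

    def get(self, key):
        if key not in self.data:
            return -1
        self._touch(key)
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) == self.capacity:
            victim = min(self.data, key=self._kth_access)
            del self.data[victim]
            del self.history[victim]
        self.data[key] = value
        self._touch(key)
```

Note the scan resistance: a key accessed twice survives eviction even when a more recently inserted, once-touched key is present, which is the opposite of plain LRU.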

Segmented LRU (SLRU): Divides the cache into two segments: a probationary segment and a protected segment. New entries enter the probationary segment. On a second access, an entry is promoted to the protected segment; if the protected segment is full, its LRU entry is demoted back to probation rather than evicted outright. Eviction targets the tail of the probationary segment, so one-time accesses never displace proven-hot entries. This yields frequency awareness without explicit counters, and the Caffeine cache library uses an SLRU as the main region of its W-TinyLFU design, behind a small admission window and a TinyLFU frequency filter.
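
The two-segment structure can be sketched with a pair of OrderedDicts (segment capacities are illustrative parameters):

```python
from collections import OrderedDict

class SLRUCache:
    """Illustrative segmented-LRU sketch; not production code."""
    def __init__(self, probation_capacity, protected_capacity):
        self.probation = OrderedDict()   # new entries land here
        self.protected = OrderedDict()   # promoted on second access
        self.probation_capacity = probation_capacity
        self.protected_capacity = protected_capacity

    def _insert_probation(self, key, value):
        if len(self.probation) == self.probation_capacity:
            self.probation.popitem(last=False)   # evict probationary LRU
        self.probation[key] = value

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)
            return self.protected[key]
        if key in self.probation:
            # Second access: promote into the protected segment.
            value = self.probation.pop(key)
            if len(self.protected) == self.protected_capacity:
                # Demote the protected LRU back to probation, not out.
                old_key, old_val = self.protected.popitem(last=False)
                self._insert_probation(old_key, old_val)
            self.protected[key] = value
            return value
        return -1

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
        elif key in self.probation:
            self.probation[key] = value
        else:
            self._insert_probation(key, value)
```

A sequential scan only churns the probationary segment: entries already promoted to the protected segment survive it.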


What data structures implement an O(1) LRU cache?

A doubly linked list combined with a hash map. The hash map provides O(1) access to any node; the doubly linked list maintains recency order. On get, the node is unlinked and moved to the head. On put, if the cache is full, the tail node is evicted.

How does LRU cache differ from LFU cache?

LRU evicts the least recently used item regardless of frequency. LFU evicts the item with the lowest access frequency. LRU is simpler and performs better for workloads with temporal locality. LFU better handles frequency-skewed workloads where some items are accessed far more often than others.

How do you make an LRU cache thread-safe?

Use a read-write lock or a segment lock. For high concurrency, use a concurrent hash map with per-segment LRU lists. Java's LinkedHashMap with synchronized access, or the Caffeine library (successor to ConcurrentLinkedHashMap), provides thread-safe near-LRU semantics.

What is a distributed LRU cache?

A distributed LRU cache shards key space across multiple nodes using consistent hashing. Each node maintains a local LRU list. On eviction, only the local shard evicts entries, so the global eviction policy is approximate. Redis and Memcached implement distributed LRU with maxmemory-policy settings.

What is LRU-K and when is it used?

LRU-K tracks the K-th most recent access time rather than the most recent access. This prevents a single scan from flushing frequently accessed items. LRU-K was originally proposed for database buffer management, where scan resistance matters; PostgreSQL's buffer pool uses a related clock-sweep scheme with per-buffer usage counts to the same end.
