Introduction
Consistent hashing assigns keys to nodes such that adding or removing a node remaps only K/N keys on average, where K is the total number of keys and N is the number of nodes. In contrast, naive modulo hashing (key % N) remaps nearly all keys when N changes. Consistent hashing is critical for distributed caches, databases, and load balancers that must scale without wholesale cache invalidation or data migration.
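The difference is easy to check empirically. A minimal sketch (the key names and the 10-to-11-node change are illustrative; `hashlib.md5` stands in for any stable hash, since Python's built-in `hash()` is salted per process):

```python
import hashlib

def h(key: str) -> int:
    # Stable 32-bit hash; Python's built-in hash() is salted per process.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")

keys = [f"key-{i}" for i in range(10_000)]

def moved_by_modulo(n_before: int, n_after: int) -> int:
    # Count keys whose bucket changes under naive key % N placement.
    return sum(1 for k in keys if h(k) % n_before != h(k) % n_after)

# Going from 10 to 11 nodes remaps roughly 90% of keys under modulo
# placement, versus about 1/11 of keys under consistent hashing.
print(moved_by_modulo(10, 11) / len(keys))
```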
Hash Ring
Both keys and nodes are mapped to points on a circular hash space ranging from 0 to 2^32 – 1. Node n maps to hash(n) on the ring; key k is assigned to the first node clockwise from hash(k). When a node is added, the new node takes over the portion of keys previously owned by its clockwise neighbor. When a node is removed, its keys are reassigned to its clockwise successor. In both cases, only K/N keys are remapped on average. The weakness of basic consistent hashing is uneven distribution: with few nodes, hash positions may cluster and leave some nodes handling far more keys than others.
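A minimal ring along these lines can be kept as a sorted list of node hashes with binary-search lookup. This is a sketch, not a production implementation; the node and key names are illustrative, and `hashlib.md5` stands in for any stable hash onto the 0 to 2^32 - 1 space:

```python
import bisect
import hashlib

def h32(s: str) -> int:
    # Stable hash onto the 0 .. 2^32 - 1 ring.
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:4], "big")

class Ring:
    def __init__(self, nodes):
        points = sorted((h32(n), n) for n in nodes)
        self._hashes = [p for p, _ in points]
        self._nodes = [n for _, n in points]

    def owner(self, key: str) -> str:
        # First node at or clockwise from the key's position,
        # wrapping past 2^32 - 1 back to the smallest point.
        i = bisect.bisect_left(self._hashes, h32(key)) % len(self._hashes)
        return self._nodes[i]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.owner("user:42"))
```

Removing a node that does not own a key never moves that key, since the key's first clockwise successor is still present.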
Virtual Nodes
Each physical node maps to V virtual nodes on the ring, typically 100 to 200. This solves several problems: (1) with N*V points on the ring, load distributes far more uniformly; (2) when a node is added, it draws keys evenly from all existing nodes rather than only from its single clockwise neighbor; (3) heterogeneous capacity is handled naturally, since a more powerful node can be given proportionally more virtual nodes. The downside is the memory and lookup cost of a larger ring structure, but even 100 nodes at V = 200 is only 20,000 ring entries. Cassandra and DynamoDB both use virtual nodes (vnodes) for this reason.
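Virtual nodes are a small change to a ring implementation: each physical node contributes V points, hashed from a name like "n0#17". A self-contained sketch (V = 150, the node names, and the 50,000 sample keys are all illustrative):

```python
import bisect
import hashlib
from collections import Counter

def h32(s: str) -> int:
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:4], "big")

class VNodeRing:
    def __init__(self, nodes, vnodes: int = 150):
        # Each physical node owns `vnodes` points, e.g. "n0#0" .. "n0#149".
        points = sorted((h32(f"{n}#{i}"), n)
                        for n in nodes for i in range(vnodes))
        self._hashes = [p for p, _ in points]
        self._owners = [n for _, n in points]

    def owner(self, key: str) -> str:
        i = bisect.bisect_left(self._hashes, h32(key)) % len(self._hashes)
        return self._owners[i]

# With 5 nodes x 150 vnodes, per-node load over 50,000 keys stays
# close to uniform; the max/min load ratio printed here is near 1.
ring = VNodeRing([f"n{i}" for i in range(5)])
load = Counter(ring.owner(f"k{i}") for i in range(50_000))
print(max(load.values()) / min(load.values()))
```

With a single point per node, the same experiment routinely produces a node carrying several times the load of another.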
Load Imbalance and Bounded Loads
Even with virtual nodes, load variance persists and can become significant under skewed key-access patterns. Google's 2017 consistent hashing with bounded loads adds a hard constraint: no node may handle more than (1 + epsilon) times the average load. When a key's assigned node is at capacity, the algorithm probes the next clockwise node on the ring. With epsilon = 0.25, load stays near-uniform at the cost of a slightly higher average probe count. Google applied the technique in Cloud Pub/Sub, and it has since been adopted in HAProxy, keeping backends balanced in cases where consistent hashing alone would create hotspots.
Rendezvous Hashing (Highest Random Weight)
Rendezvous hashing is an alternative to the ring structure. For key k, assign it to the node n* = argmax over all nodes n of hash(k, n). Every node independently computes the hash of the (key, node) pair and the node with the highest hash value wins. No shared ring data structure is required. The algorithm produces consistent results: any node that knows the node list computes the same assignment. On node removal, affected keys are reassigned to the node with the second-highest hash for each key. The cost is O(N) computation per lookup, acceptable for CDN origin selection where N is small and correctness matters more than lookup speed.
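A sketch of the argmax rule (hashing a "key|node" pair with md5 is one common way to get a stable per-pair score; the node names and URL are illustrative):

```python
import hashlib

def score(key: str, node: str) -> int:
    # Per-(key, node) weight; any good hash of the pair works.
    return int.from_bytes(hashlib.md5(f"{key}|{node}".encode()).digest()[:8], "big")

def owner(key: str, nodes) -> str:
    # O(N) per lookup: every node is scored and the highest wins.
    return max(nodes, key=lambda n: score(key, n))

nodes = ["edge-a", "edge-b", "edge-c"]
url = "/video/123.mp4"
winner = owner(url, nodes)
# Any node that knows the node list computes the same assignment,
# regardless of list order.
print(winner == owner(url, list(reversed(nodes))))  # True
```

A weighted variant is equally simple: multiply or scale each node's score by its capacity before taking the argmax.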
Jump Consistent Hashing
Jump consistent hashing maps a 64-bit key to one of n sequentially numbered buckets in O(1) space and expected O(log n) time; there is no ring to store at all. The algorithm is: b = -1, j = 0; while j < num_buckets { b = j; key = key * 2862933555777941757 + 1; j = floor((b + 1) * (2^31 / ((key >> 33) + 1))) }; return b. The constraint is that buckets must be numbered sequentially and new buckets can only be added at the end; arbitrary bucket removal is not supported. This makes jump consistent hashing ideal for scenarios where the node count grows monotonically, such as storage shards that are never removed, and it is extremely fast in practice due to its minimal computation.
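The loop runs essentially as written; here is a runnable version (Python integers are unbounded, so the 64-bit multiply must be masked explicitly):

```python
def jump_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash: 64-bit key -> bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF  # 64-bit LCG step
        # Jump forward; expected number of iterations is about ln(num_buckets).
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

# Growing from 10 to 11 buckets: each key either stays put or moves
# to the new bucket 10, so only about 1/11 of keys move.
moved = sum(jump_hash(k, 10) != jump_hash(k, 11) for k in range(100_000))
print(moved / 100_000)
```

The consistency guarantee falls out of the construction: the (b, key) sequence for n buckets is a prefix of the sequence for n + 1 buckets, so growing the bucket count can only move a key to the newly added bucket.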
Applications
Consistent hashing is foundational in distributed systems. Memcached and Redis clusters use client-side consistent hashing to map cache keys to specific nodes, ensuring cache hits remain stable when nodes are added or removed. DynamoDB and Cassandra use a partition ring where each partition key maps to a primary node and its replica set. CDN providers use consistent hashing (often rendezvous hashing) to map content URLs to edge servers, ensuring a given URL is consistently served by the same edge node to maximize cache hit rates. The core guarantee — that only K/N keys are remapped per node change — is what makes horizontal scaling practical in all of these systems.