Question 1

How do virtual nodes improve load balancing in consistent hashing?

Accepted Answer

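In basic consistent hashing, each physical node occupies a single point on the hash ring, so the arcs between nodes (and therefore the load) can be very uneven, especially when the cluster is small or nodes have different capacities. Virtual nodes (vnodes) give each physical node many positions on the ring, commonly 100-200 per node. A key maps to the nearest vnode clockwise, which in turn maps to its physical owner. This has three effects. Load spreads more uniformly: the relative standard deviation of per-node load shrinks roughly as O(1/sqrt(v)) for v vnodes per node. Heterogeneous nodes are handled by giving each node vnodes in proportion to its capacity (a server with 2x the capacity gets 2x the vnodes). Rebalancing is spread out: a joining node takes small slices from many existing nodes rather than one large slice from a single neighbor, and a leaving node hands its slices to many successors. The total amount of data moved is about the same; what changes is how many nodes share the work. Apache Cassandra uses vnodes (configured through num_tokens), and the Amazon Dynamo paper describes the same technique. The replication factor is a separate setting layered on top of the ring.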
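
The vnode lookup described above can be sketched as follows. This is a minimal illustration, not any particular database's implementation: the class name VNodeRing, the helper h64, the "node#i" vnode labels, and the choice of MD5 as the ring hash are all assumptions made for the example.

```python
import bisect
import hashlib
from collections import Counter


def h64(s):
    """Stable 64-bit hash. Truncated MD5 is an arbitrary choice for this sketch."""
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")


class VNodeRing:
    """Hash ring where each physical node owns many points (hypothetical helper)."""

    def __init__(self, weights, vnodes_per_weight=100):
        # weights maps node name -> integer capacity weight.
        # A node with weight 2 gets twice as many vnodes as a node with weight 1.
        points = []
        for node, weight in weights.items():
            for i in range(weight * vnodes_per_weight):
                points.append((h64(f"{node}#{i}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._owners = [n for _, n in points]

    def lookup(self, key):
        # First vnode clockwise from the key's position, wrapping past the end.
        idx = bisect.bisect_right(self._hashes, h64(key)) % len(self._hashes)
        return self._owners[idx]


if __name__ == "__main__":
    ring = VNodeRing({"a": 1, "b": 1, "c": 2})
    counts = Counter(ring.lookup(f"key-{i}") for i in range(20000))
    print(counts)  # "c" should receive roughly half of the keys
```

Building a second ring with one extra node and comparing lookups shows the rebalancing property: every key whose owner changes moves to the new node, and those keys come from several of the old nodes.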

Question 2

What is bounded load consistent hashing and why does it matter?

Accepted Answer

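Bounded load consistent hashing (introduced by Mirrokni, Thorup, and Zadimoghaddam at Google and publicized in 2017) adds a capacity constraint to the ring: each server accepts at most ceil((1 + epsilon) * n/k) requests, where n is the total number of requests currently assigned, k is the number of servers, and epsilon > 0 is the tolerated imbalance (0.25 is a common example, meaning no server exceeds 125% of the average load, rounded up). When a request's preferred server is at capacity, the request falls through to the next server clockwise that still has room. This keeps hot spots from overwhelming individual nodes; plain consistent hashing offers no such guarantee, so a popular key range can overload a node. The algorithm is online: each request is placed using only the current per-server load counts, without recomputing existing assignments, although whoever makes the decision does need those counts. A lookup costs the usual O(log k) ring search plus a forward scan past full servers, whose expected length is governed by epsilon rather than by k. Google has described using it in Cloud Pub/Sub, and HAProxy exposes it through its hash-balance-factor setting. It matters most when key popularity is skewed (celebrity users, viral content).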
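
A minimal sketch of the overflow rule, under simplifying assumptions: one ring point per server (no vnodes), loads that only grow (a real balancer would decrement a server's count when a request completes), and a single process holding all counters. The names BoundedLoadRing, assign, and capacity are hypothetical.

```python
import bisect
import hashlib
import math


def h64(s):
    """Stable 64-bit hash. Truncated MD5 is an arbitrary choice for this sketch."""
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")


class BoundedLoadRing:
    def __init__(self, servers, epsilon=0.25):
        self.epsilon = epsilon
        self.ring = sorted((h64(s), s) for s in servers)
        self.hashes = [h for h, _ in self.ring]
        self.load = {s: 0 for s in servers}
        self.total = 0

    def capacity(self):
        # Cap = ceil((1 + epsilon) * n / k), with n counting the request being placed.
        return math.ceil((1 + self.epsilon) * (self.total + 1) / len(self.load))

    def assign(self, key):
        cap = self.capacity()
        start = bisect.bisect_right(self.hashes, h64(key))
        # Walk clockwise from the preferred server until one has spare capacity.
        for step in range(len(self.ring)):
            server = self.ring[(start + step) % len(self.ring)][1]
            if self.load[server] < cap:
                self.load[server] += 1
                self.total += 1
                return server
        # Cannot happen: k * cap is strictly greater than the current total load.
        raise RuntimeError("no server below capacity")


if __name__ == "__main__":
    ring = BoundedLoadRing(["s1", "s2", "s3", "s4"], epsilon=0.25)
    for _ in range(1000):
        ring.assign("hot-key")  # worst case: every request prefers the same server
    print(ring.load)
```

Even when every request hashes to the same position, no server ends up above ceil(1.25 * 1000 / 4) requests, because the cap only grows as the total grows and each placement respects the cap in force at that moment.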

Question 3

How does rendezvous hashing compare to ring-based consistent hashing?

Accepted Answer

Rendezvous hashing (highest random weight, HRW) assigns a key to the server with the highest score(key, server_i), where the score is typically a hash of the key combined with the server ID. To find the owner, score every server and pick the maximum, giving O(k) lookup versus O(log k) for a ring kept in a sorted structure. When a node is removed, only the keys it owned are redistributed (each to its next-highest-scoring server); all other keys stay put, matching the minimal-disruption property of ring hashing. Advantages of rendezvous hashing: a simpler implementation (no ring, no vnodes), even distribution without vnodes, and straightforward weighted variants (list a high-capacity server more than once, or use a weighted score function). Disadvantage: the O(k) lookup cost grows linearly with cluster size, which becomes noticeable on hot paths once a cluster reaches roughly a thousand nodes or more. Ring hashing with vnodes therefore scales better for very large clusters, while rendezvous hashing suits smaller, stable server sets such as CDN origin selection.
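
The unweighted scheme fits in a few lines. In this sketch the function names score and owner, the "server|key" concatenation, and the use of truncated SHA-256 are illustrative assumptions; any well-mixed hash of the pair would do.

```python
import hashlib


def score(key, server):
    """Pseudo-random weight for a (key, server) pair; truncated SHA-256 is an assumption."""
    digest = hashlib.sha256(f"{server}|{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def owner(key, servers):
    """O(k) lookup: score every server and return the one with the highest weight."""
    return max(servers, key=lambda s: score(key, s))


if __name__ == "__main__":
    servers = ["a", "b", "c", "d"]
    print(owner("user:1001", servers))
    # Removing a server only affects keys that server owned.
    print(owner("user:1001", [s for s in servers if s != "c"]))
```

Because each score depends only on the key and one server, dropping a server cannot change the ranking among the remaining ones, which is why keys owned by other servers never move.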

Question 4

How does jump consistent hashing work and what are its tradeoffs?

Accepted Answer

Jump consistent hashing (Lamping and Veach, Google, 2014) maps a 64-bit key to a bucket in [0, n) using a stateless algorithm. The key seeds a linear congruential generator. Starting from b = -1 and j = 0, each iteration sets b = j, advances the generator, and computes the next candidate j = floor((b + 1) * 2^31 / ((key >> 33) + 1)); the loop stops when j >= n and returns b, the last candidate below n. It runs in expected O(log n) time, needs no memory beyond a few integers (no ring, no table), and spreads keys almost perfectly evenly across buckets. The critical constraint: buckets are numbered 0 to n-1 and can only be added or removed at the end. Removing an arbitrary bucket means renumbering the others, which remaps far more keys than the ideal 1/n. This makes it well suited to append-only expansion (adding shards to a database cluster) but unsuitable when an arbitrary node can disappear (for example, a node in the middle of the numbering fails). The authors position it for data storage sharding rather than web caching. Compared with ring hashing, jump hashing is faster and uses less memory but cannot remove arbitrary nodes.
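
A Python transcription of the loop from the Lamping and Veach paper is shown below. The multiplier is the paper's LCG constant; the explicit 64-bit mask stands in for the unsigned overflow that the original C++ gets for free. The function name jump_hash is an assumption for this example.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_hash(key, num_buckets):
    """Map a 64-bit integer key to a bucket in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # Advance the linear congruential generator, emulating uint64 wraparound.
        key = (key * 2862933555777941757 + 1) & MASK64
        # Next candidate bucket; it is always strictly greater than b.
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


if __name__ == "__main__":
    for n in (4, 5, 6):
        print(n, [jump_hash(k, n) for k in range(10)])
```

Growing the cluster from n to n + 1 buckets leaves each key either where it was or in the new bucket n, since the sequence of candidate jumps for a key does not depend on n. String keys would first need to be hashed to a 64-bit integer by some other function.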

Question 5

How do you handle hotspot nodes in consistent hashing?

Accepted Answer

Hotspots occur when a small set of keys (celebrity users, trending items) sends disproportionate traffic to one node. Mitigation strategies: (1) Key salting: append a suffix in the range 0 to R-1 to the key before hashing, so one logical key is spread over up to R nodes. For write-heavy keys such as counters, each write picks a random suffix and reads query all R sub-keys and merge the results; for read-heavy keys, write to all R copies and read from one chosen at random. (2) Bounded load hashing: enforce per-node capacity limits and overflow to the next node. (3) Dedicated hot-key cache: detect hot keys by approximate frequency counting (for example, a Count-Min Sketch) and serve them from a separate, replicated cache tier. (4) Local in-process caching: for extreme hot keys, cache at the application layer to absorb traffic before it reaches the distributed cache. (5) Shard splitting: dynamically split the hot node's virtual node ranges across two physical nodes. This helps when the heat comes from a range of keys, but not when a single key is hot, since one key always lands on one node. Large social platforms typically combine several of these techniques to handle traffic spikes on celebrity accounts.
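
Key salting for a write-heavy counter might look like the sketch below. The in-memory dict stands in for a distributed key-value store, and the class name SaltedCounter, the "key#i" sub-key format, and the default of 8 shards are assumptions made for illustration.

```python
import random


class SaltedCounter:
    """Spread increments for one logical key across several salted sub-keys."""

    def __init__(self, shards=8, rng=None):
        self.shards = shards
        self.rng = rng or random.Random()
        # Stand-in for a distributed store; each sub-key would hash to its own node.
        self.store = {}

    def _sub_keys(self, key):
        return [f"{key}#{i}" for i in range(self.shards)]

    def increment(self, key, amount=1):
        # Each write picks one random salt, so writes fan out over the sub-keys.
        sub_key = f"{key}#{self.rng.randrange(self.shards)}"
        self.store[sub_key] = self.store.get(sub_key, 0) + amount

    def read(self, key):
        # Reads pay the cost: query every sub-key and merge.
        return sum(self.store.get(k, 0) for k in self._sub_keys(key))


if __name__ == "__main__":
    counter = SaltedCounter(shards=8)
    for _ in range(1000):
        counter.increment("celebrity:42")
    print(counter.read("celebrity:42"), len(counter.store))
```

The trade-off is visible in the two methods: writes touch one sub-key each, while every read touches all of them, so salting pays off mainly when writes dominate or when reads can tolerate the fan-out.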

Low Level Design: Consistent Hashing Deep Dive

Introduction

Hash Ring

Virtual Nodes

Load Imbalance and Bounded Loads

Rendezvous Hashing (Highest Random Weight)

Jump Consistent Hashing

Applications