Low Level Design: Consistent Hashing

Consistent hashing solves the data redistribution problem in distributed systems. With naive modulo hashing (key % N), adding or removing a server requires remapping nearly all keys. Consistent hashing maps both servers and keys onto a virtual ring, so adding or removing a server only redistributes the keys that were assigned to that server — O(K/N) keys instead of O(K). This makes consistent hashing fundamental to distributed caches (Memcached client libraries such as Ketama), DHTs (Chord), Dynamo-style stores (Amazon's Dynamo, Cassandra), and load balancers.

The Hash Ring

The hash ring arranges the hash space 0 to 2^32 – 1 in a circle. Both servers and keys are hashed to positions on this ring. To find which server owns a key: hash the key, then walk clockwise on the ring to the first server position at or after it. When a server is added, it takes ownership of the keys between its predecessor and its own position. When a server is removed, its keys transfer to the next server clockwise. On average, only K/N keys need remapping per add/remove operation.

import (
    "fmt"
    "hash/crc32"
    "sort"
)
type ConsistentHash struct {
    ring     map[uint32]string   // hash position -> server ID
    sorted   []uint32            // sorted hash positions
    replicas int                 // virtual nodes per server
}

func (ch *ConsistentHash) AddServer(server string) {
    for i := 0; i < ch.replicas; i++ {
        key := fmt.Sprintf("%s:%d", server, i)
        hash := crc32.ChecksumIEEE([]byte(key))
        ch.ring[hash] = server
        ch.sorted = append(ch.sorted, hash)
    }
    sort.Slice(ch.sorted, func(i, j int) bool { return ch.sorted[i] < ch.sorted[j] })
}

func (ch *ConsistentHash) GetServer(key string) string {
    hash := crc32.ChecksumIEEE([]byte(key))
    idx := sort.Search(len(ch.sorted), func(i int) bool { return ch.sorted[i] >= hash })
    if idx == len(ch.sorted) { idx = 0 }  // wrap around
    return ch.ring[ch.sorted[idx]]
}
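The ring's key property — removing a server moves only that server's keys — can be checked end to end. The sketch below reuses the crc32 ring design from above with a hypothetical NewConsistentHash constructor; server names and key counts are illustrative:

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// ConsistentHash mirrors the ring structure from the design above.
type ConsistentHash struct {
	ring     map[uint32]string // hash position -> server ID
	sorted   []uint32          // sorted hash positions
	replicas int               // virtual nodes per server
}

// NewConsistentHash is a hypothetical constructor (not part of the original snippet).
func NewConsistentHash(replicas int) *ConsistentHash {
	return &ConsistentHash{ring: make(map[uint32]string), replicas: replicas}
}

func (ch *ConsistentHash) AddServer(server string) {
	for i := 0; i < ch.replicas; i++ {
		h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s:%d", server, i)))
		ch.ring[h] = server
		ch.sorted = append(ch.sorted, h)
	}
	sort.Slice(ch.sorted, func(i, j int) bool { return ch.sorted[i] < ch.sorted[j] })
}

func (ch *ConsistentHash) GetServer(key string) string {
	h := crc32.ChecksumIEEE([]byte(key))
	idx := sort.Search(len(ch.sorted), func(i int) bool { return ch.sorted[i] >= h })
	if idx == len(ch.sorted) {
		idx = 0 // wrap around the ring
	}
	return ch.ring[ch.sorted[idx]]
}

func main() {
	ch := NewConsistentHash(150)
	for _, s := range []string{"s1", "s2", "s3"} {
		ch.AddServer(s)
	}
	before := make(map[string]string)
	for i := 0; i < 1000; i++ {
		k := fmt.Sprintf("key%d", i)
		before[k] = ch.GetServer(k)
	}
	// Rebuild the ring without s3; only keys that lived on s3 should move,
	// because s1's and s2's virtual-node positions are unchanged.
	ch2 := NewConsistentHash(150)
	ch2.AddServer("s1")
	ch2.AddServer("s2")
	moved, onS3 := 0, 0
	for k, owner := range before {
		if owner == "s3" {
			onS3++
		}
		if ch2.GetServer(k) != owner {
			moved++
		}
	}
	fmt.Printf("keys on s3 before: %d, keys moved after removal: %d\n", onS3, moved)
}
```

The two printed counts are equal: every key previously owned by s3 moves, and no other key does.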

Virtual Nodes for Load Balancing

With few physical servers, hash positions cluster unevenly on the ring, causing hot spots (one server may receive a disproportionate share of keys). Virtual nodes solve this: each physical server is assigned V positions on the ring (e.g., V=150). Keys distribute across V×N positions, averaging out to even load. When a server with more capacity is added (a larger machine), give it proportionally more virtual nodes. Common implementations use on the order of 100–200 virtual nodes per physical node (libketama, for example, places 160 points per server). The trade-off: memory for the ring metadata grows with V×N, but this is manageable.
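The effect of V can be measured directly. This standalone simulation (hypothetical `imbalance` helper, illustrative server and key names) hashes synthetic keys onto rings with varying virtual-node counts and reports the max/average load ratio; more virtual nodes should pull the ratio toward 1.0:

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// imbalance builds a ring with n servers and v virtual nodes each, hashes
// numKeys synthetic keys onto it, and returns maxLoad / averageLoad.
func imbalance(n, v, numKeys int) float64 {
	type point struct {
		pos    uint32
		server int
	}
	var ring []point
	for s := 0; s < n; s++ {
		for i := 0; i < v; i++ {
			h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("server%d:%d", s, i)))
			ring = append(ring, point{h, s})
		}
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].pos < ring[j].pos })

	counts := make([]int, n)
	for k := 0; k < numKeys; k++ {
		h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("key%d", k)))
		idx := sort.Search(len(ring), func(i int) bool { return ring[i].pos >= h })
		if idx == len(ring) {
			idx = 0 // wrap around
		}
		counts[ring[idx].server]++
	}
	max := 0
	for _, c := range counts {
		if c > max {
			max = c
		}
	}
	return float64(max) / (float64(numKeys) / float64(n))
}

func main() {
	for _, v := range []int{1, 10, 150} {
		fmt.Printf("v=%3d  max/avg load = %.2f\n", v, imbalance(5, v, 100000))
	}
}
```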

Replication with Preference Lists

For fault tolerance, each key is replicated to the next distinct physical servers clockwise on the ring (skipping virtual nodes that belong to the same physical server), for N total replicas. This set is the key's preference list. With N=3, a key owned by server A is also stored on servers B and C (the next two distinct physical servers clockwise). Reads and writes use quorums: a write must be acknowledged by W replicas and a read consults R replicas; W + R > N guarantees that every read overlaps at least one up-to-date write. Amazon's Dynamo paper describes N=3, W=2, R=2 as a common configuration.
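Preference-list construction can be sketched with the same crc32 ring: walk clockwise from the key's position and collect distinct physical servers, skipping further virtual nodes of servers already selected. The `preferenceList` and `buildRing` helpers below are hypothetical names:

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

type vnode struct {
	pos    uint32
	server string
}

// buildRing places `replicas` virtual nodes per server and sorts by position.
func buildRing(servers []string, replicas int) []vnode {
	var ring []vnode
	for _, s := range servers {
		for i := 0; i < replicas; i++ {
			h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s:%d", s, i)))
			ring = append(ring, vnode{h, s})
		}
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].pos < ring[j].pos })
	return ring
}

// preferenceList walks clockwise from the key's position and collects the
// first n distinct physical servers, skipping repeat virtual nodes.
func preferenceList(ring []vnode, key string, n int) []string {
	h := crc32.ChecksumIEEE([]byte(key))
	start := sort.Search(len(ring), func(i int) bool { return ring[i].pos >= h })
	seen := make(map[string]bool)
	var out []string
	for i := 0; i < len(ring) && len(out) < n; i++ {
		v := ring[(start+i)%len(ring)] // modulo handles the wrap-around
		if !seen[v.server] {
			seen[v.server] = true
			out = append(out, v.server)
		}
	}
	return out
}

func main() {
	ring := buildRing([]string{"a", "b", "c", "d"}, 100)
	fmt.Println(preferenceList(ring, "user:42", 3)) // owner plus 2 replicas
}
```

The first entry is the key's owner; the remaining entries are the replica targets.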

Rendezvous Hashing: An Alternative

Rendezvous hashing (highest random weight, or HRW, hashing) is simpler: for a key, compute a score for each server — hash(key + server_id) — and assign the key to the server with the highest score. When a server is added or removed, only the keys whose highest-scoring server changed move, and no ring data structure is needed. Drawback: each lookup costs O(N) hash computations vs. an O(log(V×N)) binary search for ring-based consistent hashing. For small N (≤ 50 servers), rendezvous hashing is simpler and equally effective.
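A minimal HRW sketch (hypothetical `rendezvousOwner` helper; crc32 chosen for consistency with the ring examples, though stronger hashes are common in practice):

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// rendezvousOwner scores every server against the key and returns the
// highest-scoring server (highest random weight).
func rendezvousOwner(key string, servers []string) string {
	var best string
	var bestScore uint32
	for _, s := range servers {
		score := crc32.ChecksumIEEE([]byte(key + ":" + s))
		if best == "" || score > bestScore {
			best, bestScore = s, score
		}
	}
	return best
}

func main() {
	servers := []string{"a", "b", "c", "d"}
	fmt.Println(rendezvousOwner("user:42", servers))
}
```

Because the owner is an argmax over per-server scores, removing any non-owning server cannot change a key's assignment — which is exactly the minimal-disruption property.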

Key Interview Discussion Points

  • Why modulo hashing fails: going from N to N+1 servers remaps roughly K×N/(N+1) keys — nearly all keys change servers, defeating caching
  • Virtual node count tuning: too few and load is uneven; too many and ring metadata grows large
  • Hotspot keys: even with consistent hashing, viral content overwhelms a single owner — split the hot key into multiple copies by appending a random suffix so the copies land on different servers, then spread reads across them
  • Bounded load: Google extension to consistent hashing that limits maximum load per server to (1 + epsilon) times the average load
  • Jump consistent hash: Google algorithm producing uniform results with O(1) space and O(ln N) expected time, but buckets are numbered and can only be added or removed at the end of the range — no arbitrary server removal and no virtual nodes
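For reference, jump consistent hash fits in a few lines. This follows the published Lamping–Veach algorithm; the magic constant is the 64-bit LCG multiplier from the paper:

```go
package main

import "fmt"

// jumpHash maps a 64-bit key to a bucket in [0, numBuckets) using jump
// consistent hash: O(1) space, O(ln n) expected time. numBuckets must be >= 1.
// Growing from n to n+1 buckets either leaves a key in place or moves it
// to the new bucket n — never to any other bucket.
func jumpHash(key uint64, numBuckets int) int {
	var b, j int64 = -1, 0
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1 // advance the per-key LCG
		j = int64(float64(b+1) * (float64(1<<31) / float64((key>>33)+1)))
	}
	return int(b)
}

func main() {
	for _, n := range []int{3, 4} {
		fmt.Printf("key 12345 with %d buckets -> bucket %d\n", n, jumpHash(12345, n))
	}
}
```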