System Design: Consistent Hashing — Virtual Nodes, Hash Ring, Load Balancing, Distributed Cache, Dynamo, Cassandra

Consistent hashing is a foundational algorithm in distributed systems that enables even data distribution with minimal disruption when nodes are added or removed. It underpins Amazon's Dynamo and DynamoDB, Apache Cassandra, Memcached client libraries, and content delivery networks. This guide provides a deep technical dive into consistent hashing — how it works, why virtual nodes are essential, and how production systems use it — essential knowledge for system design interviews.

The Problem with Simple Hashing

Simple hash-based distribution assigns server = hash(key) % N, where N is the number of servers. This works well while N is fixed. The problem: when a server is added or removed, N changes, and the assignment for almost every key changes. Going from N=4 to N=5, approximately 80% of keys map to a different server (a key keeps its server only when hash % 4 == hash % 5, which holds for about 1 in 5 hash values). In a distributed cache, this means an ~80% cache miss rate — every remapped key must be fetched from the database and re-cached. For a high-traffic system, this cache-miss storm can overwhelm the database and cause a cascading failure. Consistent hashing solves this: when a server is added, only approximately 1/N of keys need to move (from the adjacent server on the hash ring).
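The disruption is easy to measure. A minimal sketch (the key format and use of MD5 are arbitrary choices for illustration; any uniform hash shows the same effect):

```python
import hashlib

def bucket(key: str, n: int) -> int:
    # The modular scheme described above: server = hash(key) % N
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(bucket(k, 4) != bucket(k, 5) for k in keys)
print(f"{moved / len(keys):.0%} of keys changed servers")  # roughly 80%
```

Only keys whose hash gives the same remainder mod 4 and mod 5 stay put, so about four in five keys move.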

The Hash Ring

Consistent hashing maps both keys and servers onto a circular hash space (the hash ring) using the same hash function (e.g., SHA-256 or xxHash). The ring represents the full range of hash values: 0 to 2^32-1 (for a 32-bit hash), wrapping around from the maximum back to 0. Each server is placed on the ring at position hash(server_identifier). Each key is placed at position hash(key). To find which server owns a key: start at the key position on the ring, walk clockwise until you reach a server. That server is the owner. Adding a server: the new server is placed on the ring. Only keys between the new server and its counter-clockwise neighbor are reassigned to the new server (they were previously owned by the clockwise neighbor). All other keys remain on their existing servers. Removing a server: its keys are reassigned to the next clockwise server. Only approximately 1/N of keys move. This minimal disruption is the core advantage of consistent hashing.
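The ring lookup described above can be sketched with a sorted list and binary search. This is a toy implementation, not any particular library's API; the server names and the choice of MD5 are placeholders:

```python
import bisect
import hashlib

RING_SIZE = 2**32  # positions 0 .. 2^32 - 1, wrapping around

def position(s: str) -> int:
    # Same hash function for both servers and keys
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % RING_SIZE

class HashRing:
    def __init__(self, servers):
        self._points = []      # sorted ring positions
        self._server_at = {}   # position -> server name
        for s in servers:
            self.add(s)

    def add(self, server: str) -> None:
        p = position(server)
        bisect.insort(self._points, p)
        self._server_at[p] = server

    def remove(self, server: str) -> None:
        p = position(server)
        self._points.remove(p)
        del self._server_at[p]

    def owner(self, key: str) -> str:
        # First server at or clockwise from the key's position,
        # wrapping past the top of the ring back to index 0
        i = bisect.bisect_left(self._points, position(key))
        if i == len(self._points):
            i = 0
        return self._server_at[self._points[i]]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.owner("user:42"))
```

Removing a server with remove() reassigns only the keys it owned; every other key keeps its existing owner, which is the minimal-disruption property in action.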

Virtual Nodes for Even Distribution

Basic consistent hashing has a problem: with only N points on the ring (one per physical server), the key distribution is uneven. Some servers own large arcs of the ring and handle disproportionately more keys. With 3 servers, one might handle 50% of keys while another handles 15%. Virtual nodes solve this: instead of placing each physical server at one point on the ring, place it at V points (typically V = 100 to 200 virtual nodes per physical server). Server A is represented by hash("A-0"), hash("A-1"), ..., hash("A-199"). With 3 servers and 200 virtual nodes each, there are 600 points on the ring. The law of large numbers ensures that each server owns approximately 1/3 of the keys. Additional benefit: when a server is removed, its load is distributed across all remaining servers (not just the next clockwise server), because its virtual nodes are scattered across the ring. Heterogeneous servers: assign more virtual nodes to more powerful servers. A server with 2x capacity gets 2x virtual nodes and handles approximately 2x the keys.
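A sketch of the virtual-node variant, counting how many keys land on each server to show the balance (server names, key format, and the 200-vnode count are illustrative, matching the example above):

```python
import bisect
import hashlib
from collections import Counter

def position(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

class VirtualNodeRing:
    def __init__(self, servers, vnodes: int = 200):
        self._points = []      # sorted ring positions of all virtual nodes
        self._server_at = {}   # virtual-node position -> physical server
        for s in servers:
            for i in range(vnodes):  # labels in the "A-0" .. "A-199" style
                p = position(f"{s}-{i}")
                bisect.insort(self._points, p)
                self._server_at[p] = s

    def owner(self, key: str) -> str:
        i = bisect.bisect_left(self._points, position(key))
        return self._server_at[self._points[i % len(self._points)]]

ring = VirtualNodeRing(["A", "B", "C"])
counts = Counter(ring.owner(f"key:{i}") for i in range(30_000))
print(counts)  # each server ends up close to 10,000 keys
```

With 600 points on the ring, each server's share converges toward 1/3; with only one point per server, the same experiment typically shows a badly skewed split.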

Replication with Consistent Hashing

Consistent hashing naturally supports replication for fault tolerance. To replicate with a replication factor of 3, a key is stored on the first 3 distinct physical servers found by walking clockwise from the key's position on the ring. "Distinct physical servers" is important — if the next 3 points on the ring are virtual nodes of the same physical server, walk further until you find 3 different physical servers. This ensures replicas are on different machines. Amazon Dynamo uses this approach: each key is replicated to a preference list of N nodes (typically N=3), and the first node in the clockwise walk acts as the coordinator. Reads and writes use quorums: write to W nodes and read from R nodes. If W + R > N, every read quorum overlaps every write quorum, so a read contacts at least one replica holding the latest write. Dynamo's defaults are N=3, W=2, R=2. With strict quorums this yields read-your-writes consistency for individual keys while tolerating one node failure for both reads and writes; Dynamo's sloppy quorums and hinted handoff relax this guarantee to eventual consistency in favor of availability.
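The clockwise walk that skips duplicate physical servers can be sketched as follows. This builds a small virtual-node ring inline so the snippet stands alone; the server names, vnode count, and function name preference_list are illustrative, not Dynamo's actual code:

```python
import bisect
import hashlib

def position(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

# Ring with virtual nodes: position -> physical server
points, server_at = [], {}
for server in ["A", "B", "C", "D"]:
    for i in range(100):
        p = position(f"{server}-{i}")
        bisect.insort(points, p)
        server_at[p] = server

def preference_list(key: str, replicas: int = 3):
    """Walk clockwise from the key, collecting the first
    `replicas` DISTINCT physical servers (skipping repeat
    virtual nodes of a server already chosen)."""
    start = bisect.bisect_left(points, position(key))
    owners = []
    for step in range(len(points)):
        s = server_at[points[(start + step) % len(points)]]
        if s not in owners:
            owners.append(s)
            if len(owners) == replicas:
                break
    return owners

print(preference_list("user:42"))  # three distinct physical servers
```

The first entry of the list plays the coordinator role; the W-of-N write quorum and R-of-N read quorum are then drawn from these servers.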

Consistent Hashing in Production Systems

Amazon DynamoDB: uses consistent hashing to distribute data across storage nodes. Each table is partitioned by its partition key: the key is hashed, and the hash determines which storage partition holds the item. Adding partitions (to handle increased throughput) moves a minimal set of keys. Apache Cassandra: uses consistent hashing with virtual nodes (vnodes; num_tokens defaulted to 256 per node before Cassandra 4.0, which lowered the default to 16). The token ring maps partition keys to nodes; when a node joins the cluster, it takes ownership of a portion of the token ranges from existing nodes. Memcached: client libraries such as libmemcached use consistent hashing to distribute cache keys across a pool of Memcached servers, so adding a server causes approximately 1/N of keys to miss (they move to the new server) rather than the near-total miss of modular hashing. Content delivery networks (CDNs): use consistent hashing to map URLs to edge cache servers. When a server is removed (failure), only its URLs are re-routed to adjacent servers, keeping the cache warm for all other URLs.

Implementation Details

Implementing consistent hashing: (1) Data structure — use a sorted array or balanced binary search tree (TreeMap in Java; the bisect module or sortedcontainers.SortedList in Python) to store the ring positions of all virtual nodes. Lookup is O(log(N * V)) via binary search for the first position >= hash(key). (2) Hash function — use a fast, uniform hash function; xxHash and MurmurHash3 are good choices. MD5 and SHA-256 work but are slower, and cryptographic strength is unnecessary here. (3) Node management — when adding a node: compute its V hash positions, insert them into the sorted structure, and migrate keys from the servers that lose key ranges. When removing a node: remove its V positions from the sorted structure and migrate its keys to their new owners. (4) Jump consistent hashing — a simpler algorithm (Lamping and Veach, Google, 2014) that maps integer keys to buckets using a deterministic jump function. It requires no stored ring at all, provides near-perfect balance, and runs in O(log N) expected time per lookup, but it does not support arbitrary node names (only integer bucket IDs, which can only be added or removed at the highest index) and does not handle weighted nodes as elegantly as virtual nodes. Use it when nodes are identified by sequential integers (e.g., a cache server pool).
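Jump consistent hashing is compact enough to show in full. This is a transcription of the published algorithm into Python, masking the multiply to 64 bits to mimic the paper's C++ integer arithmetic:

```python
def jump_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step from the paper
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        # Jump ahead; expected iterations grow only logarithmically
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

# Growing from 10 to 11 buckets: each key either stays put
# or moves to the NEW bucket (index 10) -- never shuffles
# between existing buckets.
moved = sum(jump_hash(k, 10) != jump_hash(k, 11) for k in range(10_000))
print(f"{moved / 10_000:.1%} moved")  # close to 1/11 of keys
```

The monotonicity property (keys only ever move to the newly added bucket) is what gives jump hashing the same minimal-disruption guarantee as the ring, without storing any ring state.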
