System Design: Consistent Hashing — Virtual Nodes, Hash Ring, Load Balancing, Distributed Cache, Dynamo, Cassandra

Consistent hashing is a foundational algorithm in distributed systems that enables even data distribution with minimal disruption when nodes are added or removed. It underpins Amazon's Dynamo and DynamoDB, Apache Cassandra, Memcached client libraries, and content delivery networks. This guide provides a deep technical dive into consistent hashing — how it works, why virtual nodes are essential, and how production systems use it — essential knowledge for system design interviews.

The Problem with Simple Hashing

Simple hash-based distribution assigns server = hash(key) % N, where N is the number of servers. This works well while N is fixed. The problem: when a server is added or removed, N changes, and the assignment for almost every key changes. Going from N=4 to N=5, approximately 80% of keys map to a different server (a key keeps its server only when hash % 4 == hash % 5, which holds for about 1 in 5 hash values). In a distributed cache, this means an ~80% cache miss rate — every remapped key must be fetched from the database and re-cached. For a high-traffic system, this cache-miss storm can overwhelm the database and cause a cascading failure. Consistent hashing solves this: when a server is added, only approximately 1/N of keys need to move (from the adjacent server on the hash ring).
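The disruption is easy to measure. A minimal sketch (the key format and use of MD5 are arbitrary choices for illustration; any uniform hash shows the same effect):

```python
import hashlib

def bucket(key: str, n: int) -> int:
    # The modular scheme described above: server = hash(key) % N
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(bucket(k, 4) != bucket(k, 5) for k in keys)
print(f"{moved / len(keys):.0%} of keys changed servers")  # roughly 80%
```

Only keys whose hash gives the same remainder mod 4 and mod 5 stay put, so about four in five keys move.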

The Hash Ring

Consistent hashing maps both keys and servers onto a circular hash space (the hash ring) using the same hash function (e.g., SHA-256 or xxHash). The ring represents the full range of hash values: 0 to 2^32-1 (for a 32-bit hash), wrapping around from the maximum back to 0. Each server is placed on the ring at position hash(server_identifier). Each key is placed at position hash(key). To find which server owns a key: start at the key position on the ring, walk clockwise until you reach a server. That server is the owner. Adding a server: the new server is placed on the ring. Only keys between the new server and its counter-clockwise neighbor are reassigned to the new server (they were previously owned by the clockwise neighbor). All other keys remain on their existing servers. Removing a server: its keys are reassigned to the next clockwise server. Only approximately 1/N of keys move. This minimal disruption is the core advantage of consistent hashing.
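The ring lookup described above can be sketched with a sorted list and binary search. This is a toy implementation, not any particular library's API; the server names and the choice of MD5 are placeholders:

```python
import bisect
import hashlib

RING_SIZE = 2**32  # positions 0 .. 2^32 - 1, wrapping around

def position(s: str) -> int:
    # Same hash function for both servers and keys
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % RING_SIZE

class HashRing:
    def __init__(self, servers):
        self._points = []      # sorted ring positions
        self._server_at = {}   # position -> server name
        for s in servers:
            self.add(s)

    def add(self, server: str) -> None:
        p = position(server)
        bisect.insort(self._points, p)
        self._server_at[p] = server

    def remove(self, server: str) -> None:
        p = position(server)
        self._points.remove(p)
        del self._server_at[p]

    def owner(self, key: str) -> str:
        # First server at or clockwise from the key's position,
        # wrapping past the top of the ring back to index 0
        i = bisect.bisect_left(self._points, position(key))
        if i == len(self._points):
            i = 0
        return self._server_at[self._points[i]]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.owner("user:42"))
```

Removing a server with remove() reassigns only the keys it owned; every other key keeps its existing owner, which is the minimal-disruption property in action.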

Virtual Nodes for Even Distribution

Basic consistent hashing has a problem: with only N points on the ring (one per physical server), the key distribution is uneven. Some servers own large arcs of the ring and handle disproportionately more keys. With 3 servers, one might handle 50% of keys while another handles 15%. Virtual nodes solve this: instead of placing each physical server at one point on the ring, place it at V points (typically V = 100 to 200 virtual nodes per physical server). Server A is represented by hash("A-0"), hash("A-1"), ..., hash("A-199"). With 3 servers and 200 virtual nodes each, there are 600 points on the ring. The law of large numbers ensures that each server owns approximately 1/3 of the keys. Additional benefit: when a server is removed, its load is distributed across all remaining servers (not just the next clockwise server), because its virtual nodes are scattered across the ring. Heterogeneous servers: assign more virtual nodes to more powerful servers. A server with 2x capacity gets 2x virtual nodes and handles approximately 2x the keys.
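A sketch of the virtual-node variant, counting how many keys land on each server to show the balance (server names, key format, and the 200-vnode count are illustrative, matching the example above):

```python
import bisect
import hashlib
from collections import Counter

def position(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

class VirtualNodeRing:
    def __init__(self, servers, vnodes: int = 200):
        self._points = []      # sorted ring positions of all virtual nodes
        self._server_at = {}   # virtual-node position -> physical server
        for s in servers:
            for i in range(vnodes):  # labels in the "A-0" .. "A-199" style
                p = position(f"{s}-{i}")
                bisect.insort(self._points, p)
                self._server_at[p] = s

    def owner(self, key: str) -> str:
        i = bisect.bisect_left(self._points, position(key))
        return self._server_at[self._points[i % len(self._points)]]

ring = VirtualNodeRing(["A", "B", "C"])
counts = Counter(ring.owner(f"key:{i}") for i in range(30_000))
print(counts)  # each server ends up close to 10,000 keys
```

With 600 points on the ring, each server's share converges toward 1/3; with only one point per server, the same experiment typically shows a badly skewed split.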

Replication with Consistent Hashing

Consistent hashing naturally supports replication for fault tolerance. To replicate with a replication factor of 3, a key is stored on the first 3 distinct physical servers found by walking clockwise from the key's position on the ring. "Distinct physical servers" is important — if the next 3 points on the ring are virtual nodes of the same physical server, walk further until you find 3 different physical servers. This ensures replicas are on different machines. Amazon Dynamo uses this approach: each key is replicated to a preference list of N nodes (typically N=3), and the first node in the clockwise walk acts as the coordinator. Reads and writes use quorums: write to W nodes and read from R nodes. If W + R > N, every read quorum overlaps every write quorum, so a read contacts at least one replica holding the latest write. Dynamo's defaults are N=3, W=2, R=2. With strict quorums this yields read-your-writes consistency for individual keys while tolerating one node failure for both reads and writes; Dynamo's sloppy quorums and hinted handoff relax this guarantee to eventual consistency in favor of availability.
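The clockwise walk that skips duplicate physical servers can be sketched as follows. This builds a small virtual-node ring inline so the snippet stands alone; the server names, vnode count, and function name preference_list are illustrative, not Dynamo's actual code:

```python
import bisect
import hashlib

def position(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

# Ring with virtual nodes: position -> physical server
points, server_at = [], {}
for server in ["A", "B", "C", "D"]:
    for i in range(100):
        p = position(f"{server}-{i}")
        bisect.insort(points, p)
        server_at[p] = server

def preference_list(key: str, replicas: int = 3):
    """Walk clockwise from the key, collecting the first
    `replicas` DISTINCT physical servers (skipping repeat
    virtual nodes of a server already chosen)."""
    start = bisect.bisect_left(points, position(key))
    owners = []
    for step in range(len(points)):
        s = server_at[points[(start + step) % len(points)]]
        if s not in owners:
            owners.append(s)
            if len(owners) == replicas:
                break
    return owners

print(preference_list("user:42"))  # three distinct physical servers
```

The first entry of the list plays the coordinator role; the W-of-N write quorum and R-of-N read quorum are then drawn from these servers.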

Consistent Hashing in Production Systems

Amazon DynamoDB: uses consistent hashing to distribute data across storage nodes. Each table is partitioned by its partition key: the key is hashed, and the hash determines which storage partition holds the item. Adding partitions (to handle increased throughput) moves a minimal set of keys. Apache Cassandra: uses consistent hashing with virtual nodes (vnodes; num_tokens defaulted to 256 per node before Cassandra 4.0, which lowered the default to 16). The token ring maps partition keys to nodes; when a node joins the cluster, it takes ownership of a portion of the token ranges from existing nodes. Memcached: client libraries such as libmemcached use consistent hashing to distribute cache keys across a pool of Memcached servers, so adding a server causes approximately 1/N of keys to miss (they move to the new server) rather than the near-total miss of modular hashing. Content delivery networks (CDNs): use consistent hashing to map URLs to edge cache servers. When a server is removed (failure), only its URLs are re-routed to adjacent servers, keeping the cache warm for all other URLs.

Implementation Details

Implementing consistent hashing: (1) Data structure — use a sorted array or balanced binary search tree (TreeMap in Java; the bisect module or sortedcontainers.SortedList in Python) to store the ring positions of all virtual nodes. Lookup is O(log(N * V)) via binary search for the first position >= hash(key). (2) Hash function — use a fast, uniform hash function; xxHash and MurmurHash3 are good choices. MD5 and SHA-256 work but are slower, and cryptographic strength is unnecessary here. (3) Node management — when adding a node: compute its V hash positions, insert them into the sorted structure, and migrate keys from the servers that lose key ranges. When removing a node: remove its V positions from the sorted structure and migrate its keys to their new owners. (4) Jump consistent hashing — a simpler algorithm (Lamping and Veach, Google, 2014) that maps integer keys to buckets using a deterministic jump function. It requires no stored ring at all, provides near-perfect balance, and runs in O(log N) expected time per lookup, but it does not support arbitrary node names (only integer bucket IDs, which can only be added or removed at the highest index) and does not handle weighted nodes as elegantly as virtual nodes. Use it when nodes are identified by sequential integers (e.g., a cache server pool).
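Jump consistent hashing is compact enough to show in full. This is a transcription of the published algorithm into Python, masking the multiply to 64 bits to mimic the paper's C++ integer arithmetic:

```python
def jump_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step from the paper
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        # Jump ahead; expected iterations grow only logarithmically
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

# Growing from 10 to 11 buckets: each key either stays put
# or moves to the NEW bucket (index 10) -- never shuffles
# between existing buckets.
moved = sum(jump_hash(k, 10) != jump_hash(k, 11) for k in range(10_000))
print(f"{moved / 10_000:.1%} moved")  # close to 1/11 of keys
```

The monotonicity property (keys only ever move to the newly added bucket) is what gives jump hashing the same minimal-disruption guarantee as the ring, without storing any ring state.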
