Database sharding is the technique of distributing data across multiple database instances to handle datasets and query volumes that exceed a single server capacity. Sharding is one of the most critical topics in system design interviews because it touches on data modeling, query routing, consistency, and operational complexity. This guide covers shard key selection, resharding strategies, and production sharding tools like Vitess and Citus.
When to Shard
Sharding is a last resort, not a first optimization. Before sharding, exhaust these options: (1) Vertical scaling — upgrade to a larger server (more CPU, RAM, faster SSD). A single PostgreSQL instance can handle billions of rows and thousands of transactions per second with proper indexing. (2) Read replicas — offload read queries to replicas. Most applications are read-heavy (90%+ reads). Read replicas handle the load without sharding. (3) Caching — cache frequently accessed data in Redis. This eliminates database queries entirely for hot data. (4) Query optimization — missing indexes, N+1 queries, and full table scans cause more performance problems than data volume. (5) Table partitioning — partition large tables by date range (orders_2024_q1, orders_2024_q2). The database handles routing internally without application changes. Shard when: a single server cannot store the dataset (data exceeds available disk/RAM), write throughput exceeds what one server can handle, or regulatory requirements mandate data residency (EU data in EU shards, US data in US shards).
Shard Key Selection
The shard key determines which shard holds each row. It is the most important sharding decision and cannot easily be changed later. Criteria for a good shard key: (1) Even data distribution — the key should distribute data uniformly across shards. user_id with hash-based routing distributes evenly. country_code does not (the US shard would be 10x larger than others). (2) Query locality — queries should access a single shard whenever possible. If most queries include user_id in the WHERE clause, sharding by user_id means each query hits one shard. Scatter-gather queries (no shard key in the WHERE clause) hit all shards and are expensive. (3) Growth accommodation — the key should support future data growth without creating hot shards. An auto-incrementing order_id with range-based sharding creates a hot shard (all new orders go to the latest range). Hash-based sharding distributes new orders evenly. Common shard keys: user_id (social networks, SaaS), tenant_id (multi-tenant applications), order_id with hashing (e-commerce), geographic region (data residency requirements).
Sharding Strategies
Three approaches to mapping keys to shards: (1) Range-based sharding — assign key ranges to shards. user_id 1-1M -> shard 1, 1M-2M -> shard 2. Pros: range queries are efficient (all users in a range are co-located). Cons: hot spots if recent data is accessed more frequently (all new users are on the last shard). (2) Hash-based sharding — hash the shard key and modulo by the number of shards: shard = hash(user_id) % N. Pros: even distribution regardless of key distribution. Cons: range queries require scatter-gather (adjacent keys map to different shards), and adding a shard (changing N) requires reshuffling most data. (3) Directory-based sharding — a lookup table maps each key to a shard: user_id 12345 -> shard 3. Pros: flexible placement (move specific users to different shards), supports custom routing logic. Cons: the lookup table becomes a single point of failure and must be highly available. (4) Consistent hashing — a variation of hash-based sharding that minimizes data movement when shards are added or removed. Each shard owns a range on a hash ring; adding a shard only moves data from adjacent ranges.
Cross-Shard Queries and Joins
The biggest operational challenge of sharding: queries that span multiple shards. A query like “find all orders from the last 24 hours” (no shard key) must be sent to all shards, executed in parallel, and the results merged (scatter-gather). This is orders of magnitude slower than a single-shard query. Strategies to minimize cross-shard queries: (1) Co-locate related data — shard orders and order_items by the same key (user_id). A query for a user orders and their items hits a single shard. (2) Denormalize — store frequently joined data in the same table. Instead of joining orders with products, store product_name and product_price directly in the order_items table (denormalized copy). (3) Global tables — small, read-heavy reference tables (countries, currencies, product categories) are replicated to all shards. Joins with these tables are local to each shard. (4) Application-level joins — for rare cross-shard queries, fetch data from each shard separately in the application layer and join in memory. This is acceptable for low-frequency administrative queries, not for user-facing traffic.
Resharding: Adding and Removing Shards
As data grows, you need more shards. Resharding is the process of redistributing data across a new shard count. With hash-based sharding (shard = hash(key) % N), changing N from 4 to 8 changes the shard assignment for approximately 75% of keys — massive data movement. Strategies to minimize resharding cost: (1) Consistent hashing — adding a shard moves only data from neighboring ranges on the hash ring, approximately 1/N of total data. (2) Virtual shards — create 256 logical shards mapped to 4 physical shards (64 logical per physical). When adding a physical shard, move some logical shards to it. No rehashing required — just update the logical-to-physical mapping. (3) Double-write migration — write to both old and new shard assignments during migration. Read from the old assignment. Backfill old data to the new assignment. Switch reads to the new assignment. Stop writing to the old assignment. (4) Use Vitess — Vitess (YouTube sharding middleware, now CNCF) handles resharding automatically. Define the new sharding scheme, and Vitess copies data, verifies consistency, and switches traffic with minimal downtime.
Production Sharding Tools
Building custom sharding logic is error-prone. Production tools: (1) Vitess — a database clustering system for MySQL. Provides a SQL-aware proxy (vtgate) that routes queries to the correct shard. Supports horizontal resharding, vertical sharding (splitting tables across clusters), and schema migrations. Used by Slack, Square, GitHub, and PlanetScale. (2) Citus — a PostgreSQL extension for horizontal sharding. Distributes tables across worker nodes. Supports distributed queries, co-located joins, and reference tables. Available as open-source or as Azure Cosmos DB for PostgreSQL. (3) CockroachDB — a distributed SQL database that shards automatically. Data is split into ranges (64MB chunks) and distributed across nodes using Raft consensus. No manual shard key selection — CockroachDB handles distribution and rebalancing automatically. (4) Amazon Aurora — supports up to 128TB per instance, delaying the need for application-level sharding. Aurora Limitless (preview) adds automatic sharding for PostgreSQL beyond 128TB. Choose Vitess for MySQL shops needing control, Citus for PostgreSQL, and CockroachDB for new applications wanting automatic distribution.