Question 1

How is the split boundary key selected during shard splitting?

Accepted Answer

The split boundary is usually chosen at the median of the keys stored in the shard, computed either exactly (e.g., SQL PERCENTILE_DISC(0.5)) or approximately by sampling. The goal is to produce two child shards of roughly equal size. For hash-based sharding, the boundary is a point in the hash space. For range-based sharding, the boundary is a value in the actual key domain. Splitting at the median minimizes the worst-case size imbalance between the two children.
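A minimal sketch of exact median selection for a range-based shard, assuming the keys fit in memory and are already sorted. The function names `choose_split_key` and `split` are hypothetical, and the "keys less than or equal to the boundary stay left" convention is an assumption, not something the answer above prescribes.

```python
import math


def choose_split_key(sorted_keys):
    """Return the discrete median, using the PERCENTILE_DISC(0.5) convention."""
    if len(sorted_keys) < 2:
        raise ValueError("need at least two keys to split a shard")
    # PERCENTILE_DISC(0.5) picks the first value whose cumulative
    # distribution is >= 0.5, which is the element at index ceil(n/2) - 1.
    return sorted_keys[math.ceil(len(sorted_keys) / 2) - 1]


def split(sorted_keys):
    """Partition keys into (boundary, left child, right child)."""
    boundary = choose_split_key(sorted_keys)
    # Assumed convention: keys <= boundary stay in the left child.
    left = [k for k in sorted_keys if k <= boundary]
    right = [k for k in sorted_keys if k > boundary]
    return boundary, left, right
```

With heavily duplicated keys the two children can still be uneven, because every copy of the boundary key lands on the same side.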

Question 2

How is consistency maintained during dual-write migration?

Accepted Answer

During dual-write, every write is applied to the source shard, and writes whose key falls on the migrating side of the split boundary are also applied to the destination shard. Reads continue to be served from the source. The dual-write phase keeps the destination current while the bulk copy runs. Once the bulk copy completes and the delta sync lag is near zero, cutover updates the routing metadata atomically. After cutover, a client that still holds the old map receives a stale-map error and refreshes its shard map.
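The dual-write phase can be sketched as follows. This is an illustration under stated assumptions: shards are plain dicts, the class name `DualWriteRouter` is hypothetical, and keys strictly above the split boundary are assumed to be the ones moving to the destination.

```python
class DualWriteRouter:
    """Mirrors writes for the migrating key range; reads stay on the source."""

    def __init__(self, source, destination, split_key):
        self.source = source            # dict standing in for the source shard
        self.destination = destination  # dict standing in for the destination
        self.split_key = split_key

    def write(self, key, value):
        # The source remains authoritative until cutover.
        self.source[key] = value
        # Assumption: keys above the boundary belong to the migrating range.
        if key > self.split_key:
            self.destination[key] = value

    def read(self, key):
        # Reads are served from the source for the whole dual-write phase.
        return self.source.get(key)
```

A real system also has to decide what happens when the second write fails; this sketch ignores partial failure.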

Question 3

How is cutover made atomic in shard rebalancing?

Accepted Answer

Cutover updates the shard routing map in the metadata store (etcd, ZooKeeper, or a versioned DB table) in a single transaction. The shard map version is incremented. All routing clients cache the current shard map and refresh on version mismatch. The brief window between the metadata write and cache refresh propagation is handled by returning redirect errors that prompt clients to re-fetch the shard map and retry.
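One way to picture the single-transaction update is a compare-and-swap on the map version. The sketch below uses an in-memory stand-in for the metadata store; the class `MetadataStore` and its `cutover` method are hypothetical and do not represent the etcd or ZooKeeper client APIs.

```python
import threading


class MetadataStore:
    """In-memory stand-in for a versioned shard map."""

    def __init__(self, shard_map):
        self._lock = threading.Lock()
        self.version = 1
        self.shard_map = dict(shard_map)

    def cutover(self, expected_version, key_range, new_owner):
        """Flip ownership only if nobody changed the map in the meantime."""
        with self._lock:
            if self.version != expected_version:
                return False  # lost the race; caller should re-read and retry
            # Owner change and version bump happen under one lock, so readers
            # never observe the new owner with the old version.
            self.shard_map[key_range] = new_owner
            self.version += 1
            return True
```

The version check is what lets two controllers attempt a cutover concurrently without both succeeding.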

Question 4

How do you roll back a failed shard migration?

Accepted Answer

Rollback is possible because the source shard is kept in read-only status (not deleted) for a grace period after cutover. To roll back: update the routing metadata to point back to the source shard, set the source to active status, stop writes to the destination, and mark the migration as failed. Writes that the destination accepted after cutover exist only there, so they must be copied back to the source before it resumes serving, or they are lost. The destination shard is then decommissioned. The rollback window is typically 30-60 minutes, after which the source shard data is cleaned up.
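The rollback steps can be sketched as a small state change guarded by the grace period. Everything here is illustrative: the `Migration` record, the status strings, and the 30-minute constant (the lower end of the window quoted above) are assumptions.

```python
import time
from dataclasses import dataclass

ROLLBACK_WINDOW_SECONDS = 30 * 60  # assumed: lower end of the 30-60 minute window


@dataclass
class Migration:
    key_range: str
    source: str
    destination: str
    cutover_at: float          # epoch seconds when cutover committed
    status: str = "cut_over"


def roll_back(migration, routing, shard_status, now=None):
    """Point routing back at the source while its data is still retained."""
    now = time.time() if now is None else now
    if now - migration.cutover_at > ROLLBACK_WINDOW_SECONDS:
        raise RuntimeError("rollback window expired; source data may be gone")
    routing[migration.key_range] = migration.source          # route back
    shard_status[migration.source] = "active"                # was read-only
    shard_status[migration.destination] = "decommissioning"
    migration.status = "failed"
```

In a real deployment the routing change would go through the same versioned metadata transaction as the original cutover.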

Question 5

What triggers shard split vs merge?

Accepted Answer

A shard split is triggered when a shard's size, key cardinality, or request rate exceeds an upper threshold. The shard is divided into two child shards at a split key chosen by the median key or by hotspot analysis. A merge is triggered when two adjacent shards both fall below a lower threshold (e.g., combined size is less than half the split threshold). Merging consolidates them to reduce metadata overhead and improve scan locality.
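The threshold logic might look like the sketch below. The concrete limits (10 GiB, 5,000 requests per second) and the names `ShardStats`, `should_split`, and `should_merge` are assumptions chosen for illustration; only the "half the split threshold" merge rule comes from the answer above.

```python
from dataclasses import dataclass

SPLIT_SIZE_BYTES = 10 * 1024**3     # assumed upper size threshold (10 GiB)
SPLIT_REQUESTS_PER_SEC = 5_000.0    # assumed upper request-rate threshold


@dataclass
class ShardStats:
    size_bytes: int
    requests_per_sec: float


def should_split(shard):
    """Split when either size or load exceeds its upper threshold."""
    return (shard.size_bytes > SPLIT_SIZE_BYTES
            or shard.requests_per_sec > SPLIT_REQUESTS_PER_SEC)


def should_merge(left, right):
    """Merge adjacent shards whose combined size and load are under half the split limits."""
    combined_size = left.size_bytes + right.size_bytes
    combined_rps = left.requests_per_sec + right.requests_per_sec
    return (combined_size < SPLIT_SIZE_BYTES / 2
            and combined_rps < SPLIT_REQUESTS_PER_SEC / 2)
```

Keeping the merge limit well below the split limit leaves a gap between the two, so a freshly merged shard does not immediately qualify for a split again.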

Question 6

How is data migrated during rebalancing with minimal downtime?

Accepted Answer

Rebalancing uses a dual-write or copy-then-switch protocol. The source shard streams its key range to the destination node while continuing to serve reads and writes. A brief write quiesce (often a few milliseconds) then gives both sides a consistent state, during which the router's ownership metadata flips atomically to the destination. During the copy phase, writes are mirrored to both source and destination to keep them in sync.
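The copy-then-switch variant can be sketched as a snapshot copy followed by log replay. This is a simplified, single-threaded model under stated assumptions: `change_log` is a hypothetical ordered list of `(key, value)` writes made after the snapshot, and the log is treated as frozen once the quiesce begins.

```python
def copy_then_switch(snapshot, change_log, cutover_lag=2):
    """Bulk-copy a snapshot, replay deltas, then drain the tail under quiesce."""
    destination = dict(snapshot)  # bulk copy while the source keeps serving
    applied = 0
    # Catch-up phase: replay logged writes until the remaining lag is small.
    while len(change_log) - applied > cutover_lag:
        key, value = change_log[applied]
        destination[key] = value
        applied += 1
    # Quiesce phase: writes are paused, so the tail is short and finite.
    for key, value in change_log[applied:]:
        destination[key] = value
    return destination
```

In a live system the log keeps growing during catch-up, which is why the quiesce is only started once the lag is small enough to drain quickly.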

Question 7

How does the router handle requests during migration?

Accepted Answer

The router holds a versioned shard map. During migration it forwards requests to the source until the ownership transfer is committed, then atomically updates its local shard map and starts forwarding to the destination. Requests that arrive during the brief ownership-flip window are either retried transparently by the router or answered with a redirect error that the client retries, which keeps the transition invisible to application code.
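The transparent-retry path can be sketched like this. The `Cluster` and `Router` classes, the `StaleShardMap` exception, and the single-key shard map are all hypothetical simplifications of what the answer above describes.

```python
class StaleShardMap(Exception):
    """Raised when the caller's shard map version is out of date."""


class Cluster:
    def __init__(self):
        self.metadata = {"version": 1, "owner": "source"}
        self.shards = {"source": {"k": "old"}, "destination": {"k": "new"}}

    def serve(self, shard_name, key, map_version):
        # Shards reject requests routed with an outdated map version.
        if map_version != self.metadata["version"]:
            raise StaleShardMap()
        return self.shards[shard_name][key]


class Router:
    def __init__(self, cluster):
        self.cluster = cluster
        self.cached = dict(cluster.metadata)  # local copy of the shard map

    def get(self, key, max_retries=1):
        for _ in range(max_retries + 1):
            try:
                return self.cluster.serve(
                    self.cached["owner"], key, self.cached["version"])
            except StaleShardMap:
                # Re-fetch the map and retry; the caller never sees the error.
                self.cached = dict(self.cluster.metadata)
        raise RuntimeError("shard map kept changing; giving up")
```

Bounding the retries keeps a router from looping forever if the map changes faster than it can refresh.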

Question 8

How is rebalancing throttled to avoid overloading the cluster?

Accepted Answer

A rebalancing controller enforces a maximum bytes-per-second migration rate per node pair using token-bucket rate limiting, and pauses or slows migration when CPU or disk I/O on the source or destination node exceeds a configured ceiling. Rebalancing is also scheduled during off-peak windows, and the number of concurrent migrations across the cluster is capped to prevent network saturation.
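A token bucket for one node pair can be sketched as below. The class name, parameter names, and injectable `clock` are assumptions made so the example is easy to test; the CPU and disk I/O backpressure mentioned above is not modeled.

```python
import time


class TokenBucket:
    """Limits migration traffic to a sustained bytes-per-second rate."""

    def __init__(self, rate_bytes_per_sec, capacity_bytes, clock=time.monotonic):
        self.rate = rate_bytes_per_sec
        self.capacity = capacity_bytes   # maximum burst size
        self.tokens = capacity_bytes     # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_consume(self, nbytes):
        """Return True and spend tokens if the chunk may be sent now."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

A migration worker would call `try_consume` before sending each chunk and sleep briefly when it returns False.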

Shard Rebalancing Low-Level Design: Split/Merge Triggers, Data Migration, and Minimal Downtime

What Is Shard Rebalancing?

Rebalancing Triggers

Split: Dividing a Hot Shard

Merge: Combining Small Shards

Data Migration with Dual-Write

Cutover and Atomicity

Consistent Hashing and Virtual Nodes

SQL Schema

Python Implementation Sketch