Design a Distributed ID Generator (Snowflake)

Design a distributed ID generation service. Every large-scale system needs unique IDs — for database rows, events, messages, and transactions. The requirements sound simple but the constraints are surprisingly subtle: the IDs must be globally unique, generated at high throughput, and ideally sortable by creation time.

Requirements

  • Globally unique: No two IDs are ever the same, across all generators in all data centers.
  • Sortable by time: IDs generated later should sort after earlier ones. This enables time-range queries on the ID column without a separate timestamp.
  • High throughput: 100,000+ IDs/sec per generator node.
  • Low latency: ID generation < 1ms — this happens on every database write.
  • No coordination: Generators should not need to coordinate with each other. Network calls on every ID generation would be a bottleneck.

Why Not UUID?

UUID v4 (random) satisfies uniqueness globally without coordination. But:

  • Not sortable: Random UUIDs don’t sort by time — you can’t do “events in the last hour” using the ID column.
  • 128-bit: Twice the storage of a 64-bit integer. In a table with billions of rows and multiple indexes, this matters.
  • Poor index locality: Random values cause scattered inserts into B-tree indexes, thrashing the index cache. UUID primary keys in MySQL with InnoDB clustered indexes are notoriously slow at scale.

UUID v7 (standardized in RFC 9562, 2024) fixes sortability with a millisecond time prefix. A good option for new systems. But many companies adopted Snowflake-style IDs before UUID v7 existed.
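To see why the time prefix matters, here is a minimal sketch of a UUIDv7-style value built by hand (48-bit millisecond timestamp, then random bits, with the version/variant bits stamped in). This is an illustration of the layout, not a substitute for a real library; older Python stdlibs have no `uuid7()`:

```python
import os
import time
import uuid

def uuid7_sketch() -> uuid.UUID:
    """Build a UUIDv7-style value: 48-bit ms timestamp, then 80 random bits."""
    ts_ms = int(time.time() * 1000) & ((1 << 48) - 1)
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    value = (ts_ms << 80) | rand
    # Stamp the version (7) and variant (0b10) bits into the standard positions
    value &= ~(0xF << 76); value |= 0x7 << 76
    value &= ~(0x3 << 62); value |= 0x2 << 62
    return uuid.UUID(int=value)

a = uuid7_sketch()
time.sleep(0.002)
b = uuid7_sketch()
assert a < b  # the time prefix makes later IDs sort after earlier ones
```

Two UUIDv4 values generated in the same order would compare in random order; here the leading timestamp guarantees the sort.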

Twitter Snowflake: The Standard Approach

Twitter designed Snowflake in 2010 to generate 64-bit IDs at scale without coordination. The bit layout:

| 1 bit  | 41 bits          | 10 bits     | 12 bits  |
| unused | timestamp (ms)   | machine_id  | sequence |

Total: 64 bits — fits in a signed 64-bit integer (the unused bit doubles as the sign bit, keeping IDs positive).
  • Timestamp (41 bits): Milliseconds since a custom epoch (e.g., 2024-01-01). 41 bits → 2^41 = 2.2 trillion milliseconds → ~69 years before rollover.
  • Machine ID (10 bits): Uniquely identifies the generator node. 2^10 = 1,024 machines max. Assigned at startup from ZooKeeper or a config service.
  • Sequence number (12 bits): Incremented for each ID generated within the same millisecond. 2^12 = 4,096 IDs per millisecond per machine = 4M IDs/sec per machine.

A minimal single-node reference implementation in Python (not thread-safe as written — production code would guard it with a lock):

import time

EPOCH = 1704067200000  # 2024-01-01 00:00:00 UTC in ms
MACHINE_ID = 42        # assigned at startup

sequence = 0
last_timestamp = -1

def generate_id():
    global sequence, last_timestamp

    timestamp = int(time.time() * 1000) - EPOCH

    if timestamp < last_timestamp:
        # Clock moved backward — refuse rather than risk duplicate IDs
        raise RuntimeError(f"clock moved backward by {last_timestamp - timestamp} ms")

    if timestamp == last_timestamp:
        sequence = (sequence + 1) & 0xFFF  # mask to 12 bits
        if sequence == 0:
            # Sequence exhausted this millisecond — wait for the next ms
            while timestamp <= last_timestamp:
                timestamp = int(time.time() * 1000) - EPOCH
    else:
        sequence = 0

    last_timestamp = timestamp

    return (timestamp << 22) | (MACHINE_ID << 12) | sequence

This runs entirely in-process — no network call, no lock contention beyond a local mutex. Each call takes nanoseconds.
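Because the layout is fixed, an ID can be decoded back into its fields with shifts and masks. A small sketch, using the same 41/10/12 split and custom epoch as above:

```python
EPOCH = 1704067200000  # same custom epoch as above: 2024-01-01 00:00:00 UTC in ms

def decode_id(snowflake_id: int) -> dict:
    """Split a 64-bit Snowflake ID back into its three fields."""
    return {
        "timestamp_ms": (snowflake_id >> 22) + EPOCH,  # 41-bit ms since epoch
        "machine_id": (snowflake_id >> 12) & 0x3FF,    # 10 bits
        "sequence": snowflake_id & 0xFFF,              # 12 bits
    }

# An ID generated 1000 ms after the epoch by machine 42, sequence 7:
sample = (1000 << 22) | (42 << 12) | 7
assert decode_id(sample)["machine_id"] == 42
```

This is also why time-range queries work on the ID column: filtering on `id >= (start_ms - EPOCH) << 22` selects everything created after `start_ms`.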

The Clock Skew Problem

What if the system clock goes backward (NTP correction, leap second)? The timestamp field would decrease, potentially generating IDs lower than previously issued ones — violating uniqueness or sort order.

Mitigations:

  • Refuse to generate IDs when clock moves backward: Wait until the clock catches back up to last_timestamp. For small skews (<10ms), this stall is acceptable.
  • Log and alert on clock skew: Large skews indicate a system problem that needs human intervention.
  • Use a monotonic clock: Most OS APIs offer a monotonic clock that never goes backward (it just slows down). Use this instead of wall clock for the timestamp component.
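The first two mitigations can be combined in one helper: stall on small skews, refuse (and let monitoring alert) on large ones. A sketch — `wait_for_clock` and the 10 ms threshold are illustrative choices, not part of the original Snowflake:

```python
import time

MAX_BACKWARD_MS = 10  # tolerate small NTP corrections by waiting them out

def wait_for_clock(last_timestamp_ms, now_ms_fn=lambda: int(time.time() * 1000)):
    """Return a timestamp >= last_timestamp_ms, or raise on large backward skew."""
    now = now_ms_fn()
    if now >= last_timestamp_ms:
        return now
    skew = last_timestamp_ms - now
    if skew > MAX_BACKWARD_MS:
        # Large skew: refuse and alert rather than risk out-of-order IDs
        raise RuntimeError(f"clock moved backward by {skew} ms")
    while now < last_timestamp_ms:
        # Small skew: stall until the clock catches back up
        time.sleep((last_timestamp_ms - now) / 1000)
        now = now_ms_fn()
    return now
```

The generator would call this instead of reading the wall clock directly, so a 5 ms NTP step costs a 5 ms stall on one call rather than a uniqueness violation.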

Machine ID Assignment

Machine IDs must be unique across all generator nodes. Two options:

ZooKeeper / etcd sequential node: Each generator registers in ZooKeeper at startup and gets a sequential node number. Reliable, but adds a ZooKeeper dependency. Twitter’s original Snowflake used this.

Database auto-increment: Insert a row into a “machines” table on startup, get the auto-incremented ID as machine_id. Simple, no extra dependency.

IP-based (last octet): Use the last octet of the machine’s private IP as machine_id. Works for small deployments (<256 machines). Breaks if you have multiple services on the same host or more than 256 machines.
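The database auto-increment option is small enough to sketch end to end. This uses SQLite for illustration — in production the insert would go to the shared MySQL/Postgres instance, and reuse of wrapped IDs would need real lifecycle management (e.g. leases):

```python
import sqlite3

def claim_machine_id(conn: sqlite3.Connection, hostname: str) -> int:
    """Claim a machine_id by inserting a row into a shared 'machines' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS machines ("
        "  id INTEGER PRIMARY KEY AUTOINCREMENT,"
        "  hostname TEXT,"
        "  registered_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    cur = conn.execute("INSERT INTO machines (hostname) VALUES (?)", (hostname,))
    conn.commit()
    # The auto-incremented row id becomes the machine_id, kept in the 10-bit range
    return cur.lastrowid & 0x3FF

conn = sqlite3.connect(":memory:")
first = claim_machine_id(conn, "gen-a")
second = claim_machine_id(conn, "gen-b")
assert first != second
```

The table also doubles as an audit log of which host held which machine_id and when.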

Alternatives and Comparison

| Approach             | Globally unique    | Time-sortable          | Coordination needed       | Size     |
|----------------------|--------------------|------------------------|---------------------------|----------|
| UUID v4              | ✓ (probabilistic)  | ✗                      | None                      | 128 bits |
| UUID v7              | ✓ (probabilistic)  | ✓                      | None                      | 128 bits |
| Twitter Snowflake    | ✓                  | ✓                      | Machine ID only (startup) | 64 bits  |
| MongoDB ObjectID     | ✓                  | ✓ (second granularity) | None                      | 96 bits  |
| Redis INCR           | ✓                  | ✓                      | Always (network call)     | 64 bits  |
| DB auto-increment    | ✓ (single DB)      | ✓                      | Always (network call)     | 64 bits  |
| Flickr ticket server | ✓                  | ✓ (roughly)            | Always (network call)     | 64 bits  |

Instagram’s Approach

Instagram (2011) uses a PostgreSQL-based approach that avoids a dedicated ID service entirely. A PL/pgSQL function runs inside the database:

-- Instagram's ID generation stored procedure
CREATE OR REPLACE FUNCTION next_id(OUT result bigint) AS $$
DECLARE
    epoch bigint := 1314220021721;  -- custom epoch
    seq_id bigint;
    now_ms bigint;
    shard_id int := 5;  -- current shard number, set per DB
BEGIN
    SELECT nextval('global_id_sequence') % 1024 INTO seq_id;
    SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_ms;
    result := (now_ms - epoch) << 23;
    result := result | (shard_id << 10);
    result := result | seq_id;
END;
$$ LANGUAGE plpgsql;

Advantage: IDs are generated inside the same transaction as the row insert — perfect atomicity. No external dependency. Works because Instagram’s write path already hits Postgres.

Snowflake as a Service

For microservices that can’t embed ID generation logic, expose Snowflake as a gRPC service:

service IDGenerator {
  rpc GenerateID(GenerateRequest) returns (GenerateResponse);
  rpc GenerateBatch(BatchRequest) returns (BatchResponse);  // return N IDs at once
}

Batch generation (return 100 IDs per call) amortizes network latency. Callers cache a batch locally and draw from it, falling back to a network call when the batch is exhausted.
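The caller-side cache can be sketched as a thin wrapper around the RPC. `fetch_batch` here is a stand-in for the GenerateBatch call, not a real client stub:

```python
from collections import deque

class BatchedIDClient:
    """Draw IDs from a locally cached batch; hit the network only on refill."""

    def __init__(self, fetch_batch, batch_size: int = 100):
        self._fetch_batch = fetch_batch  # stand-in for the GenerateBatch RPC
        self._batch_size = batch_size
        self._cache = deque()

    def next_id(self) -> int:
        if not self._cache:
            # One network round-trip amortized over batch_size IDs
            self._cache.extend(self._fetch_batch(self._batch_size))
        return self._cache.popleft()

# Usage with a counter standing in for the remote service:
counter = iter(range(10**6))
client = BatchedIDClient(lambda n: [next(counter) for _ in range(n)])
ids = [client.next_id() for _ in range(250)]  # 250 IDs, only 3 "RPC" calls
assert ids == list(range(250))
```

One caveat of batching: IDs drawn from a cached batch carry the timestamp of the batch fetch, so sort order reflects fetch time rather than draw time.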

Interview Follow-ups

  • Snowflake IDs will roll over in ~69 years. How do you handle the epoch migration when that happens?
  • How do you handle a generator node that crashes and its machine_id is reassigned to a new node before all in-flight IDs are persisted?
  • If you have more than 1,024 generator nodes, how do you extend the Snowflake format?
  • How do you generate IDs that are not sequential (to prevent enumeration attacks) while still being sortable?

Related System Design Topics

  • Database Sharding — Snowflake IDs enable shard routing: the machine_id encodes which shard owns the record
  • Consistent Hashing — an alternative approach to ID-to-shard mapping without embedding shard info in the ID
  • CAP Theorem — ID generation chooses AP: generators work independently (available during partition) but require clock sync for ordering guarantees