Distributed Counter System Low-Level Design: Exact, Sharded, Redis, and HyperLogLog Strategies

Distributed Counter System: Low-Level Design

A distributed counter tracks a numeric value that many servers increment or decrement concurrently — page views, likes, inventory quantities, API rate limits, and active user counts all depend on counters that must be correct under high write contention. This design covers the trade-off spectrum from exact Postgres counters to approximate probabilistic structures, and shows how to choose the right approach based on required accuracy, write throughput, and read latency.

Core Data Model

-- Exact counter for critical values (inventory, payment limits)
CREATE TABLE Counter (
    counter_id     VARCHAR(200) PRIMARY KEY,  -- "inventory:SKU-123", "likes:post:456"
    value          BIGINT NOT NULL DEFAULT 0,
    updated_at     TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Sharded counter for high-throughput approximate values (view counts, likes)
CREATE TABLE CounterShard (
    counter_id     VARCHAR(200) NOT NULL,
    shard_id       SMALLINT NOT NULL,          -- 0..N_SHARDS-1
    value          BIGINT NOT NULL DEFAULT 0,
    updated_at     TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (counter_id, shard_id)
);

-- Delta buffer table for batched increments (very high write rates)
CREATE TABLE CounterDelta (
    delta_id       BIGSERIAL PRIMARY KEY,
    counter_id     VARCHAR(200) NOT NULL,
    delta          BIGINT NOT NULL,            -- +1, -1, or bulk adjustment
    created_at     TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (created_at);

CREATE INDEX ON CounterDelta(counter_id, created_at);

Strategy 1: Exact Postgres Counter (Low Write Rate)

def increment(counter_id: str, delta: int = 1) -> int:
    """
    Atomic increment using INSERT ... ON CONFLICT UPDATE.
    Correct under concurrent writes; suitable for up to ~5K writes/second.
    Returns new value.
    """
    row = db.fetchone("""
        INSERT INTO Counter (counter_id, value, updated_at)
        VALUES (%s, %s, NOW())
        ON CONFLICT (counter_id) DO UPDATE
            SET value = Counter.value + EXCLUDED.value,
                updated_at = NOW()
        RETURNING value
    """, (counter_id, delta))
    return row['value']

def decrement_with_floor(counter_id: str, delta: int = 1, floor: int = 0) -> int:
    """
    Decrement but never go below floor. Used for inventory: can't sell what you don't have.
    Uses UPDATE ... WHERE value >= floor + delta to prevent going negative atomically.
    """
    row = db.fetchone("""
        UPDATE Counter
        SET value = value - %s, updated_at = NOW()
        WHERE counter_id = %s AND value >= %s
        RETURNING value
    """, (delta, counter_id, floor + delta))
    if not row:
        raise InsufficientInventoryError(f"Counter {counter_id} has fewer than {delta} remaining")
    return row['value']

def get_count(counter_id: str) -> int:
    row = db.fetchone("SELECT value FROM Counter WHERE counter_id=%s", (counter_id,))
    return row['value'] if row else 0

Strategy 2: Redis Counter (High Write Rate, Eventual Persistence)

import redis

r = redis.Redis(decode_responses=True)
PERSIST_INTERVAL = 60  # seconds

def redis_increment(counter_id: str, delta: int = 1) -> int:
    """
    Redis INCRBY is atomic and handles >1M increments/second.
    Returns new value. Persistence to Postgres happens asynchronously.
    """
    return r.incrby(f"counter:{counter_id}", delta)

def redis_get(counter_id: str) -> int:
    val = r.get(f"counter:{counter_id}")
    return int(val) if val is not None else get_count(counter_id)  # fallback to DB

def persist_redis_counters():
    """
    Run every 60 seconds via cron. Flush Redis counter values to Postgres.
    Pattern: read Redis value, write to DB, do NOT delete Redis key.
    Redis is the source of truth for current value; DB is the durable backup.
    """
    cursor = 0
    while True:
        cursor, keys = r.scan(cursor, match='counter:*', count=200)
        for key in keys:
            counter_id = key.removeprefix('counter:')
            value = r.get(key)
            if value is None:
                continue
            db.execute("""
                INSERT INTO Counter (counter_id, value, updated_at)
                VALUES (%s, %s, NOW())
                ON CONFLICT (counter_id) DO UPDATE
                    SET value = EXCLUDED.value, updated_at = NOW()
            """, (counter_id, int(value)))
        if cursor == 0:
            break

Strategy 3: Sharded Counter (Extremely High Write Throughput)

import hashlib

N_SHARDS = 16

def sharded_increment(counter_id: str, shard_key: str, delta: int = 1):
    """
    Distribute writes across N_SHARDS shards to eliminate single-row contention.
    shard_key is any per-request identifier (user_id, request_id) — routes consistently.
    Reads must SUM all shards; writes touch only one row.
    """
    shard_id = int(hashlib.md5(shard_key.encode()).hexdigest()[:4], 16) % N_SHARDS
    db.execute("""
        INSERT INTO CounterShard (counter_id, shard_id, value)
        VALUES (%s, %s, %s)
        ON CONFLICT (counter_id, shard_id) DO UPDATE
            SET value = CounterShard.value + EXCLUDED.value,
                updated_at = NOW()
    """, (counter_id, shard_id, delta))

def sharded_get(counter_id: str) -> int:
    """Sum all shards. Slightly stale OK for view counts / likes."""
    row = db.fetchone("""
        SELECT COALESCE(SUM(value), 0) AS total
        FROM CounterShard WHERE counter_id=%s
    """, (counter_id,))
    return row['total']

# With N_SHARDS=16, write contention on any single row is reduced by 16x.
# Postgres can handle ~5K updates/second per row; sharding gives ~80K/second.

Strategy 4: HyperLogLog for Unique Counts (Cardinality Estimation)

# Count distinct users who viewed a post — exact COUNT(DISTINCT user_id) needs
# O(N) memory. HyperLogLog gives ~2% error with fixed 12KB of memory.

def hll_add(counter_id: str, element: str):
    """Add an element to the HyperLogLog. Redis PFADD is O(1)."""
    r.pfadd(f"hll:{counter_id}", element)

def hll_count(counter_id: str) -> int:
    """Return estimated cardinality. Error rate ~0.81%."""
    return r.pfcount(f"hll:{counter_id}")

# Usage: count unique viewers per post
# hll_add(f"post_views:{post_id}", str(user_id))
# unique_viewers = hll_count(f"post_views:{post_id}")

# Merge multiple HLLs: count unique viewers across all posts by author
# r.pfmerge("hll:author_views:42", *[f"hll:post_views:{pid}" for pid in post_ids])
# total_unique = r.pfcount("hll:author_views:42")

Key Design Decisions

  • Choose the right strategy by write rate and accuracy requirement:
    • <5K writes/sec + must be exact (inventory, billing) → Postgres atomic UPDATE
    • 5K–500K writes/sec + can be slightly stale (likes, views) → Redis INCRBY + periodic persist
    • >500K writes/sec → sharded counter (16 shards × 5K/shard) or Redis cluster
    • Unique counts (distinct visitors) → HyperLogLog (2% error, O(1) memory)
  • Inventory decrement floor: the UPDATE … WHERE value >= delta is atomic — no transaction needed. If rowcount=0, raise a domain error. Never use SELECT then UPDATE for inventory (TOCTOU race).
  • Redis persistence gap: if the Redis server crashes between persist cycles, up to 60 seconds of increments are lost. Acceptable for view counts; not for rate limit counters or inventory. For rate limits, use Redis with AOF persistence (appendonly yes in redis.conf) — every write is fsynced to disk.

Distributed counter and high-throughput counting system design is discussed in Twitter system design interview questions.

Distributed counter and social media like count design is covered in Meta system design interview preparation.

Distributed counter and inventory count system design is discussed in Amazon system design interview guide.

Scroll to Top