User Presence System Low-Level Design: Online Status, Heartbeat, and Real-Time Fan-out

A user presence system tracks whether users are online, offline, or idle — powering the green dots in Slack, WhatsApp, and Google Docs. Core challenges: detecting disconnects reliably (TCP connections can silently die), scaling to millions of concurrent connections, propagating status changes to subscribers without overwhelming the message bus, and respecting user privacy settings.

Core Data Model

-- Presence state (hot path: Redis)
-- Key: presence:{user_id}
-- Value: {"status":"online","last_seen":1713340800,"device":"web"}
-- TTL: 30 seconds (refreshed by heartbeat every 15s; expires = offline)

-- Presence subscription (who wants to know about whom)
CREATE TABLE PresenceSubscription (
    subscriber_id UUID NOT NULL,
    target_id     UUID NOT NULL,
    subscribed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (subscriber_id, target_id)
);
CREATE INDEX idx_presence_target ON PresenceSubscription (target_id);

-- Persistent last-seen (cold path: Postgres, updated on disconnect)
CREATE TABLE UserLastSeen (
    user_id     UUID PRIMARY KEY,
    last_seen   TIMESTAMPTZ NOT NULL,
    status      TEXT NOT NULL DEFAULT 'offline',  -- 'online','away','offline'
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
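
The Redis value described above could be modelled in application code roughly as follows. This is only a sketch: the PresenceRecord class and the presence_key helper are hypothetical names that do not appear in the original design, and the field set simply mirrors the example JSON.

```python
import json
from dataclasses import dataclass, asdict

PRESENCE_KEY = "presence:{}"  # same key layout as in the data model above


@dataclass
class PresenceRecord:
    """Hypothetical in-memory view of the JSON stored under presence:{user_id}."""
    status: str      # "online", "away", ...
    last_seen: int   # Unix timestamp in seconds
    device: str      # e.g. "web"

    def to_json(self) -> str:
        # Compact separators reproduce the layout shown in the data model.
        return json.dumps(asdict(self), separators=(",", ":"))

    @classmethod
    def from_json(cls, raw: str) -> "PresenceRecord":
        data = json.loads(raw)
        return cls(status=data["status"],
                   last_seen=int(data["last_seen"]),
                   device=data.get("device", "web"))


def presence_key(user_id: str) -> str:
    """Build the Redis key for one user's presence record."""
    return PRESENCE_KEY.format(user_id)
```

Keeping serialization in one place makes it harder for the write path and the read path to disagree about field names.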

WebSocket Heartbeat and TTL-Based Offline Detection

import asyncio
import json
from datetime import datetime, timezone

import redis.asyncio as aioredis

r = aioredis.Redis(host='redis', decode_responses=True)

HEARTBEAT_INTERVAL = 15    # client sends heartbeat every 15s
PRESENCE_TTL = 30          # key expires 30s after the last refresh: one missed heartbeat is tolerated, two means offline
PRESENCE_KEY = "presence:{}"

async def handle_connection(websocket, user_id: str):
    """
    On connection: set presence to online.
    Expect client heartbeat every 15s; refresh TTL on each.
    On disconnect: set offline in Postgres, delete Redis key.
    """
    await set_online(user_id, websocket.client_info)
    try:
        async for message in websocket:
            data = json.loads(message)
            if data.get("type") == "heartbeat":
                await refresh_presence(user_id)
            elif data.get("type") == "set_status":
                # User manually sets away/busy
                await set_status(user_id, data["status"])
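    except Exception:
        # Any error (closed socket, malformed message) ends the session.
        # Production code should log it rather than swallow it silently.
        pass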
    finally:
        await set_offline(user_id)

async def set_online(user_id: str, client_info: dict):
    payload = json.dumps({
        "status": "online",
        "last_seen": int(datetime.now(timezone.utc).timestamp()),
        "device": client_info.get("device", "web")
    })
    await r.setex(PRESENCE_KEY.format(user_id), PRESENCE_TTL, payload)
    await publish_presence_change(user_id, "online")

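async def refresh_presence(user_id: str) -> bool:
    """Reset the TTL without rewriting the value: one O(1) EXPIRE call."""
    # EXPIRE returns 0 when the key no longer exists, so no prior GET is needed.
    # A False result means the key already expired; the caller can re-run set_online.
    return bool(await r.expire(PRESENCE_KEY.format(user_id), PRESENCE_TTL))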

async def set_offline(user_id: str):
    await r.delete(PRESENCE_KEY.format(user_id))
    await publish_presence_change(user_id, "offline")
    # Persist last-seen to Postgres
    await update_last_seen_db(user_id)

async def publish_presence_change(user_id: str, status: str):
    """Notify all subscribers via Redis pub/sub."""
    event = json.dumps({"user_id": user_id, "status": status,
                        "ts": int(datetime.now(timezone.utc).timestamp())})
    await r.publish("presence:changes", event)
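
To make the heartbeat timing concrete, here is a small in-memory model of the TTL behaviour driven by a manual clock. FakeTTLStore and online_after are hypothetical helpers that stand in for Redis SETEX/EXPIRE; they are not part of the original code and only illustrate the arithmetic.

```python
HEARTBEAT_INTERVAL = 15
PRESENCE_TTL = 30


class FakeTTLStore:
    """Hypothetical stand-in for Redis SETEX/EXPIRE, driven by a manual clock."""

    def __init__(self) -> None:
        self.now = 0
        self._expires_at: dict[str, int] = {}

    def setex(self, key: str, ttl: int) -> None:
        self._expires_at[key] = self.now + ttl

    def expire(self, key: str, ttl: int) -> bool:
        # Like Redis EXPIRE: only succeeds if the key still exists.
        if not self.exists(key):
            return False
        self._expires_at[key] = self.now + ttl
        return True

    def exists(self, key: str) -> bool:
        return self._expires_at.get(key, -1) > self.now


def online_after(missed_heartbeats: int) -> bool:
    """Connect at t=0, skip the given number of beats, then check
    one second before the next heartbeat would arrive."""
    store = FakeTTLStore()
    store.setex("presence:u1", PRESENCE_TTL)
    store.now = (missed_heartbeats + 1) * HEARTBEAT_INTERVAL - 1
    return store.exists("presence:u1")
```

With these numbers the key survives one missed heartbeat and is gone after two. The heartbeat due at t=30 lands right on the expiry boundary, so a TTL slightly above twice the interval (an assumption, not something the original specifies) would leave room for network jitter.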

Reading Presence and Fan-out to Subscribers

async def get_presence(user_id: str) -> dict:
    """
    Check Redis first (live presence).
    Fall back to Postgres UserLastSeen for recently-offline users.
    """
    raw = await r.get(PRESENCE_KEY.format(user_id))
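    if raw:
        data = json.loads(raw)
        # Invisible users must look offline to other readers, so fall through
        # to the offline response instead of returning the stored status.
        if data.get("status") != "invisible":
            data["source"] = "live"
            return data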

    # Fall back to DB for last-seen timestamp
    # (simplified: would use async DB client in production)
    return {"status": "offline", "last_seen": None, "source": "db"}

async def get_bulk_presence(user_ids: list[str]) -> dict[str, dict]:
    """Fetch presence for up to 100 users in a single pipeline."""
    keys = [PRESENCE_KEY.format(uid) for uid in user_ids]
    pipeline = r.pipeline()
    for key in keys:
        pipeline.get(key)
    results = await pipeline.execute()

    presence = {}
    for uid, raw in zip(user_ids, results):
        data = json.loads(raw) if raw else None
        if data and data.get("status") != "invisible":
            presence[uid] = data  # keep the stored status (online/away/busy)
        else:
            presence[uid] = {"status": "offline"}
    return presence

async def presence_fan_out_worker():
    """
    Subscribe to presence:changes channel.
    For each event, notify subscribers via their WebSocket connections.
    """
    pubsub = r.pubsub()
    await pubsub.subscribe("presence:changes")
    async for message in pubsub.listen():
        if message["type"] != "message":
            continue
        event = json.loads(message["data"])
        user_id = event["user_id"]
        # Find all subscribers interested in this user's presence
        # (query PresenceSubscription or use a Redis set: subscribers:{user_id})
        subscriber_ids = await get_subscribers(user_id)
        for sub_id in subscriber_ids:
            await push_to_websocket(sub_id, event)
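
The worker above calls get_subscribers and push_to_websocket without defining them. One possible shape, assuming each server process keeps an in-memory registry of its own open connections, is sketched below. The SUBSCRIBERS and CONNECTIONS dictionaries and the fan_out helper are hypothetical; a real deployment would back the subscriber lookup with PresenceSubscription or a Redis set, as the comment in the worker suggests.

```python
import asyncio
import json
from typing import Any

# Hypothetical per-process registries (assumptions for this sketch).
SUBSCRIBERS: dict[str, set[str]] = {}   # target user_id -> subscriber ids
CONNECTIONS: dict[str, Any] = {}        # subscriber id -> websocket open on this server


async def get_subscribers(user_id: str) -> set[str]:
    """Return the ids of users who subscribed to this user's presence."""
    return SUBSCRIBERS.get(user_id, set())


async def push_to_websocket(sub_id: str, event: dict) -> bool:
    """Send the event if the subscriber is connected to this server."""
    ws = CONNECTIONS.get(sub_id)
    if ws is None:
        return False  # offline, or connected to a different server
    await ws.send(json.dumps(event))
    return True


async def fan_out(event: dict) -> int:
    """Push one presence event to all locally connected subscribers
    concurrently and return how many deliveries succeeded."""
    subs = await get_subscribers(event["user_id"])
    results = await asyncio.gather(*(push_to_websocket(s, event) for s in subs))
    return sum(results)
```

Sending concurrently with asyncio.gather keeps one slow socket from delaying every other subscriber, which the sequential loop in the worker would not.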

Privacy and Status Controls

async def set_status(user_id: str, requested_status: str):
    """
    Allow users to set: online, away, busy, invisible.
    Invisible: user appears offline to others but can still use the app.
    """
    VALID_STATUSES = {"online", "away", "busy", "invisible"}
    if requested_status not in VALID_STATUSES:
        raise ValueError(f"Invalid status: {requested_status}")

    key = PRESENCE_KEY.format(user_id)
    raw = await r.get(key)
    if not raw:
        return  # Not connected — ignore

    data = json.loads(raw)
    data["status"] = requested_status

    # Invisible: store true status in Redis but broadcast "offline" to subscribers
    broadcast_status = "offline" if requested_status == "invisible" else requested_status
    await r.setex(key, PRESENCE_TTL, json.dumps(data))
    await publish_presence_change(user_id, broadcast_status)
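
Because the true status is stored in Redis, any read path that serves other users needs the same masking that set_status applies to the broadcast. A minimal sketch of such a helper follows; mask_for_viewer is a hypothetical name, and the rule that owners see their own true status is an assumption.

```python
def mask_for_viewer(record: dict, owner_id: str, viewer_id: str) -> dict:
    """Return a copy of a presence record that is safe to show to viewer_id."""
    if viewer_id == owner_id:
        return dict(record)  # assumption: owners may see their own true status
    if record.get("status") == "invisible":
        # Mirror what set_status broadcasts: invisible looks like offline,
        # and no other fields (device, last_seen) are revealed.
        return {"status": "offline"}
    return dict(record)
```

Applying the mask at read time rather than at write time keeps the real status available to the user's own clients.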

Key Interview Points

  • TTL as implicit offline detection: TCP connections can die silently: a client behind a NAT or on a mobile network may lose connectivity without sending a FIN. With a 30s TTL refreshed every 15s, the key survives one missed heartbeat and expires after two, so the user goes offline without the server having to detect the dead connection explicitly. The expiry itself is silent; broadcasting “offline” requires listening for the expiration (see keyspace notifications below).
  • Multi-device presence: A user on both desktop and mobile should show “online” until both devices go offline. Use a per-device key (presence:{user_id}:{device_id}) and a counter: INCR connected:{user_id} on connect, DECR on disconnect. The user is “online” if the counter > 0. The counter lives in Redis with no TTL; it is decremented on disconnect and deleted when it reaches 0. Because it has no TTL, a server crash that skips the DECR leaves it stuck above 0, so reconcile it periodically against the per-device keys, which do expire.
  • Subscriber fan-out scaling: A celebrity with 1M followers triggers a presence change that must fan out to 1M subscribers. Strategies: (1) subscribe only to contacts/friends, not all followers (most apps); (2) cap presence subscriptions at 500 per user; (3) lazy fan-out, where clients poll presence on demand rather than receiving push updates. Twitter-style apps don’t show real-time presence for non-contacts.
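  • Redis keyspace notifications: Enable keyspace notifications (CONFIG SET notify-keyspace-events KEgx; Ex alone is enough for expired-key events) to receive a pub/sub event when a presence key expires, meaning the user went offline without an explicit disconnect. Subscribe to __keyevent@0__:expired and filter for presence:* keys so the fan-out worker can detect TTL expirations and broadcast “offline” events. Redis emits the event when it actually deletes the key, which can lag slightly behind the TTL, and pub/sub delivery is not guaranteed, so treat these events as best effort.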
  • Last-seen granularity: WhatsApp shows “last seen today at 3:42pm”. Store UserLastSeen.last_seen in Postgres (updated on each disconnect). For privacy: let users control visibility — “nobody”, “contacts only”, “everyone”. Cache the setting per user in Redis to avoid a DB lookup on every presence fetch.
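
The keyspace-notification point above can be sketched as a small listener that turns expired presence keys into “offline” events on the presence:changes channel. The names offline_event_from_expired_key and expiry_listener are hypothetical. The sketch assumes the client passed in is a redis.asyncio.Redis created with decode_responses=True, that expired-key events are enabled on the server, and that presence keys live in database 0.

```python
import json
import time
from typing import Optional

EXPIRED_CHANNEL = "__keyevent@0__:expired"  # database 0, as in the bullet above
PRESENCE_PREFIX = "presence:"


def offline_event_from_expired_key(key: str, now: Optional[int] = None) -> Optional[dict]:
    """Map an expired key name to an offline event, or None if the key
    is not a presence:{user_id} key."""
    if not key.startswith(PRESENCE_PREFIX):
        return None
    user_id = key[len(PRESENCE_PREFIX):]
    if not user_id or ":" in user_id:
        # Per-device keys (presence:{user_id}:{device_id}) need their own handling.
        return None
    return {"user_id": user_id, "status": "offline",
            "ts": int(time.time()) if now is None else now}


async def expiry_listener(redis_client) -> None:
    """Republish key expirations as offline events on presence:changes."""
    pubsub = redis_client.pubsub()
    await pubsub.subscribe(EXPIRED_CHANNEL)
    async for message in pubsub.listen():
        if message["type"] != "message":
            continue  # skip subscribe confirmations
        # For keyevent notifications the message payload is the key name.
        event = offline_event_from_expired_key(message["data"])
        if event is not None:
            await redis_client.publish("presence:changes", json.dumps(event))
```

Republishing onto the existing channel lets the fan-out worker stay unchanged. If several listener instances run at once, each one publishes its own copy of the event, so subscribers should treat repeated “offline” events as harmless.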
