Messaging System (Chat) Low-Level Design

Requirements

  • One-on-one messaging and group chats (up to 500 members)
  • Real-time message delivery (< 100ms), offline message storage, read receipts
  • Message ordering guaranteed per conversation
  • 10M DAU, 100M messages/day
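
A quick back-of-the-envelope check turns the last requirement into a write rate. This is only a sketch: the daily totals come from the requirements above, while the peak multiplier is an assumed illustrative value, not something the requirements state.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

DAU = 10_000_000
MESSAGES_PER_DAY = 100_000_000


def average_rate(messages_per_day: int) -> float:
    """Average messages per second over a full day."""
    return messages_per_day / SECONDS_PER_DAY


def peak_rate(messages_per_day: int, peak_factor: float) -> float:
    """Estimated peak rate; peak_factor is an assumed multiplier."""
    return average_rate(messages_per_day) * peak_factor


if __name__ == "__main__":
    print(f"messages per user per day: {MESSAGES_PER_DAY / DAU:.0f}")
    print(f"average write rate: {average_rate(MESSAGES_PER_DAY):.0f} msg/s")
    print(f"peak at an assumed 3x: {peak_rate(MESSAGES_PER_DAY, 3.0):.0f} msg/s")
```

The average works out to roughly 1,157 messages per second, or about 10 messages per active user per day. Real traffic is bursty, so capacity should be sized for the peak rather than this average.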

Data Model

Conversation(conv_id, type ENUM(DM,GROUP), created_at, last_message_id)
Participant(conv_id, user_id, joined_at, last_read_message_id)
Message(message_id BIGINT, conv_id, sender_id, body TEXT, type ENUM(TEXT,IMAGE,FILE),
        created_at, client_msg_id UUID)  -- client_msg_id for deduplication
MessageMedia(media_id, message_id, url, mime_type, size_bytes)

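message_id increases monotonically within a conversation (it does not need to be globally sequential), which is what guarantees per-conversation ordering. There are two common ways to generate it. A per-conversation counter in Redis (INCR on one key per conv_id) gives strictly increasing, gap-free numbers. A Snowflake-style ID is globally unique and roughly time-ordered, but it orders a conversation strictly only if that conversation's IDs all come from one generator, because clocks on different generators can drift.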
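
The per-conversation counter can be sketched as follows. This is a minimal in-memory stand-in, not a production allocator: the class name ConversationSequencer is hypothetical, and the lock plays the role that Redis's atomic INCR would play in a real deployment.

```python
import threading
from collections import defaultdict


class ConversationSequencer:
    """Hands out strictly increasing message IDs, one sequence per conversation."""

    def __init__(self) -> None:
        self._counters = defaultdict(int)  # conv_id -> last issued message_id
        self._lock = threading.Lock()      # stand-in for Redis INCR atomicity

    def next_id(self, conv_id: str) -> int:
        # Increment and read under one lock so two senders never get the same ID.
        with self._lock:
            self._counters[conv_id] += 1
            return self._counters[conv_id]


if __name__ == "__main__":
    seq = ConversationSequencer()
    print(seq.next_id("conv-a"), seq.next_id("conv-a"), seq.next_id("conv-b"))
```

Each conversation has its own sequence, so allocating an ID in one conversation never contends with another.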

Real-Time Delivery with WebSockets

Each client maintains a persistent WebSocket connection to a Chat Server. When Alice sends a message to Bob:

  1. Alice’s client sends the message over WebSocket to her Chat Server
  2. The Chat Server assigns a message_id and stores the message in the DB
  3. The Chat Server publishes the message to a Pub/Sub channel (a Redis Pub/Sub channel per conversation, or a Kafka topic partitioned by conv_id)
  4. Bob’s Chat Server, which is subscribed to that channel, receives the message and pushes it to Bob’s WebSocket
  5. If Bob is offline, nothing is pushed; the message is already persisted in step 2 (optionally tracked in a pending_messages queue) and is delivered when he reconnects
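
The five steps above can be sketched with an in-memory pub/sub bus. Everything here is an illustrative stand-in: PubSub replaces Redis Pub/Sub or Kafka, a shared dict replaces the database, a Python list replaces each WebSocket, and the class and method names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    message_id: int
    conv_id: str
    sender_id: str
    body: str


class PubSub:
    """In-memory stand-in for a pub/sub bus with one channel per conversation."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in list(self._subscribers[channel]):
            callback(message)


class ChatServer:
    def __init__(self, name, bus, store):
        self.name = name
        self.bus = bus
        self.store = store                     # shared: conv_id -> [Message]
        self.sockets = {}                      # user_id -> list standing in for a WebSocket
        self.local_members = defaultdict(set)  # conv_id -> users connected to this server
        self._subscribed = set()

    def connect(self, user_id, conv_ids):
        self.sockets[user_id] = []
        for conv_id in conv_ids:
            self.local_members[conv_id].add(user_id)
            if conv_id not in self._subscribed:  # subscribe once per server
                self._subscribed.add(conv_id)
                self.bus.subscribe(conv_id, self._on_event)

    def handle_send(self, sender_id, conv_id, body):
        # Step 2: assign the next per-conversation ID and persist first.
        message_id = len(self.store[conv_id]) + 1
        msg = Message(message_id, conv_id, sender_id, body)
        self.store[conv_id].append(msg)
        # Step 3: publish so every subscribed Chat Server sees the event.
        self.bus.publish(conv_id, msg)
        return msg

    def _on_event(self, msg):
        # Step 4: push to locally connected participants, except the sender.
        for user_id in self.local_members[msg.conv_id]:
            if user_id != msg.sender_id:
                self.sockets[user_id].append(msg)


if __name__ == "__main__":
    bus, store = PubSub(), defaultdict(list)
    server_a, server_b = ChatServer("A", bus, store), ChatServer("B", bus, store)
    server_a.connect("alice", ["c1"])
    server_b.connect("bob", ["c1"])
    server_a.handle_send("alice", "c1", "hi Bob")
    print(server_b.sockets["bob"])
```

Because the message is persisted before it is published, a recipient with no live connection loses nothing: the event simply has no socket to land on, and the stored row is picked up on reconnect.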

Chat Server Architecture

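  • Stateful servers: each server holds the WebSocket connections for a set of users. The connection registry lives in Redis as one key per user: SET connections:{user_id} {server_id} with an expiry, refreshed on every heartbeat (every 30s). One key per user is used rather than a single shared hash because a key-level expiry applies to the whole key; a user whose heartbeats stop drops out of the registry when their key expires.
  • Routing: to find Bob’s Chat Server, GET connections:{bob_id}. Forward the message to that server via HTTP or an internal message bus. If the key is missing, Bob is offline and no forward is needed.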
  • Group chats: fan-out to all online participants. For 500-member groups, fan-out to at most 500 Chat Servers. Offline members get the message queued.
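
The registry lookup and group fan-out described above can be sketched as below. This is an in-memory approximation: ConnectionRegistry stands in for the Redis keys, the 90-second TTL is an assumed value chosen to tolerate a couple of missed 30-second heartbeats, and plan_fanout is a hypothetical helper name.

```python
import time
from collections import defaultdict


class ConnectionRegistry:
    """Maps user_id -> server_id; entries expire unless refreshed by a heartbeat."""

    def __init__(self, ttl_seconds=90.0, clock=time.monotonic):
        self._ttl = ttl_seconds  # assumed TTL, longer than the heartbeat interval
        self._clock = clock
        self._entries = {}       # user_id -> (server_id, expires_at)

    def heartbeat(self, user_id, server_id):
        # Registers the connection or refreshes its expiry.
        self._entries[user_id] = (server_id, self._clock() + self._ttl)

    def lookup(self, user_id):
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        server_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[user_id]  # expired: treat the user as offline
            return None
        return server_id


def plan_fanout(registry, participants, sender_id):
    """Group online recipients by Chat Server and list the offline ones."""
    by_server = defaultdict(list)
    offline = []
    for user_id in participants:
        if user_id == sender_id:
            continue
        server_id = registry.lookup(user_id)
        if server_id is None:
            offline.append(user_id)
        else:
            by_server[server_id].append(user_id)
    return dict(by_server), offline


if __name__ == "__main__":
    registry = ConnectionRegistry()
    registry.heartbeat("bob", "server-1")
    registry.heartbeat("carol", "server-1")
    registry.heartbeat("dave", "server-2")
    print(plan_fanout(registry, ["alice", "bob", "carol", "dave", "erin"], "alice"))
```

Grouping recipients by server means a message is forwarded once per Chat Server rather than once per user, so the 500-forward figure is a worst case that applies only when every member sits on a different server.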

Message Storage and Retrieval

The workload is write-heavy, and most messages are read once, shortly after delivery (users only occasionally scroll back). Store messages in Cassandra (wide-column), partitioned by conv_id:

CREATE TABLE messages (
    conv_id       UUID,
    message_id    BIGINT,
    sender_id     UUID,
    body          TEXT,
    client_msg_id UUID,       -- client-generated, used for deduplication
    created_at    TIMESTAMP,
    PRIMARY KEY (conv_id, message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);

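Partition key = conv_id keeps all messages for a conversation in one partition, on the same replica set. Clustering by message_id DESC makes latest-first pagination a cheap single-partition range scan. MySQL works at smaller scale, but Cassandra spreads millions of conversations across nodes and scales writes horizontally. Load stays balanced as long as no single conversation dominates traffic; a very active or long-lived conversation can still grow into a large partition, which is commonly mitigated by adding a time bucket to the partition key.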

Offline Message Delivery

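When a user reconnects, the client sends its last_seen_message_id for each conversation. The server queries: SELECT * FROM messages WHERE conv_id=X AND message_id > last_seen ORDER BY message_id ASC LIMIT 100. The explicit ascending order matters: the table clusters by message_id DESC, so without it the LIMIT would return the newest 100 rows and skip older missed messages. If a full page of 100 comes back, the client repeats the query from the last message_id it received until it has caught up. For push notifications (APNs/FCM) while the app is backgrounded, a separate notification service consumes new-message events from Kafka and sends the push payloads.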
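
The reconnect catch-up can be sketched as a paging loop. This is a simplified model: a sorted Python list of message IDs stands in for one conversation's partition, and fetch_after and catch_up are hypothetical names for the query and the client-side loop.

```python
from bisect import bisect_right


def fetch_after(message_ids, last_seen, limit=100):
    """Oldest-first page of IDs greater than last_seen (the ascending range query)."""
    start = bisect_right(message_ids, last_seen)
    return message_ids[start:start + limit]


def catch_up(message_ids, last_seen, page_size=100):
    """Page forward from last_seen until the client has every missed message."""
    delivered = []
    cursor = last_seen
    while True:
        page = fetch_after(message_ids, cursor, page_size)
        if not page:
            break
        delivered.extend(page)
        cursor = page[-1]          # resume after the last message received
        if len(page) < page_size:  # a short page means the client is caught up
            break
    return delivered


if __name__ == "__main__":
    stored = list(range(1, 251))   # message_ids 1..250 in one conversation
    missed = catch_up(stored, last_seen=20)
    print(len(missed), missed[0], missed[-1])
```

Paging oldest-first keeps delivery in order even when the backlog exceeds a single page, at the cost of one extra round trip per 100 messages.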

Read Receipts

Store last_read_message_id per (user, conversation) in the Participant table. When Alice opens a conversation: UPDATE Participant SET last_read_message_id=latest_id WHERE conv_id=X AND user_id=alice. To compute the unread count: SELECT COUNT(*) FROM messages WHERE conv_id=X AND message_id > alice.last_read_message_id. Running that count on every inbox load is expensive, so cache unread counts in Redis per user (increment on each new message, reset when the conversation is read). Fan out read-receipt events to the other participants over WebSocket so they see “✓✓”.
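
The read-pointer update and unread count can be sketched as follows. This is an in-memory illustration with hypothetical names (ReadState, unread_count); it also assumes the pointer should only move forward, so a stale or out-of-order receipt cannot mark messages unread again.

```python
from bisect import bisect_right


class ReadState:
    """Tracks last_read_message_id per (user_id, conv_id)."""

    def __init__(self):
        self._last_read = {}  # (user_id, conv_id) -> message_id

    def last_read(self, user_id, conv_id):
        return self._last_read.get((user_id, conv_id), 0)

    def mark_read(self, user_id, conv_id, message_id):
        """Advance the read pointer; returns True if a receipt should be fanned out."""
        if message_id <= self.last_read(user_id, conv_id):
            return False  # stale receipt: the pointer never moves backwards
        self._last_read[(user_id, conv_id)] = message_id
        return True


def unread_count(message_ids, last_read_id):
    """Count messages newer than the read pointer (message_ids sorted ascending)."""
    return len(message_ids) - bisect_right(message_ids, last_read_id)


if __name__ == "__main__":
    state = ReadState()
    conversation = [1, 2, 3, 4, 5]
    print(unread_count(conversation, state.last_read("alice", "c1")))
    state.mark_read("alice", "c1", 3)
    print(unread_count(conversation, state.last_read("alice", "c1")))
```

Returning a boolean from mark_read gives the caller a natural trigger for the WebSocket fan-out: only a receipt that actually advanced the pointer needs to be broadcast.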

Message Deduplication

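Clients generate a UUID client_msg_id before sending. If the send request times out, the client retries with the same client_msg_id. The server makes the insert idempotent. In a relational store, put a unique index on client_msg_id and use INSERT INTO messages … ON CONFLICT (client_msg_id) DO NOTHING (PostgreSQL) or an equivalent; a bare WHERE NOT EXISTS check without a unique constraint can let two concurrent retries both insert. With the Cassandra table above, client_msg_id is not part of the primary key, so the check needs a separate lookup keyed by client_msg_id, for example a small dedup table written with INSERT … IF NOT EXISTS, or a Redis set with a TTL (24h). Either way the duplicate is silently dropped, and the server returns the message_id assigned by the first insert so the client can match its sent message to the confirmed one.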
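
The idempotent send can be sketched with a lookup keyed by client_msg_id. This is an in-memory illustration, not a database-backed implementation: MessageStore and its method names are hypothetical, and the dict plays the role of the unique index or dedup table.

```python
import uuid
from typing import Dict, List, Tuple


class MessageStore:
    """Stores messages and deduplicates retries by (conv_id, client_msg_id)."""

    def __init__(self) -> None:
        self._by_client_id: Dict[Tuple[str, str], int] = {}  # dedup index
        self._messages: Dict[str, List[Tuple[int, str, str]]] = {}

    def send(self, conv_id: str, sender_id: str, body: str,
             client_msg_id: str) -> Tuple[int, bool]:
        """Return (message_id, created); a retry gets the original message_id."""
        key = (conv_id, client_msg_id)
        existing = self._by_client_id.get(key)
        if existing is not None:
            return existing, False  # duplicate: drop it, echo the first ID
        rows = self._messages.setdefault(conv_id, [])
        message_id = len(rows) + 1  # next per-conversation sequence number
        rows.append((message_id, sender_id, body))
        self._by_client_id[key] = message_id
        return message_id, True

    def count(self, conv_id: str) -> int:
        return len(self._messages.get(conv_id, []))


if __name__ == "__main__":
    store = MessageStore()
    client_msg_id = str(uuid.uuid4())  # generated once, reused on retry
    print(store.send("c1", "alice", "hello", client_msg_id))
    print(store.send("c1", "alice", "hello", client_msg_id))  # simulated retry
    print(store.count("c1"))
```

A real server needs the check and the insert to be atomic, which is what the unique index or IF NOT EXISTS provides; this single-threaded sketch sidesteps that by construction.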

Key Design Decisions

  • WebSocket + Pub/Sub for real-time; push notifications for background delivery
  • Connection registry in Redis to route messages across Chat Servers
  • Cassandra partitioned by conv_id for write-scalable message storage
  • client_msg_id deduplication prevents duplicates on network retry
  • Per-conversation message_id sequence for ordering without global coordination



