System Design Interview: Design a Chat Application (WhatsApp)

Designing a real-time messaging application like WhatsApp is a comprehensive system design question covering WebSocket connections, message persistence, delivery receipts, end-to-end encryption, and online presence. It is asked at Meta, LinkedIn, Slack, and Discord.

Requirements Clarification

Functional Requirements

  • One-to-one messaging with real-time delivery
  • Group chats (up to 256 members)
  • Message delivery receipts: sent, delivered, read (single/double/blue checkmarks)
  • Online/last-seen status
  • Media sharing: images, videos, documents
  • Message history: access past messages on new devices
  • Push notifications when the recipient's app is offline

Non-Functional Requirements

  • Scale: 2B users, 100B messages/day
  • Latency: message delivery <100ms for online users
  • Availability: 99.99% (users expect messaging to always work)
  • Durability: messages must not be lost
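A quick back-of-envelope calculation turns these targets into per-second figures. The sketch below takes the 2B users and 100B messages/day from the requirements; the 100-byte average message size and the 3x peak factor are assumptions chosen for illustration only.

```python
# Back-of-envelope capacity estimate from the stated scale targets.
SECONDS_PER_DAY = 24 * 60 * 60           # 86,400

messages_per_day = 100_000_000_000       # 100B messages/day (from requirements)
users = 2_000_000_000                    # 2B users (from requirements)

avg_message_bytes = 100                  # ASSUMPTION: average text payload size
peak_factor = 3                          # ASSUMPTION: peak-to-average traffic ratio

avg_msgs_per_sec = messages_per_day / SECONDS_PER_DAY
peak_msgs_per_sec = avg_msgs_per_sec * peak_factor
msgs_per_user_per_day = messages_per_day / users
text_bytes_per_day = messages_per_day * avg_message_bytes

print(f"average throughput: {avg_msgs_per_sec:,.0f} msgs/s")      # ~1.16M msgs/s
print(f"assumed peak:       {peak_msgs_per_sec:,.0f} msgs/s")     # ~3.47M msgs/s
print(f"per user:           {msgs_per_user_per_day:.0f} msgs/day")
print(f"text volume:        {text_bytes_per_day / 1e12:.0f} TB/day (before replication)")
```

Roughly a million writes per second on average is the figure that motivates a write-optimized store and a partitioned queue later in the design.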

Core Architecture: WebSocket Connections

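Real-time messaging requires persistent connections. With plain HTTP request/response the server cannot push to the client, so the client must poll, which adds latency (up to one polling interval) and per-request overhead (headers, connection setup). WebSocket provides full-duplex communication over a single TCP connection: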

Client <--WebSocket--> Chat Server
  - Client connects on app open
  - Server can push messages instantly
  - Heartbeat every 30s to detect disconnections
  - Client reconnects on disconnect
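The heartbeat and reconnect behavior listed above can be sketched without any networking code. This is a minimal illustration, not a production implementation: `HeartbeatTracker`, `backoff_bound`, and the 60-second timeout (two missed heartbeats) are hypothetical names and values introduced here.

```python
import time
from typing import Callable, Dict, List

HEARTBEAT_INTERVAL_S = 30   # from the text: heartbeat every 30s
HEARTBEAT_TIMEOUT_S = 60    # ASSUMPTION: two missed heartbeats means disconnected


class HeartbeatTracker:
    """Server side: remembers when each connected user last sent a heartbeat."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._last_beat: Dict[str, float] = {}

    def on_connect(self, user_id: str) -> None:
        self._last_beat[user_id] = self._clock()

    def on_heartbeat(self, user_id: str) -> None:
        if user_id in self._last_beat:      # ignore beats from unknown connections
            self._last_beat[user_id] = self._clock()

    def is_connected(self, user_id: str) -> bool:
        return user_id in self._last_beat

    def reap_stale(self) -> List[str]:
        """Drop and return users whose last heartbeat is older than the timeout."""
        now = self._clock()
        stale = sorted(u for u, t in self._last_beat.items()
                       if now - t > HEARTBEAT_TIMEOUT_S)
        for user_id in stale:
            del self._last_beat[user_id]
        return stale


def backoff_bound(attempt: int, base_s: float = 1.0, cap_s: float = 30.0) -> float:
    """Client side: upper bound on the wait before reconnect attempt N (0-based)."""
    return min(cap_s, base_s * (2 ** attempt))
```

A client would typically sleep for a random duration up to `backoff_bound(attempt)` before reconnecting, so that a chat server restart does not cause every client to reconnect at the same instant.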

High-Level Architecture

Users
  |
Load Balancer (WebSocket-aware, sticky sessions)
  |
Chat Servers (stateful: maintain WebSocket connections)
  |
Message Queue (Kafka)
  |
Message Processor
  - Fan-out to recipient's chat server
  - Persist to message DB
  - Trigger push notification if offline
  |
Message DB (Cassandra)    Media Store (S3 + CDN)
  |
Presence Service          Push Notification Service
(Redis)                   (FCM, APNs)

Message Delivery Flow

Alice sends message to Bob:

1. Alice's app sends message over WebSocket to Chat Server A
2. Chat Server A publishes to Kafka: {from:alice, to:bob, content, msg_id, ts}
3. Message Processor consumes from Kafka:
   a. Persist message to Cassandra
   b. Look up which Chat Server Bob is connected to (via Redis hash)
   c. Forward message to Chat Server B
4. Chat Server B pushes message to Bob over WebSocket
5. Bob's client sends ACK
6. Chat Server B updates delivery status: "delivered"
7. Alice receives delivered receipt

If Bob is offline:
   Step 3b: Bob is not connected
   Step 3c: Send push notification via FCM/APNs
   Bob opens app -> WebSocket connects -> fetches unread messages from Cassandra
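Step 3 of the flow, including the offline branch, can be sketched as a small in-memory simulation. Everything here is a stand-in: the dict plays the role of the Redis registry, the lists play the roles of Cassandra, server-to-server forwarding, and FCM/APNs, and the names `Message` and `MessageProcessor` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Message:
    msg_id: int
    sender: str
    recipient: str
    content: str


class MessageProcessor:
    """Consumes one message at a time, as the Kafka consumer in step 3 would."""

    def __init__(self, presence: Dict[str, str]) -> None:
        self.presence = presence                          # user_id -> chat server id
        self.store: List[Message] = []                    # stand-in for Cassandra
        self.forwarded: List[Tuple[str, Message]] = []    # (server_id, message)
        self.pushes: List[str] = []                       # users sent a push notification

    def handle(self, msg: Message) -> str:
        # 3a. Persist first, so the message survives even if delivery fails.
        self.store.append(msg)
        # 3b. Look up which chat server holds the recipient's connection.
        server_id = self.presence.get(msg.recipient)
        if server_id is None:
            # Offline branch: notify via push; the client fetches on reconnect.
            self.pushes.append(msg.recipient)
            return "push"
        # 3c. Online branch: forward to the recipient's chat server.
        self.forwarded.append((server_id, msg))
        return "forwarded"

    def unread_for(self, user_id: str, after_msg_id: int) -> List[Message]:
        """What a reconnecting client would fetch: messages newer than its last one."""
        return [m for m in self.store
                if m.recipient == user_id and m.msg_id > after_msg_id]
```

Persisting before forwarding is the design choice that backs the durability requirement: a crash between the two steps delays delivery but does not lose the message.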

Message Storage (Cassandra)

# Schema optimized for conversation queries
messages_by_conversation:
  partition_key: (conversation_id)
  clustering_key: (message_id DESC)  -- newest first
  columns: sender_id, content, content_type, status, created_at

# conversation_id = min(user_a, user_b) + "_" + max(user_a, user_b) for 1:1
# conversation_id = group_id for group chats

# Query: last 50 messages in a conversation
SELECT * FROM messages_by_conversation
WHERE conversation_id = ?
ORDER BY message_id DESC
LIMIT 50;

Message IDs are time-ordered (Snowflake or ULID), so sorting by message_id gives chronological order and supports efficient range scans. Cassandra fits this workload well: high write throughput, a time-series access pattern, and natural partitioning by conversation_id.
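The two identifiers used by this schema can be sketched as follows. `conversation_id` follows the min/max rule from the schema comments. The ID generator is a Snowflake-style layout (timestamp, worker, sequence), not Twitter's exact implementation; the epoch, the bit widths, and the class name are assumptions for illustration.

```python
import threading
import time
from typing import Callable

EPOCH_MS = 1_600_000_000_000   # ASSUMPTION: arbitrary custom epoch in milliseconds
WORKER_BITS = 10               # ASSUMPTION: up to 1024 ID-generating workers
SEQ_BITS = 12                  # ASSUMPTION: up to 4096 IDs per worker per millisecond


def conversation_id(user_a: str, user_b: str) -> str:
    """Same ID regardless of who sends: smaller user ID first."""
    lo, hi = sorted((user_a, user_b))
    return f"{lo}_{hi}"


class SnowflakeLikeGenerator:
    """Produces strictly increasing 64-bit-style IDs that sort by creation time."""

    def __init__(self, worker_id: int,
                 clock_ms: Callable[[], int] = lambda: int(time.time() * 1000)) -> None:
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock_ms = clock_ms
        self._lock = threading.Lock()
        self._last_ms = -1
        self._seq = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                now = self._last_ms          # clock stalled or went backwards
            if now == self._last_ms:
                self._seq = (self._seq + 1) & ((1 << SEQ_BITS) - 1)
                if self._seq == 0:
                    now = self._last_ms + 1  # sequence exhausted: borrow the next ms
            else:
                self._seq = 0
            self._last_ms = now
            return (((now - EPOCH_MS) << (WORKER_BITS + SEQ_BITS))
                    | (self._worker_id << SEQ_BITS)
                    | self._seq)
```

Because the timestamp occupies the high bits, `ORDER BY message_id DESC` returns the newest messages first without a separate sort on created_at.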

Online Presence

# Redis hash: user_id -> {server_id, last_heartbeat}
HSET presence:user123 server_id "chat-server-5" last_heartbeat 1716000000
EXPIRE presence:user123 60

# Heartbeat every 30s (client pings server); each heartbeat repeats the HSET + EXPIRE
# TTL: 60s (key auto-expires if no heartbeat arrives = offline)

# On disconnect: remove the key from Redis
DEL presence:user123

# To find Bob's server:
HGET presence:bob server_id  # returns "chat-server-5" or nil (offline)

# Last seen: store timestamp on disconnect
SET last_seen:bob {timestamp} EX 2592000  # 30 day TTL
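The presence semantics above (refresh on heartbeat, expire after 60s of silence, record last seen) can be modeled in memory to make the behavior concrete. This sketch deliberately does not call Redis; `PresenceRegistry` and its methods are hypothetical names, and the dicts stand in for the presence hash and the last_seen key.

```python
import time
from typing import Callable, Dict, Optional, Tuple

PRESENCE_TTL_S = 60   # from the text: expire if no heartbeat within 60s


class PresenceRegistry:
    """In-memory model of the presence hash with a per-user TTL."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}   # user -> (server_id, expires_at)
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, user_id: str, server_id: str) -> None:
        """Register or refresh a connection; pushes the expiry forward."""
        self._entries[user_id] = (server_id, self._clock() + PRESENCE_TTL_S)

    def disconnect(self, user_id: str) -> None:
        """Clean disconnect: drop the entry and record last seen as now."""
        if self._entries.pop(user_id, None) is not None:
            self._last_seen[user_id] = self._clock()

    def server_for(self, user_id: str) -> Optional[str]:
        """Chat server holding the user's connection, or None if offline."""
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        server_id, expires_at = entry
        if expires_at <= self._clock():
            # TTL elapsed: treat as offline; last seen is the final heartbeat time.
            del self._entries[user_id]
            self._last_seen[user_id] = expires_at - PRESENCE_TTL_S
            return None
        return server_id

    def last_seen(self, user_id: str) -> Optional[float]:
        return self._last_seen.get(user_id)
```

With a 30s heartbeat and a 60s TTL, a client can miss one heartbeat without being marked offline, which avoids presence flapping on brief network stalls.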

Group Messaging Fan-out

Group with 256 members: when Alice sends a message, it must be delivered to 255 other members. Fan-out strategies:

  • Fan-out on send: the message processor looks up all group members and sends to each member’s chat server. For a full group that is 255 presence lookups + 255 deliveries per message (the sender is excluded). OK for groups up to about 1,000 members.
  • Fan-out on receive: store one message copy, each client pulls on reconnect. More efficient for large groups but higher read latency.
  • Hybrid: fan-out on send for small groups, fan-out on receive for large ones. WhatsApp groups are small (256 max), so fan-out on send is used.
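Fan-out on send can be sketched as a planning step that groups recipients by chat server, so each server receives one batched forward instead of one per member. `plan_fanout` is a hypothetical helper, and the `presence` dict stands in for the Redis registry.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def plan_fanout(sender: str,
                members: Iterable[str],
                presence: Dict[str, str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split group members into per-server delivery batches and an offline list.

    Returns (server_id -> online recipients on that server, offline recipients).
    The sender is excluded, so a 256-member group yields 255 recipients.
    """
    by_server: Dict[str, List[str]] = defaultdict(list)
    offline: List[str] = []
    for user_id in members:
        if user_id == sender:
            continue
        server_id = presence.get(user_id)
        if server_id is None:
            offline.append(user_id)        # will get a push notification instead
        else:
            by_server[server_id].append(user_id)
    return dict(by_server), offline
```

Usage: `plan_fanout("alice", group_members, presence)` returns the batches to forward plus the users who need a push notification via FCM/APNs.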

Delivery Receipts

Message status state machine:
SENT (stored in Cassandra) -> DELIVERED (recipient's device received) -> READ (recipient opened chat)

# Single check: sent to server
# Double check: delivered to device
# Blue check: read by recipient

Implementation:
1. Delivered: recipient's device sends delivery ACK when message arrives
   Server updates message status to DELIVERED
2. Read: recipient's client sends read receipt when user opens the conversation
   Server updates status to READ and notifies sender via WebSocket
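The state machine above only ever moves forward, which matters because receipts can arrive duplicated or out of order. A minimal sketch, with `Status` and `apply_receipt` as hypothetical names:

```python
from enum import IntEnum


class Status(IntEnum):
    SENT = 1        # single check: stored by the server
    DELIVERED = 2   # double check: recipient's device received it
    READ = 3        # blue check: recipient opened the conversation


def apply_receipt(current: Status, receipt: Status) -> Status:
    """Advance the status, never regress.

    A late DELIVERED receipt after READ is ignored, and a READ receipt
    that arrives first implies the message was also delivered.
    """
    return max(current, receipt)
```

Because the update is idempotent and order-insensitive, the server can retry receipt writes safely without tracking which receipts it has already applied.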

Media Sharing

  • Client uploads media directly to S3 via pre-signed URL (bypass chat servers)
  • Client sends message with media URL (and thumbnail) via WebSocket
  • Recipient downloads media from CDN URL
  • End-to-end encrypted: client encrypts media before upload; key shared only with recipients
  • Expiry: media stored for 30-90 days; permanent for saved media

Interview Tips

  • Lead with WebSocket – explain why HTTP polling is insufficient
  • Explain the connection registry (Redis hash: user_id to server_id)
  • Describe Cassandra schema – partition by conversation, cluster by message_id DESC
  • Cover delivery receipts state machine (sent/delivered/read)
  • Discuss group fan-out: at WhatsApp scale (256 max members) fan-out on send works
  • Push notifications for offline users via FCM/APNs
