Question 1

How do you route a message to the correct gateway server holding the recipient's WebSocket connection?

Accepted Answer

With thousands of gateway servers each holding a different set of client connections, the message router needs to know which server holds the recipient's connection. Solution: maintain a presence/routing table in Redis.

When a client connects, its gateway writes SETEX conn:{user_id} 300 "gateway-server-42" and refreshes the TTL on each heartbeat, so the entry for a client that stops sending heartbeats expires on its own within 300 seconds. When a message arrives for user B, the message service runs GET conn:{user_id}, gets "gateway-server-42", and makes an internal gRPC call to that server: DeliverMessage(user_id=B, message=...). Gateway-server-42 looks up user B's WebSocket connection in its local connection map and sends the message. If GET conn:{user_id} returns nil (user offline), store the message in Cassandra and send a push notification (APNs/FCM) instead.

This architecture keeps gateway servers stateless from the routing perspective: each gateway holds only its own connections in memory, and the mapping from user to gateway lives in Redis. Adding gateway servers just adds connection capacity; existing connections do not need to be re-routed.
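The lookup-then-deliver flow described in this answer can be sketched in a few lines. This is a minimal illustration, not a production router: RoutingTable is a hypothetical in-memory stand-in for the Redis SETEX/GET calls, and the deliver and store_offline callbacks are placeholders for the gRPC DeliverMessage call and the Cassandra write plus push notification.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class RoutingTable:
    """Hypothetical in-memory stand-in for the Redis routing table."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (gateway name, absolute expiry time)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def setex(self, key: str, ttl_seconds: int, gateway: str) -> None:
        """Mirror of SETEX: store the value with a time-to-live."""
        self._entries[key] = (gateway, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[str]:
        """Mirror of GET: return None once the entry has expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        gateway, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return gateway


def route_message(
    table: RoutingTable,
    user_id: str,
    message: str,
    deliver: Callable[[str, str, str], None],
    store_offline: Callable[[str, str], None],
) -> str:
    """Deliver to the recipient's gateway, or fall back to the offline path."""
    gateway = table.get(f"conn:{user_id}")
    if gateway is None:
        # Placeholder for: write to Cassandra and send an APNs/FCM push.
        store_offline(user_id, message)
        return "offline"
    # Placeholder for: gRPC DeliverMessage(user_id=..., message=...).
    deliver(gateway, user_id, message)
    return gateway
```

Passing the clock in as a parameter is only there to make TTL expiry easy to exercise; a real deployment would rely on Redis to expire the key.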

Question 2

How does Cassandra's data model support efficient message history retrieval for chat?

Accepted Answer

Chat message retrieval has two access patterns: (1) "Load the most recent 50 messages for conversation X", which happens every time a conversation is opened, and (2) "Load the next 50 older messages", for infinite scroll backward.

Cassandra data model: PRIMARY KEY (conversation_id, message_id) WITH CLUSTERING ORDER BY (message_id DESC). Here conversation_id is the partition key and message_id is the clustering column, so all messages for a conversation live in a single partition, sorted by message_id in descending order. Query (1): SELECT * FROM messages WHERE conversation_id = X LIMIT 50 returns the 50 most recent messages from the head of one partition, with no scatter-gather across nodes. Query (2): SELECT * FROM messages WHERE conversation_id = X AND message_id < last_cursor LIMIT 50 is cursor-based pagination, where last_cursor is the smallest message_id from the previous page.

This design works because Cassandra stores the rows of a partition in clustering order, so both queries are sequential reads within one partition. The message_id must be a time-ordered identifier (a Snowflake ID stored as a bigint, or a UUIDv1 stored as a timeuuid) so that descending order is reverse chronological order. Partitions can grow large (a popular group chat with millions of messages). Cassandra supports wide partitions, but very large ones can slow reads and maintenance, so for extremely active conversations consider time-bucketing the partition key: PRIMARY KEY ((conversation_id, year_month), message_id).
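To make the two queries concrete, here is a small in-memory model of one table partitioned by conversation_id and clustered by message_id descending. It is a sketch under stated assumptions: MessageStore and fetch_page are hypothetical names, integer IDs stand in for Snowflake IDs, and no Cassandra driver is involved.

```python
from typing import Dict, List, Optional, Tuple

Message = Tuple[int, str]  # (message_id, body)


class MessageStore:
    """Hypothetical model of PRIMARY KEY (conversation_id, message_id), DESC order."""

    def __init__(self) -> None:
        # One list per partition key, kept sorted by message_id descending.
        self._partitions: Dict[str, List[Message]] = {}

    def insert(self, conversation_id: str, message_id: int, body: str) -> None:
        rows = self._partitions.setdefault(conversation_id, [])
        rows.append((message_id, body))
        rows.sort(key=lambda row: row[0], reverse=True)

    def fetch_page(
        self,
        conversation_id: str,
        before: Optional[int] = None,
        limit: int = 50,
    ) -> List[Message]:
        """Query (1) when before is None, query (2) when it is a cursor."""
        rows = self._partitions.get(conversation_id, [])
        if before is not None:
            # Equivalent of: AND message_id < last_cursor
            rows = [row for row in rows if row[0] < before]
        return rows[:limit]


store = MessageStore()
for message_id in range(1, 8):
    store.insert("conv-1", message_id, f"message {message_id}")

first_page = store.fetch_page("conv-1", limit=3)
cursor = first_page[-1][0]  # smallest message_id seen so far
second_page = store.fetch_page("conv-1", before=cursor, limit=3)
```

The client carries only the cursor between requests, so a page boundary stays stable even if new messages arrive at the head of the partition while the user scrolls back.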

Question 3

How does message ordering work across devices in a distributed chat system?

Accepted Answer

Message ordering has two components: sender-to-server ordering (messages from one sender are processed in the order they were sent) and global conversation ordering (all participants see messages in the same order).

Sender-to-server: each message includes a client_sequence_number incremented per conversation per device. The server detects gaps: if sequence 5 arrives before 4, it holds 5 until 4 arrives or a timeout expires. TCP already delivers bytes in order within a single connection, so gaps come mainly from reconnects and retries, where a message resent over a new connection can arrive after a later one (rare but possible).

Server-to-database: partitioning Kafka by conversation_id ensures all messages for a conversation go through one partition and are processed by a single consumer, preserving order. Message IDs use Snowflake (timestamp + server_id + sequence), which increase over time on a given generator, so as long as a conversation's IDs are assigned in processing order, sorting by message_id in Cassandra matches the order in which the messages were processed.

Global ordering: all clients fetch messages from Cassandra sorted by message_id. Since all messages for a conversation go through a single Kafka partition, there is one total order per conversation. Each client keeps a "last seen message_id" cursor and fetches only messages with a larger ID, so it receives the messages it missed, in order, without re-fetching ones it already has.
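The gap-detection step can be illustrated with a small reorder buffer for one (conversation, device) stream. This is a minimal sketch: SequenceBuffer and its method names are hypothetical, sequence numbers are assumed to start at 1, and the timeout itself is left to the caller, which would invoke skip_gap when its timer fires.

```python
from typing import Dict, List, Tuple

Released = List[Tuple[int, str]]  # (client_sequence_number, payload)


class SequenceBuffer:
    """Hypothetical reorder buffer for one (conversation, device) stream."""

    def __init__(self, first_expected: int = 1) -> None:
        self._next = first_expected
        self._pending: Dict[int, str] = {}

    def receive(self, seq: int, payload: str) -> Released:
        """Accept one message and return whatever is now deliverable in order."""
        if seq < self._next or seq in self._pending:
            return []  # already released or already buffered: drop the duplicate
        self._pending[seq] = payload
        return self._drain()

    def skip_gap(self) -> Released:
        """On timeout, give up on the missing numbers and release buffered messages."""
        if not self._pending:
            return []
        self._next = min(self._pending)
        return self._drain()

    def _drain(self) -> Released:
        released: Released = []
        while self._next in self._pending:
            released.append((self._next, self._pending.pop(self._next)))
            self._next += 1
        return released


buffer = SequenceBuffer()
in_order = buffer.receive(1, "a")   # released immediately
held = buffer.receive(3, "c")       # held: 2 has not arrived yet
caught_up = buffer.receive(2, "b")  # releases 2, then the buffered 3
```

Dropping any sequence number below the next expected one also gives simple deduplication for client retries, at the cost of discarding a message that arrives after its gap has timed out.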

System Design Interview: Design a Real-Time Chat Application (WhatsApp/Slack)

What Is a Real-Time Chat System?

System Requirements

Functional

Non-Functional

Connection Architecture

Message Flow

Message Storage: Cassandra

Message Ordering and Deduplication

Presence Service

Offline Message Delivery

Group Messaging

Media Storage

Interview Tips