Question 1

Why use Server-Sent Events instead of WebSocket for a live comment feed?

Accepted Answer

SSE is a one-way (server → client) protocol over plain HTTP. It uses the browser's built-in EventSource API, which reconnects automatically and sends the Last-Event-ID header so the server can resume where the stream left off. SSE passes through standard HTTP load balancers and proxies with little or no special configuration, provided response buffering is disabled for the stream. WebSocket is bidirectional: use it when clients need to send data back on the same connection (live typing, collaborative editing). For a read-only comment stream where users submit comments via regular POST requests, SSE is simpler, typically carries less per-connection overhead, and has automatic reconnect built in.
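To make the "plain HTTP" point concrete, here is a minimal sketch of the text frame an SSE endpoint writes for each comment. The helper name format_sse and the "comment" event name are illustrative assumptions, not part of any framework; only the id:, event:, and data: field names and the blank-line terminator come from the SSE wire format.

```python
from typing import Optional


def format_sse(data: str, event_id: Optional[int] = None,
               event: Optional[str] = None) -> str:
    """Serialize one message in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        # The browser remembers this value and sends it back as Last-Event-ID.
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A payload containing newlines must be split into one data: line each.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


frame = format_sse('{"user": "ana", "text": "hello"}', event_id=101, event="comment")
```

The response carrying these frames would be served with Content-Type: text/event-stream and kept open; the browser side needs nothing beyond an EventSource pointed at that URL.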

Question 2

How does Redis Pub/Sub eliminate the N×M polling problem?

Accepted Answer

Without Redis: each of M app servers polls the database every second for new comments on each of N active content items. That is N×M queries/second, a database-killing load for popular live events. With Redis Pub/Sub: when a new comment is posted, the server that handles the POST publishes one message to a Redis channel (comments:{content_id}). Each app server subscribes only to the channels for content that its connected SSE clients are watching. Redis delivers the message to every subscribed server in one operation. The database load becomes one write per new comment rather than N×M reads per second.
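The channel-per-content pattern can be sketched without a running Redis instance. The InMemoryBroker class below is a hypothetical stand-in for Redis Pub/Sub, written only to show the routing logic; a real deployment would use a Redis client's publish and subscribe calls instead.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryBroker:
    """Hypothetical stand-in for Redis Pub/Sub: channel name -> subscriber callbacks."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subs[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to every subscriber of the channel; return how many received it."""
        callbacks = self._subs.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = InMemoryBroker()
server_a, server_b = [], []  # messages each app server would push to its SSE clients

# Servers A and B both have viewers on content 42; only B has viewers on content 7.
broker.subscribe("comments:42", server_a.append)
broker.subscribe("comments:42", server_b.append)
broker.subscribe("comments:7", server_b.append)

# One publish per new comment, with no polling by any server.
delivered = broker.publish("comments:42", "nice goal!")
```

After the publish, both server_a and server_b hold the comment, and a channel nobody watches costs nothing: publishing to it simply reaches zero subscribers.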

Question 3

How does the Last-Event-ID header enable comment gap recovery?

Accepted Answer

Every SSE message includes an id field (id: {comment_id}). The browser stores the last received id in memory. On reconnection (network drop, server restart), the browser automatically sends the Last-Event-ID header with this stored id. The server queries: SELECT * FROM Comment WHERE content_id=X AND id > last_event_id ORDER BY id to replay any comments posted during the disconnection gap. This assumes comment ids increase monotonically, so that "id greater than the last one seen" means "posted later". Without this mechanism, users who experience a brief network interruption would silently miss comments that arrived while they were disconnected.
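The replay query can be tried end to end with SQLite. This is a sketch under assumptions: the Comment table has only the columns needed here, and the helper names make_demo_db, parse_last_event_id, and replay_missed are invented for illustration.

```python
import sqlite3
from typing import List, Optional, Tuple


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory Comment table with a few rows for two content items."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Comment ("
        "id INTEGER PRIMARY KEY, content_id INTEGER NOT NULL, body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO Comment (id, content_id, body) VALUES (?, ?, ?)",
        [(1, 42, "first"), (2, 42, "second"), (3, 7, "other stream"), (4, 42, "third")],
    )
    conn.commit()
    return conn


def parse_last_event_id(header_value: Optional[str]) -> Optional[int]:
    """Return the id as an int, or None when the header is missing or malformed."""
    try:
        return int(header_value)
    except (TypeError, ValueError):
        return None


def replay_missed(conn: sqlite3.Connection, content_id: int,
                  last_event_id: Optional[int]) -> List[Tuple[int, str]]:
    """Fetch comments newer than the client's last seen id, oldest first."""
    if last_event_id is None:
        return []  # first connection: nothing to replay
    cur = conn.execute(
        "SELECT id, body FROM Comment WHERE content_id = ? AND id > ? ORDER BY id",
        (content_id, last_event_id),
    )
    return cur.fetchall()


conn = make_demo_db()
missed = replay_missed(conn, 42, parse_last_event_id("2"))
```

A client that last saw comment 2 on content 42 gets back only comment 4; comment 3 belongs to a different stream and is filtered out by content_id.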

Question 4

How do you limit live comment load for extremely popular events?

Accepted Answer

At 100,000 concurrent viewers, a new comment every second generates 100,000 SSE writes/second across all connections, and 10 comments/second generates 1,000,000. Strategies: (1) Throttle: batch comments and flush every 500ms instead of per comment. Each connection then receives at most 2 writes/second, which caps the total at 200,000 writes/second however fast comments arrive (a 5x saving at 10 comments/second). (2) Sample: at very high rates, only show a subset of comments ("showing every 10th comment during peak traffic"). (3) Separate hot event infrastructure: route high-traffic events to a dedicated real-time service (Pusher, Ably) that is designed for this workload. (4) Use WebSocket multiplexing: one upstream connection per edge server, shared by the thousands of clients that server handles.
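The effect of throttling on write volume can be estimated with a few lines of arithmetic. The function below is a back-of-the-envelope model, not a measurement: it assumes every viewer receives every comment and that a batched flush counts as one write per connection.

```python
from typing import Optional


def sse_writes_per_second(viewers: int, comments_per_sec: float,
                          flush_interval_s: Optional[float] = None) -> float:
    """Estimate total SSE writes/second across all connections.

    Without batching, each comment is one write per viewer.
    With batching, each viewer gets at most one write per flush interval.
    """
    per_connection = comments_per_sec
    if flush_interval_s is not None:
        per_connection = min(comments_per_sec, 1.0 / flush_interval_s)
    return viewers * per_connection


unbatched = sse_writes_per_second(100_000, 10)        # 1,000,000 writes/second
batched = sse_writes_per_second(100_000, 10, 0.5)     # capped at 200,000 writes/second
quiet = sse_writes_per_second(100_000, 1, 0.5)        # 100,000: batching saves nothing
```

The last case shows why throttling only pays off once the comment rate exceeds the flush rate; below that, each flush carries a single comment anyway.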

Question 5

How do you prevent spam in live comment streams?

Accepted Answer

Rate limit comment submission per user: max 3 comments per 10 seconds, using a Redis counter (INCR) with an expiry set on the key. Require authentication for live comment submission, because anonymous posting is the main spam vector. Run comments through a fast server-side keyword filter before inserting and before fan-out: exact-match block lists take microseconds. For high-profile events, implement a short moderation queue (5-second hold before display) so moderators can remove spam before it reaches viewers. Display a "sent" confirmation to the poster immediately but delay the actual fan-out.
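The counter-with-expiry limiter can be sketched in memory. The FixedWindowLimiter class is a hypothetical stand-in that mimics the "increment a counter, let it expire after the window" behavior; with Redis, the counter and its expiry would live on a per-user key instead of in a Python dict. The clock is passed in explicitly so the behavior is easy to follow.

```python
from typing import Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` actions per user in each `window_s`-second window."""

    def __init__(self, limit: int = 3, window_s: float = 10.0) -> None:
        self.limit = limit
        self.window_s = window_s
        # user_id -> (count in current window, time the window expires)
        self._counters: Dict[str, Tuple[int, float]] = {}

    def allow(self, user_id: str, now: float) -> bool:
        count, expires_at = self._counters.get(user_id, (0, 0.0))
        if now >= expires_at:
            # Window has expired (or never existed): start a fresh one.
            count, expires_at = 0, now + self.window_s
        count += 1
        self._counters[user_id] = (count, expires_at)
        return count <= self.limit


limiter = FixedWindowLimiter(limit=3, window_s=10.0)
results = [limiter.allow("user-1", now=t) for t in (0.0, 1.0, 2.0, 3.0)]
after_window = limiter.allow("user-1", now=10.0)
```

The first three submissions pass, the fourth within the same 10-second window is rejected, and the user is allowed again once the window expires. A fixed window is the simplest variant; it permits short bursts across a window boundary, which is usually acceptable for comment spam.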

Live Comments System Low-Level Design

Connection Architecture Options

Core Data Model

SSE Endpoint Implementation

Scaling with Redis Pub/Sub

Fan-Out at Scale

Client Reconnection and Missed Comments

Key Interview Points