Question 1

How do you deliver real-time score updates to 50 million concurrent users?

Accepted Answer

Use a push-based architecture: clients hold WebSocket or Server-Sent Events (SSE) connections to a gateway layer. Each gateway node subscribes to the Redis pub/sub channels (one per game) that its connected clients are watching. When a score event arrives via Kafka, a consumer updates the game state in Redis and publishes the update to that game's channel. Every gateway node subscribed to the channel then broadcasts it to its connected clients, so updates arrive in under 1 second without client polling. Clients that cannot maintain a persistent connection fall back to HTTP polling against a Redis-cached endpoint.
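The fan-out path can be sketched in a few lines. This is an in-memory illustration only: ChannelBus is a hypothetical stand-in for Redis pub/sub, GatewayNode and the "game:<id>" channel naming are assumptions, and a plain list stands in for each WebSocket/SSE connection.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ChannelBus:
    """Hypothetical in-memory stand-in for Redis pub/sub (not a Redis client)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> int:
        # Deliver to every subscriber and return how many received the message.
        handlers = self._subscribers.get(channel, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


class GatewayNode:
    """One gateway process holding many client connections."""

    def __init__(self, bus: ChannelBus) -> None:
        self._bus = bus
        # channel name -> outboxes of the clients watching that game
        self._clients: Dict[str, List[List[str]]] = defaultdict(list)

    def connect(self, game_id: str) -> List[str]:
        channel = f"game:{game_id}"  # assumed naming scheme
        if channel not in self._clients:
            # Subscribe once per game per node, not once per client.
            self._bus.subscribe(channel, lambda msg, ch=channel: self._broadcast(ch, msg))
        outbox: List[str] = []  # stands in for a WebSocket/SSE connection
        self._clients[channel].append(outbox)
        return outbox

    def _broadcast(self, channel: str, message: str) -> None:
        for outbox in self._clients[channel]:
            outbox.append(message)


bus = ChannelBus()
node_a, node_b = GatewayNode(bus), GatewayNode(bus)
alice = node_a.connect("42")
bob = node_a.connect("42")
carol = node_b.connect("42")
dave = node_b.connect("99")  # watching a different game

# One publish reaches two subscribed nodes, which fan out to three clients.
delivered = bus.publish("game:42", '{"home": 1, "away": 0}')
```

The point of the sketch is that the publisher does one publish per game event, and the number of subscribers scales with gateway nodes rather than with connected clients.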

Question 2

How do you handle the traffic spike when a goal is scored?

Accepted Answer

Five mitigations: (1) WebSocket push eliminates polling spikes for connected clients. (2) CDN edge caching with stale-while-revalidate serves HTTP polling requests from the edge, absorbing the spike before it hits origin. (3) Probabilistic early revalidation (PER), often described as probabilistic early expiration, prevents a thundering herd on cache expiry: each request refreshes the entry before its TTL elapses with a probability that rises as expiry approaches. (4) Client-side random jitter (0-2 seconds) desynchronizes polling retries. (5) Push notifications are sent via a separate fan-out service that does not block the score update path, which avoids backpressure on the ingestion pipeline.
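Mitigations (3) and (4) can be sketched as follows. The early-refresh check below follows the commonly cited XFetch formulation, which is an assumption about what the answer intends; the function names, the beta parameter, and the recompute_cost argument are illustrative.

```python
import math
import random
from typing import Optional

_RNG = random.Random()


def should_refresh_early(now: float, expiry: float, recompute_cost: float,
                         beta: float = 1.0,
                         rng: Optional[random.Random] = None) -> bool:
    """Return True if this request should refresh the cache entry before expiry.

    The random gap is exponentially distributed with mean recompute_cost * beta,
    so only a few requests refresh early and the chance rises as expiry nears.
    """
    rng = rng or _RNG
    # 1 - random() lies in (0, 1], so the log is defined and never positive.
    gap = -recompute_cost * beta * math.log(1.0 - rng.random())
    return now + gap >= expiry


def jittered_delay(base_seconds: float, max_jitter: float = 2.0,
                   rng: Optional[random.Random] = None) -> float:
    """Add 0 to max_jitter seconds so clients do not retry in lockstep."""
    rng = rng or _RNG
    return base_seconds + rng.uniform(0.0, max_jitter)


# Example: an entry that takes ~50 ms to recompute and expires at t=100 s.
early = should_refresh_early(now=99.99, expiry=100.0, recompute_cost=0.05)
next_poll_in = jittered_delay(5.0)
```

A larger beta makes early refreshes more eager, trading extra origin load for fewer requests that find the entry expired.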

Question 3

How do you ingest score updates reliably from official data providers?

Accepted Answer

Official providers (Sportradar, Stats Perform) push events via webhooks or persistent TCP feeds. The ingestion service validates and normalizes each event, then writes it to Kafka. Events carry provider-assigned sequence numbers for ordering and deduplication. A Kafka consumer processes events in order (Kafka preserves order only within a partition, so this assumes a game's events share a partition, for example by keying on game_id) and applies each one with an idempotent insert such as INSERT ... ON CONFLICT DO NOTHING on the sequence number. If the feed disconnects and reconnects, replayed duplicates are safely ignored. The Kafka topic acts as a buffer, decoupling the ingestion rate from downstream processing speed.
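A minimal sketch of the idempotent apply step, using SQLite from the Python standard library so it runs without a server. The table and column names are assumptions, and a production system would more likely target PostgreSQL, which supports the same ON CONFLICT ... DO NOTHING clause.

```python
import sqlite3

# In-memory database; the ON CONFLICT clause needs SQLite 3.24 or newer.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE game_event (
        provider    TEXT    NOT NULL,
        sequence_no INTEGER NOT NULL,
        game_id     TEXT    NOT NULL,
        event_type  TEXT    NOT NULL,
        PRIMARY KEY (provider, sequence_no)  -- dedup key (assumed)
    )
    """
)


def apply_event(db: sqlite3.Connection, event: dict) -> bool:
    """Insert the event once; return False if it was a duplicate."""
    cur = db.execute(
        """
        INSERT INTO game_event (provider, sequence_no, game_id, event_type)
        VALUES (:provider, :sequence_no, :game_id, :event_type)
        ON CONFLICT (provider, sequence_no) DO NOTHING
        """,
        event,
    )
    db.commit()
    return cur.rowcount == 1


goal = {"provider": "feed-a", "sequence_no": 1001,
        "game_id": "42", "event_type": "GOAL"}
first_time = apply_event(conn, goal)   # row inserted
replayed = apply_event(conn, goal)     # same sequence number, ignored
```

Because the duplicate is rejected by the primary key rather than by application logic, a replay after a feed reconnect cannot double-count a goal.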

Question 4

What is the data model for a live sports game?

Accepted Answer

Game table: game_id, sport, home/away team IDs, status (SCHEDULED/LIVE/FINAL), home/away scores, current period, clock. GameEvent table: event_id, game_id, event_type (GOAL/TOUCHDOWN/FOUL/SUBSTITUTION), team_id, player_id, clock, period, description, created_at. Events are immutable and append-only - they form the source of truth from which scores and stats are derived. This event-sourced approach enables replaying game history, computing in-game stats, and powering highlight feeds.
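To illustrate how scores are derived from the event log, here is a small sketch. The field subset, the EventType members shown, and the one-point-per-goal rule (soccer) are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Tuple


class EventType(Enum):
    GOAL = "GOAL"
    FOUL = "FOUL"
    SUBSTITUTION = "SUBSTITUTION"


@dataclass(frozen=True)  # frozen: events are immutable once recorded
class GameEvent:
    event_id: int
    game_id: str
    event_type: EventType
    team_id: str
    period: int
    clock: str


def derive_score(events: Iterable[GameEvent],
                 home_team_id: str, away_team_id: str) -> Tuple[int, int]:
    """Replay the log and count goals per side (assumes 1 point per goal)."""
    home = away = 0
    for event in events:
        if event.event_type is not EventType.GOAL:
            continue
        if event.team_id == home_team_id:
            home += 1
        elif event.team_id == away_team_id:
            away += 1
    return home, away


log = [
    GameEvent(1, "42", EventType.GOAL, "home-fc", 1, "12:31"),
    GameEvent(2, "42", EventType.FOUL, "away-utd", 1, "20:05"),
    GameEvent(3, "42", EventType.GOAL, "away-utd", 2, "51:40"),
    GameEvent(4, "42", EventType.GOAL, "home-fc", 2, "88:02"),
]
score = derive_score(log, "home-fc", "away-utd")  # (2, 1)
```

The scores stored on the Game table can then be treated as a cached projection of this log, which can be rebuilt by replaying the events.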

Question 5

How do you scale standings and leaderboard queries during peak traffic?

Accepted Answer

Pre-compute and cache standings tables and top-N leaderboards in Redis with a 60-second TTL. During peak events, these endpoints are read millions of times per minute, so computing them from the database on each request is not feasible. A background job recomputes standings after each game completes and writes to both the database and the cache. For real-time leaderboards during a game (e.g., top scorers), maintain Redis sorted sets updated on each GameEvent; a top-k read costs O(log n + k), where n is the number of members in the set and k is the number of entries returned.
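The sorted-set pattern can be sketched without a Redis server. TopScorers below is a hypothetical in-memory stand-in whose two methods mirror the Redis commands ZINCRBY and ZREVRANGE ... WITHSCORES; unlike Redis, this stand-in scans all members on read, so it illustrates the access pattern rather than the O(log n + k) cost.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Tuple


class TopScorers:
    """In-memory stand-in for one Redis sorted set (illustrative only)."""

    def __init__(self) -> None:
        self._scores: Dict[str, float] = defaultdict(float)

    def incr(self, player_id: str, amount: float = 1.0) -> float:
        # Mirrors ZINCRBY: add to the member's score and return the new score.
        self._scores[player_id] += amount
        return self._scores[player_id]

    def top(self, k: int) -> List[Tuple[str, float]]:
        # Mirrors ZREVRANGE 0 k-1 WITHSCORES: highest score first,
        # ties broken by member name in descending order.
        return heapq.nlargest(
            k, self._scores.items(), key=lambda item: (item[1], item[0])
        )


board = TopScorers()
for scorer in ["p10", "p7", "p10", "p9"]:  # one entry per GOAL event
    board.incr(scorer)

leaders = board.top(2)  # [("p10", 2.0), ("p9", 1.0)]
```

In a real deployment the write would happen in the same consumer that applies each GameEvent, so the leaderboard never needs a batch recompute during the game.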

System Design: Live Sports Score Platform — Real-Time Updates, Fan Scale, and Stats (2025)

Requirements and Scale

Data Model

Ingestion Pipeline – Official Data Feeds

Fan-Side Read Architecture

Handling Score Spikes – The Goal Effect

Historical Stats and Standings