Requirements and Scale
Functional: real-time score updates delivered to millions of simultaneous readers. Read-to-write ratio: 100,000:1. The core challenge is fan-side read amplification: one score update triggers millions of simultaneous reads.
Data Model
Game: game_id, sport, home_team_id, away_team_id, status (SCHEDULED, LIVE, FINAL), home_score, away_score, current_period (quarter/half/period), clock_seconds, venue, start_time, end_time. GameEvent: event_id, game_id, event_type (GOAL, TOUCHDOWN, FOUL, SUBSTITUTION, YELLOW_CARD, TIMEOUT, PERIOD_END), team_id, player_id, clock_seconds, period, description, created_at. Team: team_id, sport, name, city, logo_url, conference, division. Player: player_id, team_id, name, number, position, stats (JSONB: sport-specific stats). Standing: team_id, season, wins, losses, ties, points, rank (updated after each game completion).
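A minimal sketch of the entities above as Python dataclasses (field names follow the model; types and defaults are assumptions, and Player/Team/Standing follow the same pattern):

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional

class GameStatus(str, Enum):
    SCHEDULED = "SCHEDULED"
    LIVE = "LIVE"
    FINAL = "FINAL"

@dataclass
class Game:
    game_id: str
    sport: str
    home_team_id: str
    away_team_id: str
    status: GameStatus
    home_score: int = 0
    away_score: int = 0
    current_period: int = 1          # quarter/half/period, sport-dependent
    clock_seconds: int = 0
    venue: str = ""
    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None

@dataclass
class GameEvent:
    event_id: str                    # provider-assigned, used for idempotency
    game_id: str
    event_type: str                  # GOAL, TOUCHDOWN, FOUL, ...
    team_id: str
    player_id: str
    clock_seconds: int
    period: int
    description: str
    created_at: Optional[datetime] = None
```

GameEvent rows are immutable and append-only; scores and stats are derived from them.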
Ingestion Pipeline – Official Data Feeds
Official sports data providers (Sportradar, Stats Perform, ESPN) push events via webhooks or persistent TCP feeds. Ingestion service: validates and normalizes incoming events, writes to Kafka topic (game-events), publishes to game-specific Redis channels. Kafka consumer: updates Game record (score, clock), appends GameEvent record, invalidates cached game state. Event ordering: events include a sequence_number from the provider; consumer deduplicates and orders by sequence before applying. Idempotency: event_id is provider-assigned; ON CONFLICT DO NOTHING on insert prevents duplicates from feed reconnects.
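The dedup-and-reorder step of the consumer can be sketched as follows (an in-memory stand-in: the `applied` list plays the role of the database insert with ON CONFLICT DO NOTHING; class and field names are illustrative):

```python
class EventApplier:
    """Applies provider events in sequence order, skipping duplicates.

    Events may arrive out of order or twice (feed reconnects); out-of-order
    events are buffered until the gap fills, duplicates are dropped by event_id.
    """

    def __init__(self):
        self.next_seq = 1
        self.pending = {}          # sequence_number -> event, buffered out of order
        self.seen_event_ids = set()
        self.applied = []          # stand-in for the GameEvent insert

    def on_event(self, event: dict):
        if event["event_id"] in self.seen_event_ids:
            return                 # duplicate from a feed reconnect: ignore
        self.seen_event_ids.add(event["event_id"])
        self.pending[event["sequence_number"]] = event
        # Drain the contiguous run starting at next_seq, applying in order
        while self.next_seq in self.pending:
            self.applied.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
```

A gap (e.g. sequence 3 arriving before 2) simply holds later events in `pending` until the missing one shows up.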
Fan-Side Read Architecture
# The challenge: 50M fans polling for the same game state
# Solution: push (WebSocket/SSE) + aggressive caching
# Game state cached in Redis (updated by Kafka consumer on each event)
# Key: game:{game_id}:state
# Value: JSON {home_score, away_score, clock, period, last_event}
# TTL: none for live games, 1h after final
# WebSocket gateway (stateful, per-game subscriptions)
from collections import defaultdict

class ScoreHub:
    def __init__(self, redis_pubsub, send):
        self.redis_pubsub = redis_pubsub
        self.send = send                       # callable(conn_id, event)
        self.connections = defaultdict(set)    # game_id -> connection ids

    def subscribe(self, connection_id: str, game_id: str):
        self.redis_pubsub.subscribe(f"game:{game_id}:updates")
        self.connections[game_id].add(connection_id)

    def on_game_event(self, game_id: str, event: dict):
        # Broadcast to all subscribed WebSocket connections
        # Gateway holds 10K-100K connections per node
        for conn_id in self.connections[game_id]:
            self.send(conn_id, event)
# Scale: horizontal WebSocket gateway nodes, each holding subset of connections
# Sticky sessions (connection to same node via consistent hash on connection_id)
# Node-to-node event fanout via Redis pub/sub
# HTTP fallback (polling) for clients that cannot maintain WebSocket:
# GET /games/{game_id}/state -> served from Redis cache (no DB hit)
# Cache-Control: no-cache (must revalidate) but ETag-based 304s
# Client polls every 5s -> Redis hit in < 1ms; 50M clients / 5s = 10M req/s
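The ETag-based 304 path above can be sketched like this (a minimal handler, assuming `game_state` is the JSON value read from the Redis key; function name is illustrative):

```python
import hashlib
import json
from typing import Optional, Tuple

def handle_poll(game_state: dict, if_none_match: Optional[str]) -> Tuple[int, str, Optional[str]]:
    """HTTP polling fallback: serve cached game state with an ETag so that
    an unchanged state costs only a 304 with no body on the wire."""
    body = json.dumps(game_state, sort_keys=True)
    etag = hashlib.sha1(body.encode()).hexdigest()
    if if_none_match == etag:
        return 304, etag, None       # client already has this state
    return 200, etag, body
```

Since most 5-second polls land between score changes, the vast majority of responses are bodyless 304s.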
Handling Score Spikes – The Goal Effect
When a goal is scored, all 50M fans simultaneously react: app foregrounds, rapid polling, push notification click-through. Traffic spike: 50x normal within 5 seconds. Mitigation: (1) Push first: WebSocket/SSE connected clients receive the update instantly – no polling spike for them. (2) CDN edge caching: game state endpoint cached at CDN with stale-while-revalidate; CDN serves the spike while origin refreshes. (3) Thundering herd protection: Redis cache with probabilistic early expiration (PER algorithm) refreshes cache before it expires to prevent simultaneous cache misses. (4) Client-side jitter: if polling, clients add random 0-2s jitter to avoid synchronized requests. (5) Push notifications: sent via APNs/FCM through a separate notification service to avoid coupling to the score update path; fan-out queue for 50M device tokens processed in parallel by notification workers.
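The probabilistic early expiration check (mitigation 3) can be sketched as the XFetch form of PER: each request independently volunteers to refresh with a probability that rises as expiry approaches, so misses never synchronize. Parameter names here are illustrative; `recompute_cost` is the expected time to rebuild the cache entry.

```python
import math
import random
import time
from typing import Optional

def should_refresh_early(expiry_ts: float, recompute_cost: float,
                         beta: float = 1.0, now: Optional[float] = None) -> bool:
    """Return True if this request should recompute the cache entry now.

    Refresh when now - recompute_cost * beta * log(rand) >= expiry: since
    log(rand) < 0, the left side exceeds `now` by a random margin, and the
    chance of an early refresh grows as expiry nears. beta > 1 refreshes
    more eagerly; beta < 1 less so.
    """
    now = time.time() if now is None else now
    # `or 1e-12` guards against random() returning exactly 0.0 (log(0) error)
    return now - recompute_cost * beta * math.log(random.random() or 1e-12) >= expiry_ts
```

The first poller for which this returns True rebuilds the Redis entry; everyone else keeps serving the still-valid value, so no stampede forms at TTL expiry.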
Historical Stats and Standings
Live stats computed from GameEvent stream in real time (Flink/Spark Streaming): player totals per game (goals, assists, yards, tackles). Aggregated season stats updated after each game completion (batch job). Standings recomputed after each game: UPDATE standings SET wins=wins+1… within a transaction on game status -> FINAL transition. Historical query performance: partition game_events by game_id and date. Compound index on (player_id, season) for player season stats lookup. Pre-compute and cache top-N leaderboards (top scorers, standings table) with 60-second TTL – these are read millions of times per minute during peak.
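The in-game top-scorer leaderboard can be sketched as follows (an in-memory stand-in for a Redis sorted set, where `on_goal` corresponds to ZINCRBY and `top` to ZREVRANGE; a real sorted set serves `top` in O(log n + k)):

```python
import heapq
from typing import Dict, List, Tuple

class Leaderboard:
    """Top-N scorers, updated on each GameEvent and read from cache."""

    def __init__(self):
        self.scores: Dict[str, int] = {}    # player_id -> goals

    def on_goal(self, player_id: str):
        self.scores[player_id] = self.scores.get(player_id, 0) + 1

    def top(self, n: int) -> List[Tuple[str, int]]:
        return heapq.nlargest(n, self.scores.items(), key=lambda kv: kv[1])
```

With the 60-second cached snapshot described above, millions of leaderboard reads per minute collapse into one recompute per TTL window.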