Real-Time Analytics Platform Low-Level Design

Requirements

  • Ingest billions of events per day (page views, clicks, transactions, errors)
  • Support real-time dashboards with <10 second lag
  • Support ad-hoc historical queries (e.g., weekly active users, funnel analysis)
  • 1M events/second at peak

Lambda Architecture

Events → Kafka → Stream Layer → Real-time serving layer (Druid/Redis)
               → Batch Layer  → Historical DWH (BigQuery/Snowflake)
                               → Serving layer (materialized views)

Lambda architecture: stream layer for real-time queries (roughly the last 24 hours), batch layer for historical (all-time) queries. The Kappa architecture is the alternative: process everything through the stream layer, backed by long-retention Kafka. Simpler to operate (one code path instead of two), but complex historical aggregations must be expressed in the streaming engine.

Event Ingestion

Event(event_id UUID, event_type, user_id, session_id, properties JSON,
      client_timestamp, server_timestamp, app_version, platform)

Client SDKs batch events locally (10 events or 5 seconds, whichever first) and send to the Collector API. Collector validates, enriches (add server_timestamp, geolocation from IP, device type from user-agent), and publishes to Kafka. Return 200 immediately — no processing in the hot path. Client SDK uses exponential backoff retry on failure. Kafka topic partitioned by user_id for ordered per-user processing.
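A sketch of the client-side batching and retry behavior described above. The `transport` callable stands in for the HTTP POST to the Collector API and is hypothetical; a real SDK would also flush on a background timer rather than only when `track` is called.

```python
import random
import time

MAX_BATCH = 10          # flush at 10 events...
FLUSH_INTERVAL = 5.0    # ...or after 5 seconds, whichever comes first

class EventBuffer:
    """Client-side batcher: flushes on size or age, retries with backoff."""

    def __init__(self, transport):
        self.transport = transport      # callable: list[dict] -> bool (True on 200)
        self.events = []
        self.last_flush = time.monotonic()

    def track(self, event):
        self.events.append(event)
        if len(self.events) >= MAX_BATCH or \
           time.monotonic() - self.last_flush >= FLUSH_INTERVAL:
            self.flush()

    def flush(self, max_retries=5):
        if not self.events:
            return
        batch, self.events = self.events, []
        for attempt in range(max_retries):
            if self.transport(batch):
                break
            # exponential backoff with jitter: ~1s, ~2s, ~4s, ...
            time.sleep((2 ** attempt) + random.random())
        else:
            self.events = batch + self.events   # re-queue after exhausting retries
        self.last_flush = time.monotonic()
```

Batching 10 events per request is what gives the 10-100x reduction in Collector API calls noted in the design decisions below.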

Stream Processing (Flink / Kafka Streams)

Real-time aggregations computed on the stream:

  • Windowed counts: COUNT events per (event_type, 1-minute window). Write to Druid or ClickHouse.
  • Unique users: HyperLogLog approximation (Redis PFADD per (event_type, hour)) for distinct user counts without storing all user IDs.
  • Funnel steps: sessionization — group events by session_id within a 30-minute window. For each session, track which funnel steps were completed.
  • Anomaly detection: compare the current metric against a rolling average; alert when it deviates by more than 3 standard deviations.
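The first bullet — windowed counts — can be sketched in plain Python. A real pipeline would use Flink tumbling windows with watermarks for late events, but the bucketing logic is the same:

```python
from collections import Counter

WINDOW_SECONDS = 60

def tumbling_window_counts(events):
    """Group (epoch_seconds, event_type) pairs into 1-minute tumbling windows.

    Returns counts keyed by (window_start, event_type) — the shape of the
    rows a stream job would write to Druid or ClickHouse.
    """
    counts = Counter()
    for ts, event_type in events:
        window_start = ts - (ts % WINDOW_SECONDS)   # align to minute boundary
        counts[(window_start, event_type)] += 1
    return counts
```

Each (window_start, event_type) row becomes one pre-aggregated data point in the serving layer, so dashboards query minute-level rollups instead of raw events.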

Real-Time Serving (Druid / ClickHouse)

Druid: append-only columnar database optimized for time-series aggregation queries. Ingests from Kafka in real time, serves sub-second GROUP BY queries on billions of rows. Schema: timestamp, dimensions (event_type, platform, country), metrics (count, sum, cardinality). Query: SELECT COUNT(*) FROM events WHERE __time BETWEEN NOW()-1h AND NOW() AND event_type = 'page_view' GROUP BY platform.

ClickHouse: similar but simpler to operate. MergeTree engine, columnar storage, vectorized query execution. Excellent for ad-hoc analytics queries.

Batch Processing (Data Warehouse)

Kafka → S3 (raw event archive, Parquet format, partitioned by date) → Spark/dbt → BigQuery/Snowflake. Batch jobs run hourly. Produce: daily active users (DAU), weekly active users (WAU), retention cohorts, conversion funnels, revenue attribution. These require complex joins and multi-day windows not suited for streaming. Results stored in DWH tables, queryable by data analysts via SQL and BI tools (Looker, Superset).
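As an illustration of why DAU/WAU need distinct-count semantics (a union of users, not a sum of daily counts), a toy Python version of the batch job's output. In the DWH this would be COUNT(DISTINCT user_id) grouped by day or week over the events table:

```python
def daily_active_users(events_by_day):
    """events_by_day: {date_str: [user_id, ...]} -> {date_str: DAU}."""
    return {day: len(set(users)) for day, users in events_by_day.items()}

def weekly_active_users(events_by_day, week_days):
    """WAU = distinct users across the week's days (union, not a sum of DAUs)."""
    users = set()
    for day in week_days:
        users |= set(events_by_day.get(day, []))
    return len(users)
```

Summing DAUs over 7 days double-counts users active on multiple days; the union is what makes WAU a multi-day window computation, which is why it lives in the batch layer.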

HyperLogLog for Unique User Counts

Exact distinct counts require storing all user IDs — O(n) memory. HyperLogLog (HLL) approximates distinct counts with ~1% error using O(log log n) memory. Redis PFADD adds elements to an HLL sketch; PFCOUNT returns the approximate cardinality. PFMERGE combines sketches (for combining hourly → daily). Use HLL for: daily active users, weekly active users, unique visitors per page. When exact counts are required (billing), use a bitmap (Redis SETBIT/BITCOUNT) for known user ID ranges.
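The register mechanics can be sketched in pure Python — a toy HLL with 2^14 registers (~16 KB). Redis's production implementation is more elaborate (sparse encoding, bias correction), but the core idea is the same: each hash's leading-zero run length updates one register, and the harmonic mean of the registers estimates cardinality.

```python
import hashlib
import math

class HyperLogLog:
    """Toy HyperLogLog sketch: p=14 -> 16384 registers, ~0.8% std error."""

    def __init__(self, p=14):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant for large m

    def add(self, item):
        # 64-bit hash: top p bits pick a register, the rest set its rank
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1  # leading-zero run + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        est = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:             # small-range correction
            est = self.m * math.log(self.m / zeros)
        return int(est)

    def merge(self, other):
        """Like Redis PFMERGE: element-wise max, giving the union's sketch."""
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
```

Merging is lossless for HLL (max of registers equals the sketch of the union), which is exactly why hourly sketches can be rolled up into daily counts.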

Dashboard Real-Time Updates

Dashboards poll every 10 seconds via REST or subscribe via WebSocket to a metrics stream. The real-time serving layer (Druid/ClickHouse) handles concurrent dashboard queries efficiently due to columnar storage and pre-aggregated segments. Pre-compute common dashboard queries on a schedule (every minute) and cache results in Redis — dashboards serve from cache, reducing query load. Alert thresholds monitored by a separate alerting service that queries the stream layer every minute.
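A minimal cache-aside sketch of the precomputation described above, with a plain dict standing in for Redis and a hypothetical `compute` callable standing in for the Druid/ClickHouse query:

```python
import time

class MetricCache:
    """Cache-aside for dashboard metrics: serve from cache while fresh,
    hit the serving layer (compute) only after the TTL expires."""

    def __init__(self, compute, ttl=60.0):
        self.compute = compute          # callable: metric_key -> value
        self.ttl = ttl                  # seconds; ~= the precompute schedule
        self.store = {}                 # metric_key -> (value, cached_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        hit = self.store.get(key)
        if hit and now - hit[1] < self.ttl:
            return hit[0]               # fresh: no query reaches Druid/ClickHouse
        value = self.compute(key)       # stale or missing: recompute and cache
        self.store[key] = (value, now)
        return value
```

With a 60-second TTL and dashboards polling every 10 seconds, at most one query per metric per minute reaches the serving layer regardless of how many dashboards are open.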

Key Design Decisions

  • Kafka as the central event bus — decouples ingestion from all downstream processing
  • Lambda architecture: stream for real-time (Druid), batch for historical (DWH)
  • HyperLogLog for approximate unique counts at scale — exact counting requires too much memory
  • Columnar storage (Druid, ClickHouse, BigQuery) is essential for aggregation query performance on billions of rows
  • Client-side batching reduces ingestion API load by 10-100x vs per-event HTTP calls


