Question 1

How do you handle 11,000 ad clicks per second in a tracking system?

Accepted Answer

Key: decouple ingestion from processing with Kafka. The Click API accepts each click and immediately publishes it to a Kafka topic; it never waits for DB writes, fraud checks, or aggregation. The publish is an append to Kafka's log and typically completes in a few milliseconds or less when the producer batches sends asynchronously. Kafka buffers the stream. Multiple consumer groups read from the topic independently: fraud detection, real-time counter updates (Redis), raw event storage (Cassandra), and billing aggregation. Each consumer group keeps its own offsets and scales independently. The Click API is stateless and horizontally scalable: add more instances behind a load balancer. Redis handles real-time counter writes (INCR) at 100K+ ops/second. Cassandra handles raw event storage at high write throughput. The system absorbs spikes through Kafka's buffer rather than back-pressuring the ingestion path: if a consumer falls behind, its lag grows, but click acceptance is unaffected.
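The decoupling can be sketched without a broker. This is a minimal in-memory illustration, not Kafka client code: InMemoryTopic and handle_click are hypothetical names, and the append-only list with per-group offsets only stands in for a Kafka topic and its consumer groups.

```python
from collections import defaultdict


class InMemoryTopic:
    """Stand-in for a Kafka topic: an append-only log with per-group offsets."""

    def __init__(self):
        self._log = []
        self._offsets = defaultdict(int)  # consumer group -> next index to read

    def publish(self, event):
        # Appending is the only work done on the ingestion path.
        self._log.append(event)

    def poll(self, group, max_events=100):
        # Each group advances its own offset, so groups never block each other.
        start = self._offsets[group]
        batch = self._log[start:start + max_events]
        self._offsets[group] = start + len(batch)
        return batch


def handle_click(topic, ad_id, user_id, ts):
    """Click API handler: publish and return; no DB write, no fraud check."""
    topic.publish({"ad_id": ad_id, "user_id": user_id, "ts": ts})
    return 202  # accepted for asynchronous processing
```

Because the fraud, counter, storage, and billing consumers each poll under their own group name, a slow consumer only increases its own lag.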

Question 2

How do you detect and filter invalid/bot ad clicks?

Accepted Answer

Multi-layer fraud detection: (1) IP rate limiting: INCR click_ip:{ip}:{ad_id}:{minute_bucket} in Redis. If the count exceeds a threshold (e.g., 3) within that one-minute bucket, mark the click as invalid. A TTL on the key prevents unbounded growth. Because the buckets are fixed windows, a burst that straddles a bucket boundary can reach up to twice the threshold; use a sliding window if that matters. (2) User rate limiting: same pattern per user_id. (3) Missing/invalid user agent: reject clicks with no user agent, or from headless browsers and known bot user agents. (4) Click timing: human clicks have variance; perfectly periodic clicks indicate automation. (5) Conversion tracking: if clicks from a source never lead to any downstream action (page view, conversion), they may be fraudulent; use this signal for batch retrospective analysis. Invalid clicks are written to a separate Kafka topic, never counted in billable metrics, but retained for analysis and refund processing.
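Layers (1) and (4) can be sketched as follows. This is an illustration under stated assumptions: a plain dict stands in for Redis INCR plus a key TTL, and the names ClickRateLimiter and looks_periodic, along with the min_clicks and cv_threshold defaults, are hypothetical choices rather than values from the answer.

```python
import statistics
from collections import defaultdict


class ClickRateLimiter:
    """Fixed-window counter per (ip, ad_id, bucket); a dict stands in for Redis."""

    def __init__(self, threshold=3, window_seconds=60):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._counts = defaultdict(int)

    def is_valid(self, ip, ad_id, ts):
        bucket = int(ts) // self.window_seconds
        key = f"click_ip:{ip}:{ad_id}:{bucket}"
        # With Redis this would be INCR on the key, plus a TTL set on first use.
        self._counts[key] += 1
        return self._counts[key] <= self.threshold


def looks_periodic(timestamps, min_clicks=5, cv_threshold=0.05):
    """Flag click trains whose inter-click gaps are almost perfectly regular."""
    if len(timestamps) < min_clicks:
        return False  # too few clicks to judge
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        return True  # many clicks at the same instant
    # Coefficient of variation: spread of the gaps relative to their mean.
    return statistics.pstdev(gaps) / mean_gap < cv_threshold
```

With the default threshold of 3, the fourth click from the same IP on the same ad within one bucket is the first to be marked invalid.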

Question 3

How do you count ad clicks for arbitrary time ranges efficiently?

Accepted Answer

Store per-minute counters in Redis: INCR click_count:{ad_id}:{YYYYMMDD_HH_MM}, with TTL=48h. To query clicks for ad_id in time range [start, end]: enumerate all minute buckets covering the range, fetch them with a single MGET (or pipelined GETs), and sum the values, treating missing keys as 0. A 24-hour query reads 1440 keys in one round trip. Because of the 48h TTL, Redis can only answer queries within roughly the last two days. For longer historical queries (30 days): use the hourly ClickSummary table in PostgreSQL (pre-aggregated from raw events). Query: SELECT SUM(valid_click_count) FROM click_summary WHERE ad_id=X AND time_bucket BETWEEN start AND end. A 30-day query touches about 720 hourly rows per ad, which is much faster than scanning raw clicks. Redis is the real-time layer; PostgreSQL is the historical layer.
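The bucket enumeration can be sketched like this. It is a minimal example, assuming the end of the range is treated as exclusive so that 24 hours maps to exactly 1440 buckets; minute_bucket_keys and count_clicks are hypothetical helper names, and the store argument is a plain dict standing in for the values an MGET would return.

```python
from datetime import datetime, timedelta


def minute_bucket_keys(ad_id, start, end):
    """Return one counter key per minute in [start, end)."""
    cur = start.replace(second=0, microsecond=0)
    keys = []
    while cur < end:
        keys.append(f"click_count:{ad_id}:{cur.strftime('%Y%m%d_%H_%M')}")
        cur += timedelta(minutes=1)
    return keys


def count_clicks(store, ad_id, start, end):
    """Sum the per-minute counters; missing buckets count as zero."""
    keys = minute_bucket_keys(ad_id, start, end)
    # With Redis, the lookups below would be a single MGET over `keys`.
    return sum(int(store.get(key) or 0) for key in keys)
```

Minutes with no clicks have no key at all, so the sum has to tolerate missing values rather than assume every bucket exists.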

Question 4

How do you enforce advertiser budget caps in real time?

Accepted Answer

Keep a remaining-budget counter per campaign in Redis (a plain decrementing counter rather than a true token bucket, since nothing refills it at a fixed rate): key=campaign_budget:{campaign_id}, value=remaining_cents. On each billable click event, the billing consumer decrements the counter atomically: DECRBY campaign_budget:{campaign_id} cost_per_click_cents. If the result goes below 0, set a status key (campaign_status:{campaign_id}=PAUSED) and publish a budget_exhausted event. In the ad serving layer, check campaign_status before showing an ad; if it is PAUSED, skip the ad. Ad servers can cache the status locally with a short TTL (5s) to avoid a Redis lookup on every request. Slight over-spend is acceptable due to eventual consistency (a few clicks may still arrive between exhaustion and the PAUSED status propagating). For stricter enforcement, decrement the budget before serving and reject the ad if insufficient budget remains.
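The decrement-and-pause flow can be sketched in memory. This is an illustration, not Redis code: BudgetTracker and its method names are hypothetical, dicts stand in for the campaign_budget and campaign_status keys, and a list stands in for the budget_exhausted event stream.

```python
class BudgetTracker:
    """In-memory stand-in for the Redis budget counter and status key."""

    def __init__(self):
        self.remaining_cents = {}
        self.status = {}
        self.events = []  # stand-in for published budget_exhausted events

    def set_budget(self, campaign_id, cents):
        self.remaining_cents[campaign_id] = cents
        self.status[campaign_id] = "ACTIVE"

    def charge_click(self, campaign_id, cpc_cents):
        # With Redis this would be: DECRBY campaign_budget:{id} cpc_cents
        remaining = self.remaining_cents[campaign_id] - cpc_cents
        self.remaining_cents[campaign_id] = remaining
        if remaining < 0 and self.status[campaign_id] != "PAUSED":
            # Pause once and emit a single exhaustion event.
            self.status[campaign_id] = "PAUSED"
            self.events.append(("budget_exhausted", campaign_id))
        return remaining

    def can_serve(self, campaign_id):
        # The ad serving layer checks this before showing an ad.
        return self.status.get(campaign_id) == "ACTIVE"
```

The guard on the current status keeps late clicks that arrive after exhaustion from publishing duplicate events, while still recording how far the campaign over-spent.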

Question 5

Why store raw click events in Cassandra rather than MySQL?

Accepted Answer

Ad click events are write-heavy (11K writes/second), time-series data (queried by ad_id + time range), and append-only (clicks are never updated). Cassandra is optimized for exactly this pattern. Partition key = (ad_id, date) puts all clicks for an ad on the same day in the same partition, and therefore on the same replica nodes. Clustering key = click timestamp (or a time-ordered click_id) keeps rows sorted by time, which provides efficient time-range queries within a partition. Write throughput: Cassandra's LSM-tree storage turns each write into a sequential commit-log append plus an in-memory memtable update, so a node can sustain tens of thousands of writes per second without the random in-place page updates of a B-tree engine (compaction later adds background write amplification). A single MySQL primary would likely become a bottleneck at a sustained 11K writes/second and would need sharding, which is operationally expensive. Trade-offs: Cassandra does not support secondary indexes efficiently, so it cannot query by user_id across all ads without a separate index table.
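The effect of the partition and clustering keys can be modeled in memory. This is only a sketch of the access pattern, not Cassandra driver code: ClickStore is a hypothetical name, the date is assumed to be the UTC day of the click, and a sorted Python list stands in for rows ordered by the clustering key.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timezone


class ClickStore:
    """Models partition key (ad_id, date) with rows clustered by timestamp."""

    def __init__(self):
        # (ad_id, "YYYY-MM-DD") -> list of (ts, click_id), kept sorted by ts
        self._partitions = defaultdict(list)

    def append(self, ad_id, ts, click_id):
        day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
        bisect.insort(self._partitions[(ad_id, day)], (ts, click_id))

    def range_query(self, ad_id, day, start_ts, end_ts):
        """Return clicks with start_ts <= ts < end_ts from a single partition."""
        rows = self._partitions.get((ad_id, day), [])
        # Rows are sorted, so the range is a contiguous slice.
        lo = bisect.bisect_left(rows, (start_ts, ""))
        hi = bisect.bisect_left(rows, (end_ts, ""))
        return rows[lo:hi]
```

A query by ad and time range touches exactly one partition and reads one contiguous slice, whereas a query by user_id would have to scan every partition, which is why it needs a separate index table.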

Ad Click Tracker System Low-Level Design

Requirements

Data Flow Architecture

Click Ingestion

Fraud Detection

Real-Time Click Counting

Data Model (Persistent Storage)

Click Aggregation Pipeline

Budget Cap and Click Throttling

Key Design Decisions