An ad click tracking system records every click on an advertisement, deduplicates fraudulent clicks, aggregates counts for billing, and provides real-time analytics to advertisers. At Google or Facebook scale, this system handles billions of click events per day and must be precise (incorrect click counts affect billing) and fraud-resistant.
Click Event Flow
When a user clicks an ad: (1) The browser makes a request to the tracking server (not the advertiser directly) via a redirect URL: track.example.com/click?ad_id=123&user_id=456&placement_id=789. (2) The tracking server records the click event and immediately redirects the user to the advertiser’s landing page (a 302 redirect is preferred over 301, since browsers cache 301 responses and would skip the tracking server on repeat clicks). (3) The click event is enqueued to Kafka. (4) Downstream consumers process the click: deduplication, fraud detection, billing aggregation, real-time analytics. The redirect approach ensures the click is captured even if the advertiser’s server is slow: the user reaches the landing page immediately while tracking happens asynchronously in steps 3 and 4.
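The endpoint logic above can be sketched as follows. This is a minimal illustration, not a production handler: the Kafka producer is simulated with an in-memory list, and the names enqueue_to_kafka, LANDING_PAGES, and handle_click are hypothetical.

```python
import json
import time
from urllib.parse import urlparse, parse_qs

KAFKA_TOPIC = []  # stand-in for the Kafka "clicks" topic (illustrative)

# Hypothetical ad_id -> landing-page mapping; real systems look this up
# from the ad-serving metadata store.
LANDING_PAGES = {"123": "https://advertiser.example.com/landing"}

def enqueue_to_kafka(event):
    """Simulated producer: in production this is a non-blocking Kafka send."""
    KAFKA_TOPIC.append(json.dumps(event))

def handle_click(url):
    """Record the click event, then return a 302 redirect to the landing page."""
    params = parse_qs(urlparse(url).query)
    event = {
        "ad_id": params["ad_id"][0],
        "user_id": params["user_id"][0],
        "placement_id": params["placement_id"][0],
        "ts": time.time(),
    }
    enqueue_to_kafka(event)  # steps 3-4 happen asynchronously downstream
    return 302, {"Location": LANDING_PAGES[event["ad_id"]]}

status, headers = handle_click(
    "https://track.example.com/click?ad_id=123&user_id=456&placement_id=789")
```

The handler does the minimum synchronous work (parse, enqueue, redirect); everything expensive is deferred to consumers.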
Click Deduplication
Fraudulent clicks (click farms, repeated user clicks) must be deduplicated before billing. A legitimate click is one click per user per ad per time window. Deduplication key: (user_id, ad_id, time_window). Implementation: when a click event arrives, compute the deduplication key and issue an atomic set-if-absent with a TTL in Redis: SET dedup:{user_id}:{ad_id}:{hour} 1 NX EX 3600 (note that SETNX alone cannot attach an expiry; the NX and EX options of SET do both atomically). If SET returns nil (the key already exists), this click is a duplicate — mark as invalid. If SET returns OK, this is the first click in this window — valid. The 1-hour window prevents the same user from clicking the same ad 100 times in an hour. More sophisticated fraud detection: IP rate limiting, bot behavior patterns (machine-speed clicks, missing browser fingerprint), network analysis (many users from the same IP).
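A sketch of the dedup check, modeling the Redis SET ... NX EX semantics with an in-memory dict so it runs standalone. With a real client (redis-py) the equivalent call is r.set(key, 1, nx=True, ex=3600); the class and function names here are illustrative.

```python
import time

class DedupStore:
    """In-memory stand-in for Redis keys with TTLs (key -> expiry time)."""

    def __init__(self):
        self._keys = {}

    def set_nx_ex(self, key, ttl, now=None):
        """Return True if the key was newly set (first click), False if it
        already exists and has not expired (duplicate)."""
        now = now if now is not None else time.time()
        expiry = self._keys.get(key)
        if expiry is not None and expiry > now:
            return False
        self._keys[key] = now + ttl
        return True

def is_valid_click(store, user_id, ad_id, ts):
    hour = int(ts // 3600)  # bucket the timestamp into an hourly window
    key = f"dedup:{user_id}:{ad_id}:{hour}"
    return store.set_nx_ex(key, 3600, now=ts)

store = DedupStore()
first = is_valid_click(store, 456, 123, ts=1000.0)   # first click in the hour
repeat = is_valid_click(store, 456, 123, ts=1500.0)  # same user/ad/hour
```

Note the two expiry mechanisms overlap deliberately: the hour bucket in the key bounds the window, and the TTL keeps Redis from accumulating stale keys.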
Aggregation for Billing
Advertisers are billed per valid click. Aggregation requirements: count of valid clicks per (ad_id, date) for daily billing. Two approaches: (1) Stream processing: Kafka → Flink/Spark Streaming → aggregate valid clicks per ad per minute → write to a time-series store; reconcile at end-of-day for billing. Provides near-real-time click counts for advertiser dashboards. (2) Batch processing: store all click events in object storage (S3 as Parquet); run daily aggregation jobs (Spark, BigQuery) for billing. Simpler, cheaper, but only provides next-day billing data. Production: combine both — stream processing for near-real-time dashboards, batch for authoritative billing (stream may have minor inaccuracies that batch corrects).
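The two roll-ups described above can be shown with plain counters. In production this logic runs inside Flink or a Spark job; the event shape and variable names here are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime, timezone

# Sample valid/invalid click events (illustrative).
events = [
    {"ad_id": 123, "ts": 1735689600, "valid": True},   # 2025-01-01 00:00 UTC
    {"ad_id": 123, "ts": 1735689630, "valid": True},
    {"ad_id": 123, "ts": 1735689700, "valid": False},  # flagged as fraudulent
    {"ad_id": 456, "ts": 1735776000, "valid": True},   # 2025-01-02 00:00 UTC
]

per_minute = Counter()  # (ad_id, "YYYY-MM-DDTHH:MM") -> count, for dashboards
per_day = Counter()     # (ad_id, "YYYY-MM-DD") -> count, for billing

for e in events:
    if not e["valid"]:
        continue  # only valid (deduplicated, non-fraud) clicks are billable
    dt = datetime.fromtimestamp(e["ts"], tz=timezone.utc)
    per_minute[(e["ad_id"], dt.strftime("%Y-%m-%dT%H:%M"))] += 1
    per_day[(e["ad_id"], dt.strftime("%Y-%m-%d"))] += 1
```

The per-minute counter is what a streaming job would emit for dashboards; the per-day counter is what the batch job recomputes from raw events as the authoritative billing figure.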
Click Attribution
Attribution determines which ad “caused” a conversion (purchase, signup). Last-click attribution: the last ad the user clicked before converting gets credit. Multi-touch attribution: credit is distributed across all ads in the user’s click path (linear, time-decay, or data-driven models). Implementation: store a click_path per user (last N clicks with timestamps) in Redis or a user profile store. When a conversion event occurs (with user_id), look up the user’s click path and apply the attribution model to assign credit. Attribution windows (7 days, 30 days): only count clicks within the attribution window as attributable. Clicks older than the window don’t receive credit for the conversion.
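Last-click and linear attribution with a window can be sketched as below. The click path is a list of (ad_id, ts) pairs; in production it would be read from Redis keyed by user_id. Function names and the path shape are illustrative.

```python
SEVEN_DAYS = 7 * 24 * 3600  # attribution window in seconds

def attributable(click_path, conversion_ts, window=SEVEN_DAYS):
    """Clicks that occurred within the window before the conversion."""
    return [(ad, ts) for ad, ts in click_path
            if 0 <= conversion_ts - ts <= window]

def last_click(click_path, conversion_ts):
    """All credit to the most recent in-window click."""
    clicks = attributable(click_path, conversion_ts)
    if not clicks:
        return {}
    last_ad = max(clicks, key=lambda c: c[1])[0]
    return {last_ad: 1.0}

def linear(click_path, conversion_ts):
    """Credit split evenly across all in-window clicks."""
    clicks = attributable(click_path, conversion_ts)
    if not clicks:
        return {}
    share = 1.0 / len(clicks)
    credit = {}
    for ad, _ in clicks:
        credit[ad] = credit.get(ad, 0.0) + share
    return credit

path = [(111, 100_000), (222, 500_000), (333, 550_000)]
conv_ts = 600_000  # all three clicks fall inside the 7-day window
```

Time-decay and data-driven models slot into the same shape: only the credit-splitting function changes, while the window filter stays the same.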
Privacy and GDPR
Click tracking involves user tracking — a sensitive area under GDPR, CCPA, and iOS App Tracking Transparency (ATT). Compliance requirements: obtain user consent before tracking (consent banner, ATT prompt on iOS); support data deletion requests (remove user’s click history on erasure request); minimize data retention (delete detailed click records after the billing reconciliation period — typically 90 days, not forever); anonymize aggregated data before long-term retention. Cookieless tracking alternatives (privacy sandbox, aggregated measurement APIs) are increasingly required as third-party cookies are blocked by browsers. Design tracking identifiers that don’t rely on cross-site cookies for future-proofing.
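Erasure and retention can be sketched as two passes over the event store. The in-memory list, the 90-day constant, and the function names are illustrative assumptions; real systems delete from the event store, caches, and downstream copies.

```python
import time

RETENTION_SECONDS = 90 * 24 * 3600  # assumed billing reconciliation period

def handle_erasure_request(events, user_id):
    """Remove all click records for a user on a right-to-erasure request."""
    return [e for e in events if e["user_id"] != user_id]

def purge_expired(events, now=None):
    """Drop detailed click records older than the retention period."""
    now = now if now is not None else time.time()
    return [e for e in events if now - e["ts"] <= RETENTION_SECONDS]

DAY = 24 * 3600
events = [
    {"user_id": 456, "ad_id": 123, "ts": 0},
    {"user_id": 789, "ad_id": 123, "ts": 0},
    {"user_id": 789, "ad_id": 321, "ts": 100 * DAY},
]
events = handle_erasure_request(events, user_id=456)
events = purge_expired(events, now=100 * DAY)  # the ts=0 record is past retention
```

Aggregated per-ad counts survive these deletions because they contain no user identifier, which is why aggregation before long-term retention satisfies both billing and minimization.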