An ad serving system delivers targeted advertisements to users in real time, managing the full lifecycle from auction to billing. This design covers the core components: ad schema, auction logic, impression and click tracking, CTR prediction, frequency capping, budget pacing, and fraud detection.
Ad Schema
Each ad record captures identity, creative, pricing, targeting, and lifecycle state:
```sql
CREATE TABLE ads (
  id            BIGINT PRIMARY KEY,
  advertiser_id BIGINT NOT NULL,
  creative_url  VARCHAR(512),
  bid_price     DECIMAL(10,4),   -- max CPM the advertiser will pay
  targeting     JSONB,           -- geo, age range, interests, keywords
  daily_budget  DECIMAL(12,2),
  status        ENUM('active','paused','exhausted')
);
```
Targeting is stored as a JSONB blob containing arrays for geo, age_range, interests, and keywords, allowing flexible predicate evaluation without schema migrations.
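A minimal sketch of how those targeting predicates could be evaluated against a user context at request time. The field names (geo, age_range, interests) follow the schema above; the specific matching rules (exact geo membership, inclusive age bounds, any-overlap on interests) are illustrative assumptions.

```python
def matches_targeting(targeting: dict, user: dict) -> bool:
    """Return True if the user context satisfies every targeting predicate."""
    if "geo" in targeting and user.get("geo") not in targeting["geo"]:
        return False
    if "age_range" in targeting:
        lo, hi = targeting["age_range"]
        if not (lo <= user.get("age", -1) <= hi):
            return False
    if "interests" in targeting:
        # Any overlap between ad interests and user interests counts as a match.
        if not set(targeting["interests"]) & set(user.get("interests", [])):
            return False
    return True

ad_targeting = {"geo": ["US", "CA"], "age_range": [18, 34], "interests": ["gaming"]}
user_ctx = {"geo": "US", "age": 25, "interests": ["gaming", "music"]}
print(matches_targeting(ad_targeting, user_ctx))  # True
```

Absent predicates simply don't constrain the match, which is what makes the JSONB representation flexible: adding a new targeting dimension is a new key, not a schema migration.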
Auction Flow
When a page load triggers an ad request, the system runs a second-price auction in under 10ms:
- Receive ad request — user context (geo, device, page category, user segment) arrives at the ad server.
- Filter eligible ads — query active ads whose targeting predicates match the user context. Status must be active and the daily budget must not be exhausted.
- Score candidates — compute effective CPM: eCPM = bid_price × predicted_CTR. This ranks ads by expected revenue, not raw bid.
- Second-price auction — the ad with the highest eCPM wins. The winner pays the second-highest bid plus $0.01, incentivizing truthful bidding.
- Serve creative — return creative_url with impression beacon and click tracking URLs embedded.
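The scoring and auction steps above can be sketched as follows. Each candidate carries a bid_price (CPM) and a predicted CTR; the field names are illustrative, and a production auction would also handle reserve prices and ties.

```python
def run_auction(candidates):
    """Rank by eCPM = bid_price * predicted_CTR; return (winner, price_paid)."""
    if not candidates:
        return None, 0.0
    ranked = sorted(candidates, key=lambda c: c["bid_price"] * c["ctr"], reverse=True)
    winner = ranked[0]
    # Second-price rule: winner pays the runner-up's bid plus $0.01.
    # With one candidate, it pays its own bid; the min() cap (an assumption
    # here) keeps the charge from exceeding the winner's own bid.
    second_bid = ranked[1]["bid_price"] if len(ranked) > 1 else winner["bid_price"]
    return winner, round(min(second_bid + 0.01, winner["bid_price"]), 2)

candidates = [
    {"id": 1, "bid_price": 8.00, "ctr": 0.020},  # eCPM 0.160
    {"id": 2, "bid_price": 5.00, "ctr": 0.025},  # eCPM 0.125
]
winner, price = run_auction(candidates)
print(winner["id"], price)  # 1 5.01
```

Note that ad 1 wins on eCPM despite ad 2's higher CTR, and pays $5.01 rather than its full $8.00 bid.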
Impression Tracking
Impression recording must not block ad delivery. When the browser loads the creative, a 1×1 pixel beacon fires to the tracking endpoint. The server writes the event asynchronously to Kafka with fields: ad_id, user_id, timestamp, price_paid, placement_id.
A stream processor (Flink or Spark Streaming) aggregates impressions per advertiser per hour, writing rollups to the billing database. Raw events are retained in object storage for audit and dispute resolution.
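The hourly rollup the stream processor performs can be illustrated in plain Python (the real job would run windowed aggregation in Flink or Spark Streaming). Event fields follow the text: ad_id, user_id, timestamp, price_paid, placement_id.

```python
from collections import defaultdict
from datetime import datetime, timezone

def hourly_rollup(events):
    """Aggregate impression counts and spend per (ad_id, UTC hour)."""
    rollup = defaultdict(lambda: {"impressions": 0, "spend": 0.0})
    for e in events:
        hour = datetime.fromtimestamp(
            e["timestamp"], tz=timezone.utc
        ).strftime("%Y-%m-%dT%H")
        key = (e["ad_id"], hour)
        rollup[key]["impressions"] += 1
        rollup[key]["spend"] += e["price_paid"]
    return dict(rollup)
```

The rollup keys map directly to billing-database rows, while the raw events stay in object storage for audit.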
Click Tracking and Deduplication
Click URLs route through a redirect service that records the click event before forwarding the user to the landing page. Deduplication uses a Redis set keyed by ad_id:user_id:date — if the click fingerprint already exists, it is discarded. This handles double-clicks and browser retries. Only deduplicated clicks are billed as CPC conversions.
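A sketch of the deduplication check, using the ad_id:user_id:date fingerprint from the text. A Python set stands in for Redis here; in production the membership test and insert would be one atomic Redis operation (e.g. SADD's return value) so concurrent retries can't both pass.

```python
seen_clicks = set()  # stand-in for the Redis set

def record_click(ad_id, user_id, date):
    """Return True if this is the first (billable) click, False if a duplicate."""
    fingerprint = f"{ad_id}:{user_id}:{date}"
    if fingerprint in seen_clicks:
        return False  # double-click or browser retry; discard
    seen_clicks.add(fingerprint)
    return True
```

Only clicks for which this returns True flow into CPC billing.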
CTR Prediction Model
Predicted click-through rate is the multiplier that turns a raw bid into a competitive eCPM score. Features include: user segment, device type, page category, ad creative ID, hour of day, and historical CTR of the ad. A logistic regression baseline trains quickly and is interpretable. A neural network (shallow MLP or wide-and-deep) captures feature interactions for higher-traffic placements. Models are retrained daily on the previous window of impression/click logs and served via a low-latency feature store.
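A toy version of the logistic regression baseline, trained with per-sample SGD on categorical features encoded as "key=value" strings. The features and training data are synthetic; production training runs over the full impression/click logs with far richer features.

```python
import math

class CTRModel:
    """Tiny logistic-regression CTR baseline trained with SGD."""

    def __init__(self):
        self.weights = {}  # feature string (e.g. "device=mobile") -> weight
        self.bias = 0.0

    def _features(self, event):
        return [f"{k}={v}" for k, v in event.items()]

    def predict(self, event):
        z = self.bias + sum(self.weights.get(f, 0.0) for f in self._features(event))
        return 1.0 / (1.0 + math.exp(-z))  # sigmoid -> probability of click

    def train(self, samples, epochs=50, lr=0.1):
        for _ in range(epochs):
            for event, clicked in samples:
                grad = self.predict(event) - clicked  # d(log-loss)/dz
                self.bias -= lr * grad
                for f in self._features(event):
                    self.weights[f] = self.weights.get(f, 0.0) - lr * grad

samples = [
    ({"device": "mobile", "hour": 20}, 1),   # clicked
    ({"device": "desktop", "hour": 9}, 0),   # not clicked
]
model = CTRModel()
model.train(samples)
```

The learned weights are directly inspectable per feature, which is the interpretability argument for keeping a linear baseline alongside any neural model.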
Frequency Capping
To avoid user fatigue, each ad enforces a maximum impression count per user per day. Implementation: a Redis counter keyed by freq:{ad_id}:{user_id}:{date} with a 24-hour TTL. At auction time, any ad whose counter meets or exceeds the cap is removed from the candidate set before scoring. The increment is atomic (INCR) and happens post-serve to avoid blocking.
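A sketch of the cap check and post-serve increment. A dict with expiry timestamps stands in for Redis; in production the read is a GET at auction time and the write an atomic INCR with a 24-hour EXPIRE.

```python
import time

counters = {}  # key -> (count, expires_at); stand-in for Redis

def under_cap(ad_id, user_id, date, cap=5, now=None):
    """Auction-time check: is this ad still eligible for this user today?"""
    now = now if now is not None else time.time()
    count, expires = counters.get(f"freq:{ad_id}:{user_id}:{date}", (0, 0))
    if now >= expires:
        count = 0  # TTL elapsed; treat counter as reset
    return count < cap

def record_impression(ad_id, user_id, date, ttl=86400, now=None):
    """Post-serve increment with a 24h TTL (ttl in seconds)."""
    now = now if now is not None else time.time()
    key = f"freq:{ad_id}:{user_id}:{date}"
    count, expires = counters.get(key, (0, now + ttl))
    if now >= expires:
        count, expires = 0, now + ttl
    counters[key] = (count + 1, expires)
```

Filtering capped ads before scoring keeps the CTR model and auction from wasting work on ads that cannot be served.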
Budget Pacing
Advertisers want spend distributed smoothly across the day, not exhausted in the first hour. The pacing algorithm computes a target spend rate: target_rate = remaining_budget / remaining_hours. A token bucket refills at this rate. Each impression deducts from the bucket. When the bucket is empty, the ad is temporarily suppressed until the next refill tick. When the full daily budget is consumed, status is set to exhausted and the ad is excluded from all auctions for the rest of the day.
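The token-bucket mechanism above can be sketched as follows; parameter names and the bucket capacity are illustrative.

```python
class PacingBucket:
    """Token bucket refilling at remaining_budget / remaining_hours."""

    def __init__(self, remaining_budget, remaining_hours, capacity):
        self.rate = remaining_budget / remaining_hours  # dollars refilled per hour
        self.capacity = capacity                        # max burst, in dollars
        self.tokens = capacity

    def refill(self, hours_elapsed):
        """Called on each refill tick; tokens never exceed capacity."""
        self.tokens = min(self.capacity, self.tokens + self.rate * hours_elapsed)

    def try_spend(self, price):
        """Deduct an impression's clearing price; False means suppress the ad."""
        if self.tokens >= price:
            self.tokens -= price
            return True
        return False

bucket = PacingBucket(remaining_budget=120.0, remaining_hours=12, capacity=1.0)
```

When try_spend returns False the ad is only suppressed until the next refill tick; the separate exhausted status handles the hard stop when the full daily budget is gone.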
Click Fraud Detection
Invalid click traffic erodes advertiser trust. Two primary signals:
- Velocity anomaly — if a single IP or user_id generates clicks on the same ad at a rate exceeding a threshold (e.g., more than 5 clicks per minute), subsequent clicks are flagged and withheld from billing.
- Bot fingerprinting — requests lacking expected browser headers, with suspicious user-agent strings, or exhibiting no mouse-movement events (for display ads with JS tracking) are scored as bot traffic and excluded from auction eligibility or post-click billing.
Flagged events are quarantined, not deleted, allowing periodic model-based retroactive review and credit issuance to affected advertisers.
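The velocity-anomaly signal can be sketched as a sliding-window counter per (IP, ad) pair; the in-memory deque stands in for a shared store, and the one-minute window and threshold of 5 follow the text.

```python
from collections import defaultdict, deque

WINDOW = 60.0  # seconds
click_log = defaultdict(deque)  # (ip, ad_id) -> recent click timestamps

def is_fraudulent_click(ip, ad_id, ts, threshold=5):
    """Return True if this click exceeds the per-minute velocity threshold."""
    q = click_log[(ip, ad_id)]
    while q and ts - q[0] > WINDOW:
        q.popleft()  # drop clicks that fell outside the sliding window
    q.append(ts)
    return len(q) > threshold
```

Clicks flagged here would be routed to the quarantine store rather than dropped, preserving the retroactive-review path described above.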
Frequently Asked Questions
Q: How do second-price auction mechanics work in an ad serving system?
A: In a second-price (Vickrey) auction, the highest bidder wins but pays the price of the second-highest bid plus one cent. Each advertiser submits a sealed bid representing their maximum willingness to pay per impression. The ad server ranks all eligible bids, selects the winner, and charges them the runner-up bid amount. This mechanism incentivizes advertisers to bid their true value because overbidding doesn’t reduce cost and underbidding risks losing the auction. Real-time bidding (RTB) systems run this process end-to-end within roughly 100 ms, including ad selection, auction clearing, and response serialization.
Q: What model is commonly used for CTR prediction in ad serving, and how is it trained?
A: Click-through rate (CTR) prediction typically uses logistic regression or gradient-boosted decision trees (e.g., XGBoost) combined with deep learning embeddings for sparse categorical features. Features include user demographics, historical click behavior, ad creative attributes, contextual signals (page content, device type), and time-of-day. Models are trained on billions of labeled impressions using log-loss as the objective. Online learning with follow-the-regularized-leader (FTRL) allows incremental updates to keep the model fresh. Calibration steps ensure predicted probabilities align with observed click rates across deciles.
Q: How is frequency capping implemented using Redis in an ad serving system?
A: Frequency capping limits how many times a single user sees a given ad within a time window (e.g., no more than 5 impressions per 24 hours). Redis is the standard choice because of its sub-millisecond latency and atomic increment operations. The typical approach uses a Redis key of the form “fc:{user_id}:{ad_id}:{window}”, incremented with INCR on each impression and set to expire after the window duration using EXPIRE. Before serving, the system checks if the counter exceeds the cap. To handle high write throughput, counters can be sharded across Redis nodes by hashing the user ID. For approximate counting at massive scale, Redis HyperLogLog or Bloom filters can reduce memory footprint.
Q: How does budget pacing work in an ad serving system at daily and hourly granularity?
A: Budget pacing ensures advertisers spend their budget smoothly rather than exhausting it in the first hour of the day. At the daily level, the system computes a target spend rate (daily budget / 24 hours) and compares actual spend to the ideal cumulative spend curve using a throttle ratio. If spend is ahead of pace, the ad server probabilistically drops eligible bids; if behind pace, it increases bid eligibility. Hourly pacing subdivides the daily budget into hourly targets with carry-over adjustments for under- or over-delivery from prior hours. A centralized budget service (backed by Redis or a distributed counter) tracks real-time spend and broadcasts throttle signals to ad selection servers every few seconds to minimize latency.
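The throttle-ratio comparison described in this answer can be sketched as follows; the exact formula and the probabilistic drop rule are illustrative assumptions consistent with the text.

```python
import random

def throttle_ratio(spent, daily_budget, hours_elapsed):
    """Actual spend vs. the ideal linear spend curve; >1.0 means ahead of pace."""
    ideal = daily_budget * hours_elapsed / 24.0
    return spent / ideal if ideal > 0 else 0.0

def should_participate(spent, daily_budget, hours_elapsed, rng=random.random):
    ratio = throttle_ratio(spent, daily_budget, hours_elapsed)
    if ratio <= 1.0:
        return True                # at or behind pace: always enter the auction
    return rng() < 1.0 / ratio     # ahead of pace: drop bids probabilistically
```

Hourly carry-over can be layered on top by recomputing the ratio against per-hour targets instead of the linear daily curve.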
Q: What does an ad targeting pipeline look like at the component level?
A: The ad targeting pipeline transforms a raw ad request into a ranked list of candidate ads through several stages. First, the retrieval stage uses inverted indexes on targeting attributes (age, location, interests, keywords) to fetch thousands of candidate ads in milliseconds. Second, the filtering stage removes ads that fail eligibility checks: budget exhaustion, frequency cap violations, brand safety rules, and policy constraints. Third, the scoring stage applies the CTR and conversion prediction models to each candidate, multiplying predicted CTR by bid price to compute expected revenue (eCPM). Fourth, the auction stage runs the second-price auction on the top-scoring candidates. Finally, the ad server returns the winning creative along with tracking pixels and billing signals. The entire pipeline must complete within 50–100 ms end-to-end.
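The five stages can be composed into one request path, sketched below. The stage functions are simplified stand-ins for the retrieval index, eligibility filters, prediction service, and auction; the creative URLs are hypothetical.

```python
def serve_request(user, ads, predict_ctr, is_eligible):
    """Retrieval/filter -> score -> auction -> serve, as one request path."""
    # Stages 1-2: retrieval plus eligibility filtering (budget, caps, policy).
    candidates = [ad for ad in ads if is_eligible(ad, user)]
    # Stage 3: score each candidate by eCPM = predicted CTR x bid price.
    scored = sorted(candidates,
                    key=lambda ad: predict_ctr(ad, user) * ad["bid_price"],
                    reverse=True)
    if not scored:
        return None  # no fill for this request
    # Stage 4: second-price auction; winner pays runner-up bid plus $0.01.
    winner = scored[0]
    price = scored[1]["bid_price"] + 0.01 if len(scored) > 1 else winner["bid_price"]
    # Stage 5: return the creative with its billing signal attached.
    return {"creative_url": winner["creative_url"], "price_paid": round(price, 2)}
```

Keeping each stage a pure function of the request makes the 50-100 ms budget tractable: retrieval and filtering shrink the candidate set before the comparatively expensive model scoring runs.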