System Design Interview: Ad Serving System (Google Ads / Meta Ads)

Ad serving sits at the core of Google's and Meta's businesses and is one of the highest-stakes systems in the world: even a 10ms increase in ad serving latency can measurably reduce revenue. Designing an ad system involves real-time bidding, targeting, ranking, click fraud detection, and billing, all under strict latency budgets.

Two Models: Direct Ads vs Real-Time Bidding

| Model | How It Works | Example |
|---|---|---|
| Direct / Reserved | Advertiser buys specific inventory at a fixed price (e.g., homepage banner for $10K/day) | Brand campaigns, sponsorships |
| Real-Time Bidding (RTB) | Ad slot auctioned in real time (<100ms) when user loads the page; highest bidder wins | Google Ads, Meta Ads, most programmatic advertising |

RTB System Architecture

User visits web page
        ↓
[Publisher ad tag fires (JavaScript)]
        ↓ (HTTP request)
[Supply-Side Platform (SSP) or Ad Exchange]
  - Packages bid request: user data (anonymized), page context, ad slot dimensions
        ↓ (fan-out in parallel to multiple DSPs)
[Demand-Side Platforms (DSPs) / Bidders]
  Each DSP evaluates the impression and responds with a bid or no-bid
  Time limit: 80-100ms for the entire round trip
        ↓
[Auction in SSP]
  - Collects all bids
  - Second-price auction: winner pays second-highest bid + $0.01
        ↓
[Winning ad returned to browser]
        ↓
[Ad rendered — user sees ad]
        ↓ (async)
[Impression tracked]
        ↓ (if user clicks)
[Click tracked, URL redirected to advertiser]
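
The auction step in the flow above can be sketched in a few lines. This is a minimal illustration, not any exchange's actual implementation: the `Bid` type, the `floor_cents` reserve price, and the rule that a lone bidder pays the floor plus the increment are assumptions added for the example. Prices are kept in integer cents to avoid float rounding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bid:
    dsp: str          # hypothetical bidder identifier
    price_cents: int  # bid amount in integer cents


def run_second_price_auction(
    bids: list[Bid], floor_cents: int = 0, increment_cents: int = 1
) -> Optional[tuple[Bid, int]]:
    """Return (winning bid, clearing price in cents), or None if nothing clears the floor."""
    eligible = sorted(
        (b for b in bids if b.price_cents >= floor_cents),
        key=lambda b: b.price_cents,
        reverse=True,
    )
    if not eligible:
        return None
    winner = eligible[0]
    # Assumption: with a single eligible bid, the floor acts as the runner-up price.
    runner_up = eligible[1].price_cents if len(eligible) > 1 else floor_cents
    # The winner never pays more than its own bid (matters when bids tie).
    price = min(winner.price_cents, runner_up + increment_cents)
    return winner, price
```

With bids of $2.50, $1.80, and $1.20, the $2.50 bidder wins and pays $1.81, the second-highest bid plus one cent.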

Internal Ad Serving (First-Party, Like Google Search Ads)

User searches "best running shoes"
        ↓
[Query Understanding Service]
  - Parse query: ["running shoes"]
  - User profile: demographics, interest graph, past searches
  - Device: mobile, iOS, US West Coast, 3PM
        ↓
[Ad Candidate Retrieval — must be fast!]
  - Inverted index lookup: ads targeting keyword "running shoes"
  - User targeting filter: geo, device, language, demographics
  - Budget filter: advertiser must have remaining daily budget
  → 500 candidate ads in < 5ms
        ↓
[Ad Ranking]
  - Ad Rank = Quality Score * Max Bid
  - Quality Score = expected CTR * ad relevance * landing page quality
  - Higher Quality Score → lower cost per click (Google incentivizes quality)
  → Top 3-4 ads selected
        ↓
[Price Calculation]
  - Generalized Second Price (GSP) auction
  - Advertiser A (rank 1) pays: AdRank(B) / QualityScore(A) + $0.01
        ↓
[Ad Rendering and Impression Tracking]
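
The ranking and pricing formulas above can be worked through in code. The sketch below is illustrative only: the `Ad` type, the `reserve` price paid by the last ranked ad, and the cap at the advertiser's own max bid are assumptions for the example, and real quality scores come from ML models rather than fixed numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    advertiser: str
    max_bid: float        # maximum cost per click, in dollars
    quality_score: float  # assumed to be > 0


def ad_rank(ad: Ad) -> float:
    """Ad Rank = Quality Score * Max Bid."""
    return ad.quality_score * ad.max_bid


def gsp_prices(
    ads: list[Ad], slots: int, reserve: float = 0.0, increment: float = 0.01
) -> list[tuple[str, float]]:
    """Return (advertiser, cost per click) for each filled slot, best slot first."""
    ranked = sorted(ads, key=ad_rank, reverse=True)
    results = []
    for i, ad in enumerate(ranked[:slots]):
        if i + 1 < len(ranked):
            # Pay just enough to keep beating the next ad's Ad Rank.
            price = ad_rank(ranked[i + 1]) / ad.quality_score + increment
        else:
            # Assumption: with nobody ranked below, the ad pays the reserve price.
            price = reserve
        # Never charge more than the advertiser's own max bid.
        results.append((ad.advertiser, round(min(price, ad.max_bid), 2)))
    return results
```

For example, take A (bid $2.00, quality 8, Ad Rank 16), B (bid $3.00, quality 4, Ad Rank 12), and C (bid $1.00, quality 5, Ad Rank 5) competing for two slots. A outranks B despite the lower bid and pays 12 / 8 + 0.01 = $1.51, while B pays 5 / 4 + 0.01 = $1.26.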

Targeting System

Targeting dimensions:
  Keyword:      user searched "running shoes"
  Demographic:  age 25-34, gender male
  Geographic:   USA, California, within 10 miles of a running store
  Device:       iOS, mobile
  Behavioral:   visited competitor site in last 30 days (retargeting)
  Contextual:   article is about fitness and running
  Lookalike:    similar to advertiser existing customers (ML-generated segment)
  Time:         weekday mornings (commuters)

Implementation:
  Targeting rules stored as boolean logic trees
  Per-user segment membership cached in Redis sorted sets
  Fast evaluation: bitset intersection for segment matching

  Example ad targeting:
    Include: keyword=running AND (geo=California OR geo=Oregon)
             AND device=mobile AND age_group IN [25-34, 35-44]
    Exclude: existing_customer=true

  At retrieval time: fetch all ads matching keyword,
    filter by targeting rules using user segment bitsets
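
The bitset evaluation described above can be sketched with plain Python integers as bitsets. The segment names, the `SEGMENT_BITS` registry, and the `TargetingRule` shape (an AND of OR-groups plus an exclusion mask) are hypothetical choices for this example; the keyword match is assumed to have happened already in the inverted index lookup.

```python
from dataclasses import dataclass

# Hypothetical registry that assigns each segment one bit position.
SEGMENT_BITS = {
    name: 1 << i
    for i, name in enumerate([
        "geo=California", "geo=Oregon", "device=mobile",
        "age_group=25-34", "age_group=35-44", "existing_customer=true",
    ])
}


def mask(*segments: str) -> int:
    """Combine segment names into a single bitmask."""
    m = 0
    for s in segments:
        m |= SEGMENT_BITS[s]
    return m


@dataclass(frozen=True)
class TargetingRule:
    any_of_groups: tuple[int, ...]  # user must hit at least one bit in every group
    exclude: int = 0                # user must hit none of these bits

    def matches(self, user_bits: int) -> bool:
        if user_bits & self.exclude:
            return False
        return all(user_bits & group for group in self.any_of_groups)


# The example rule from the text, minus the keyword condition.
running_shoes_rule = TargetingRule(
    any_of_groups=(
        mask("geo=California", "geo=Oregon"),
        mask("device=mobile"),
        mask("age_group=25-34", "age_group=35-44"),
    ),
    exclude=mask("existing_customer=true"),
)
```

Each check is a handful of bitwise ANDs, which is what makes filtering hundreds of candidate ads per request feasible inside a few milliseconds.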

Click Fraud Detection

Invalid clicks (competitors clicking ads to drain budgets, bot farms, accidental clicks) cost advertisers billions. Fraud detection combines synchronous real-time filtering with asynchronous validation:

Signals for click fraud:
  - Same IP clicking same ad repeatedly
  - Click with no subsequent page interaction (bot — loads page then leaves)
  - Device fingerprint matches known fraud patterns
  - Click rate far exceeds industry average for this ad type
  - Geographic mismatch: IP says US, browser timezone says Russia

Real-time filtering (synchronous, < 10ms):
  1. IP reputation blocklist (Redis SET lookup)
  2. Rate limiting: max 5 clicks per IP per ad per hour
  3. Click validity: must have valid HTTP Referer, user-agent

Async validation (within minutes):
  4. Session analysis: did user interact with landing page after click?
  5. Conversion funnel analysis: did any conversions follow?
  6. ML model: tree-ensemble classifier (e.g., random forest or gradient boosting) on 100+ click features

Invalid clicks → not charged to advertiser
  Detected post-billing → credit applied to account
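
The per-IP, per-ad rate limit in step 2 can be sketched as a sliding-window counter. This version keeps state in process memory purely for illustration; the `ClickRateLimiter` class is hypothetical, and a production system would more likely keep the counters in a shared store such as Redis so that every click server sees the same counts.

```python
from collections import defaultdict, deque


class ClickRateLimiter:
    """Sliding-window limiter: at most max_clicks per (ip, ad_id) per window."""

    def __init__(self, max_clicks: int = 5, window_seconds: float = 3600.0):
        self.max_clicks = max_clicks
        self.window_seconds = window_seconds
        self._clicks: dict[tuple[str, str], deque] = defaultdict(deque)

    def allow(self, ip: str, ad_id: str, now: float) -> bool:
        """Return True if the click counts as valid, False if it exceeds the limit."""
        window = self._clicks[(ip, ad_id)]
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= self.window_seconds:
            window.popleft()
        if len(window) >= self.max_clicks:
            return False  # over the limit: treat as invalid, do not bill
        window.append(now)
        return True
```

Passing `now` explicitly (for example from `time.time()`) keeps the limiter deterministic and easy to test.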

Budget Management

Advertisers set daily/monthly budgets. Budget enforcement must:
1. Never significantly overspend (legal liability, trust)
2. Spread budget evenly throughout the day (not spent in first hour)
3. Handle millions of concurrent ad campaigns

Architecture:
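  Each campaign has a remaining-budget counter in Redis, stored as an integer
    (e.g., micro-dollars, because DECRBY only accepts integers)
  Per ad served: DECRBY budget:{campaign_id} {estimated_cost_micros}
  If the counter reaches 0 or below: stop serving ads for this campaign

  Problem: the counter lags actual spend (ads already in flight, estimated vs.
    billed cost, many servers decrementing concurrently), so a campaign can
    overshoot before serving stops
  Solution: distribute budget into N time buckets (e.g., hourly)
    Allow 10% overage within the hour
    Reconcile exactly against billing system nightly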

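  Pacing algorithm (like LinkedIn):
    # Use the budget that is still unspent, so the target stays bounded late in the day
    target_spend_rate = remaining_budget / (hours_remaining * 3600)  # dollars per second
    if current_rate > target_spend_rate * 1.1:
        throttle_serving_probability = max(0.0, throttle_serving_probability - 0.1)  # serve fewer ads
    elif current_rate < target_spend_rate * 0.9:
        throttle_serving_probability = min(1.0, throttle_serving_probability + 0.1)  # serve more ads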
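
A runnable sketch of this pacing loop follows. It makes two assumptions that are worth stating: the target rate is computed from the budget still unspent divided by the seconds remaining (so the target stays bounded as the day runs out), and the serving probability is clamped to the range 0 to 1. The function names and the fixed 0.1 step are illustrative, not taken from any particular production system.

```python
def target_spend_rate(remaining_budget: float, seconds_remaining: float) -> float:
    """Dollars per second needed to spend the remaining budget evenly."""
    if seconds_remaining <= 0:
        return 0.0
    return max(remaining_budget, 0.0) / seconds_remaining


def adjust_serving_probability(
    probability: float,
    current_rate: float,
    target_rate: float,
    step: float = 0.1,
    tolerance: float = 0.1,
) -> float:
    """Nudge the serving probability toward the target spend rate."""
    if current_rate > target_rate * (1 + tolerance):
        probability -= step  # spending too fast: serve fewer ads
    elif current_rate < target_rate * (1 - tolerance):
        probability += step  # spending too slowly: serve more ads
    # Keep the result a valid probability.
    return min(1.0, max(0.0, probability))
```

A pacing service would call `adjust_serving_probability` on a fixed interval per campaign, and the ad server would then serve each eligible impression with that probability.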

Performance Requirements

| Operation | Latency Budget | Why |
|---|---|---|
| RTB bid response | < 80ms total round-trip | SSP imposes hard deadline; late bids ignored |
| Ad candidate retrieval | < 10ms | Leaves time for ranking and network |
| Ad ranking | < 20ms | Complex ML model on candidate set |
| Click tracking | < 5ms (redirect) | User notices redirect latency |
| Impression tracking | Async (fire and forget) | Never in the critical path |
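
The hard deadline on bid responses can be illustrated with a small fan-out helper. This is a sketch under stated assumptions: each bidder is modeled as a hypothetical async callable that returns a bid in cents or `None` for a no-bid, and `collect_bids` is an invented name. Real exchanges speak HTTP to remote DSPs, but the deadline handling follows the same shape.

```python
import asyncio


async def collect_bids(bidders, request, deadline_s: float = 0.08):
    """Fan out to all bidders in parallel and keep only bids that arrive in time."""
    if not bidders:
        return []
    tasks = [asyncio.create_task(bidder(request)) for bidder in bidders]
    done, pending = await asyncio.wait(tasks, timeout=deadline_s)
    for task in pending:
        task.cancel()  # late bids are ignored, as the SSP would do
    bids = []
    for task in done:
        # Skip bidders that raised an error or explicitly returned a no-bid.
        if task.exception() is None and task.result() is not None:
            bids.append(task.result())
    return bids
```

Because all bidders run concurrently, the total wait is bounded by the deadline rather than by the slowest DSP.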

Interview Discussion Points

  • What is the second-price auction and why is it used? The winner pays the second-highest bid + $0.01, not their own bid. For a single slot this is strategy-proof: bidding your true value is the best strategy, so advertisers have no reason to shade their bids. Google uses Generalized Second Price (GSP) for multi-slot auctions; GSP keeps the pay-the-next-bid rule but, unlike the single-slot case, is not strictly strategy-proof
  • How do you handle the cold-start problem for new advertisers? There is no historical CTR or quality score data. Use category averages for CTR estimation; serve a small amount of traffic in a learning mode; bootstrap the quality score from ad content analysis
  • How do you A/B test ad ranking changes safely? Shadow scoring (compute the new ranking alongside the current one and compare offline), then a small-traffic experiment (5% of traffic gets the new ranking). Measure revenue per search and CTR carefully, because small changes in ad ranking have massive revenue implications
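
The category-average fallback in the cold-start answer can be sketched as a smoothed CTR estimate. This is one common way to blend a prior with observed data, not necessarily what any ad platform uses; the `prior_strength` parameter (how many impressions of evidence the category average is worth) is an assumption chosen for illustration.

```python
def smoothed_ctr(
    clicks: int,
    impressions: int,
    category_ctr: float,
    prior_strength: float = 1000.0,
) -> float:
    """Blend an ad's observed CTR with its category average.

    With no impressions the estimate equals the category average; as
    impressions accumulate, the ad's own click history dominates.
    """
    return (clicks + category_ctr * prior_strength) / (impressions + prior_strength)
```

A brand-new ad in a category averaging 2% CTR starts at 0.02, and after 100,000 impressions with 5,000 clicks its estimate sits close to its observed 5%.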

Frequently Asked Questions

How does real-time bidding (RTB) work in digital advertising?

RTB is an auction that happens in under 100ms while a webpage loads. When a user visits a page, the publisher's ad server sends a bid request to an SSP (Supply-Side Platform) containing the ad slot dimensions, user ID (hashed), page URL, and audience segments. The SSP broadcasts the bid request to dozens of DSPs (Demand-Side Platforms). Each DSP looks up the user's audience segments in its data store, evaluates which advertiser campaigns match, calculates a bid price based on predicted click probability and campaign budget, and responds with a bid in under 50ms. The SSP runs a second-price auction: the highest bidder wins but pays the second-highest bid price plus one cent. The winning DSP's ad creative URL is returned to the browser, which then fetches and renders the ad. The entire process completes in 80-100ms alongside the page load.

How does ad budget pacing work to prevent overspending?

Budget pacing distributes ad spend evenly across a campaign's flight dates rather than spending the entire budget in the first hours. The algorithm calculates a target spend rate: daily_budget / seconds_per_day = dollars_per_second. A pacing service checks actual vs target spend every minute. When spend is ahead of pace (spending too fast), it reduces the bid multiplier or temporarily halts bidding for that campaign. When spend is behind pace (spending too slow), it increases the bid multiplier to win more auctions. The pacing state is stored in Redis with atomic increment operations to handle high write throughput across thousands of concurrent auctions. At the end of the day, any unspent budget can be rolled into the next day (catch-up mode) or forfeited. A secondary safety check runs at the impression level: the ad server verifies remaining budget before serving any impression, using optimistic locking to prevent double-spend race conditions.

How do you detect and prevent click fraud in an ad system?

Click fraud detection operates at three layers. (1) Real-time rules: at click time, check IP rate limits (for example, the same IP clicking the same ad more than 5 times in an hour is treated as invalid), user agent blocklists (known bot signatures), and click-through rate anomalies (a publisher's CTR suddenly 10x the category average). Invalid clicks are discarded immediately and advertisers are not charged. (2) Near-real-time ML scoring: a stream processing job (Kafka + Flink) aggregates click patterns and scores each click using a gradient boosting model trained on labeled data (click farms leave patterns: fixed timing intervals, identical device fingerprints, sequential IP addresses). High-score clicks are flagged for review within minutes. (3) Retroactive audit: daily batch jobs reprocess all clicks with the latest fraud signals. Advertisers are credited for clicks that are later classified as fraudulent. Google IVT (Invalid Traffic) reports give advertisers visibility into this process.

