Low Level Design: Ad Auction System

What Is an Ad Auction System?

An Ad Auction System determines which advertiser wins a given ad slot and how much they pay. The most widely used mechanism is the second-price (Vickrey) auction: the highest bidder wins but pays the second-highest bid plus one minimum increment. This design gives advertisers an incentive to bid their true value, simplifying campaign management. Google AdWords and most programmatic display exchanges use variants of this model.

The auction sits between the targeting layer (which produces a qualified candidate set) and the delivery layer (which renders and serves the winning creative). It must complete in under 5 ms so the full ad decision pipeline stays within real-time constraints.

Data Model

CREATE TABLE bids (
    bid_id          BIGINT PRIMARY KEY,
    campaign_id     BIGINT REFERENCES campaigns(campaign_id),
    ad_group_id     BIGINT,
    creative_id     BIGINT,
    base_cpm        DECIMAL(10,4),   -- max CPM in USD
    bid_modifier    DECIMAL(6,4) DEFAULT 1.0
);

CREATE TABLE auction_log (
    auction_id      UUID PRIMARY KEY,
    slot_id         BIGINT,
    user_id         BIGINT,
    timestamp       TIMESTAMP,
    winner_bid_id   BIGINT,
    clearing_price  DECIMAL(10,4),
    second_price    DECIMAL(10,4)
);

CREATE TABLE quality_scores (
    bid_id          BIGINT REFERENCES bids(bid_id),
    score           DECIMAL(5,4),   -- 0.0 to 1.0
    updated_at      TIMESTAMP,
    PRIMARY KEY (bid_id)
);

Core Algorithm: Second-Price Auction with Quality Score

Raw bid values alone produce poor outcomes: a high-CPM creative with terrible click-through rate wastes the user's attention and the publisher's inventory. The effective bid is therefore:

effective_bid = base_cpm * bid_modifier * quality_score

The auction engine executes as follows:

  1. Receive candidate set from the targeting service (typically 50-500 bids for a single slot).
  2. Compute effective bids for each candidate using the formula above. Quality scores are pre-computed offline and cached in memory.
  3. Sort descending by effective bid. A partial sort (finding the top-2 elements) runs in O(n) rather than O(n log n) since only the winner and runner-up matter.
  4. Determine clearing price: clearing_price = second_effective_bid / winner_quality_score. This converts back from effective CPM space to a raw CPM the winner actually pays.
  5. Log the auction result asynchronously to the auction_log table. The winning creative ID is returned to the ad server.

Floor prices set by the publisher are enforced before the sort: any bid below the floor is discarded. If no bid clears the floor, the slot is left unfilled or backfilled with house ads.

Failure Handling and Latency Requirements

  • Target latency: <5 ms per auction, P99 <15 ms.
  • Bid data in memory: All active bids and quality scores are held in a pre-built array per slot type, loaded at startup and refreshed on a short TTL. No database reads on the critical path.
  • Async logging: Auction logs are written to a local write-ahead buffer and flushed to the database in micro-batches. Loss of a few log records during a crash is acceptable; losing the auction decision is not.
  • Timeout budget: If the targeting service is slow, the auction uses the last-known candidate set from a request-scoped cache rather than waiting, taking a small relevance hit.

Scalability Considerations

Auction servers are stateless and horizontally scalable. Bid data is pushed from the campaign management system via an event stream (Kafka) and materialized in each node's local cache. For very high QPS (millions of auctions per second), the fleet is sharded by publisher or placement to improve CPU cache locality. Quality score recomputation runs as an offline Spark job and publishes updates through the same Kafka pipeline. Rate limiting per advertiser prevents a single large campaign from consuming disproportionate CPU during bid evaluation.

Summary

The Ad Auction System is a compute-intensive, microsecond-budget component. The second-price mechanism with quality score weighting aligns advertiser incentives with publisher revenue and user experience. Interview candidates should be prepared to derive the clearing price formula, explain why second-price auctions are incentive-compatible, and discuss how quality scores are computed and kept fresh without impacting auction latency.

{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “What auction mechanism is most commonly used in online advertising?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “The second-price (Vickrey) auction is the most widely used mechanism. The winner pays the price of the second-highest bid plus one cent. This incentivizes advertisers to bid their true value because overbidding does not lower cost and underbidding risks losing the impression without saving money.”
}
},
{
“@type”: “Question”,
“name”: “How does a Generalized Second-Price (GSP) auction differ from a Vickrey-Clarke-Groves auction?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “In a GSP auction each winner pays the bid of the next-ranked advertiser, applied slot by slot, which is not always incentive-compatible. A VCG auction charges each winner their externality on other bidders, guaranteeing truthful bidding is a dominant strategy. Most ad platforms use GSP for simplicity despite its theoretical drawbacks.”
}
},
{
“@type”: “Question”,
“name”: “How do you incorporate Quality Score or ad rank into the auction?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Effective bid is computed as the raw bid multiplied by a predicted quality score, which reflects estimated click-through rate and ad relevance. Ads are ranked by effective bid, so a lower monetary bid paired with high quality can outrank a higher monetary bid with poor quality. This aligns advertiser incentives with user experience.”
}
},
{
“@type”: “Question”,
“name”: “What are the main scalability bottlenecks in running a real-time ad auction?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “The main bottlenecks are candidate retrieval (finding eligible ads from billions of campaigns within milliseconds), ranking model inference latency, and budget pacing checks. Common mitigations include pre-filtering with inverted indexes, approximate nearest-neighbor search, model distillation for fast inference, and token-bucket rate limiters for pacing.”
}
}
]
}

See also: Meta Interview Guide 2026: Facebook, Instagram, WhatsApp Engineering

See also: Netflix Interview Guide 2026: Streaming Architecture, Recommendation Systems, and Engineering Excellence

See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering

Scroll to Top