What is a Content Delivery Network (CDN) and how does it work?

A CDN is a geographically distributed network of proxy servers and data centers that deliver web content to users from the location closest to them. When a user requests content, the CDN routes the request to the nearest edge server, which either serves cached content or fetches it from the origin server, reducing latency and load on the origin.

How do you design a CDN to handle cache invalidation at scale?

Cache invalidation at scale typically combines TTL-based expiry, event-driven purge APIs, and versioned URLs. Some CDNs, such as Fastly (surrogate keys) and Cloudflare and Akamai (cache tags), support tag-based invalidation, so operators can purge every cached object that shares a tag with a single API call instead of waiting for TTLs to expire. Others, such as Amazon CloudFront, invalidate by path or wildcard pattern.
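To make tag-based purging concrete, here is a minimal in-memory sketch. The class name TaggedCache and its methods are hypothetical and do not correspond to any CDN's API; a real edge cache would also have to propagate the purge to every PoP.

```python
from collections import defaultdict


class TaggedCache:
    """Toy in-memory cache that can purge by tag (surrogate key)."""

    def __init__(self):
        self._objects = {}                    # cache key -> response body
        self._keys_by_tag = defaultdict(set)  # tag -> cache keys carrying it

    def put(self, cache_key, body, tags=()):
        self._objects[cache_key] = body
        for tag in tags:
            self._keys_by_tag[tag].add(cache_key)

    def get(self, cache_key):
        return self._objects.get(cache_key)

    def purge_tag(self, tag):
        """Evict every object labelled with `tag`; return how many were removed."""
        keys = self._keys_by_tag.pop(tag, set())
        removed = 0
        for key in keys:
            if self._objects.pop(key, None) is not None:
                removed += 1
        return removed
```

The reverse index from tag to cache keys is what lets one purge call reach many objects without scanning the whole cache.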

What are the key components of a CDN system design?

Core components include: edge PoPs (Points of Presence) with local caches, a BGP anycast or DNS-based routing layer that directs users to the nearest PoP, an origin shield or mid-tier cache to protect the origin from cache-miss storms, a control plane for configuration and purge propagation, and observability infrastructure for real-time traffic metrics.
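As a rough illustration of the routing layer, the sketch below picks the geographically closest PoP that is still accepting traffic. The PoP names and coordinates are made up for the example, and real GeoDNS systems usually weigh measured latency and load as well as distance.

```python
import math

# Hypothetical PoP registry: (name, latitude, longitude, status).
POPS = [
    ("fra", 50.11, 8.68, "active"),
    ("iad", 39.04, -77.49, "active"),
    ("sin", 1.35, 103.82, "draining"),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_pop(lat, lon, pops=POPS):
    """Return the name of the closest active PoP, or None if none is active."""
    candidates = [p for p in pops if p[3] == "active"]
    if not candidates:
        return None
    return min(candidates, key=lambda p: haversine_km(lat, lon, p[1], p[2]))[0]
```

A client near Singapore is sent elsewhere here because the local PoP is draining, which is the same status filter a health-aware routing layer would apply.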

How does a CDN handle dynamic vs. static content differently?

Static content (images, JS, CSS, videos) is highly cacheable and served directly from edge nodes with long TTLs. Dynamic content requires either bypassing the cache entirely and proxying to origin, or using edge-side rendering and short TTLs with stale-while-revalidate patterns. Some CDNs support edge compute (e.g., Lambda@Edge, Cloudflare Workers) to personalize or generate dynamic responses at the edge without hitting the origin.
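The difference between long-TTL static assets and short-TTL dynamic responses can be sketched with a few example Cache-Control values and a freshness check. The policy values and the function name freshness_state are illustrative assumptions, not recommendations from any particular CDN.

```python
# Illustrative Cache-Control values; the numbers are assumptions for the example.
CACHE_POLICIES = {
    "static": "public, max-age=31536000, immutable",
    "dynamic": "public, max-age=5, stale-while-revalidate=30",
    "personalized": "private, no-store",
}


def freshness_state(age_s, max_age_s, swr_s=0):
    """Classify a cached response by its age in seconds."""
    if age_s < max_age_s:
        return "fresh"             # serve from cache, no origin contact
    if age_s < max_age_s + swr_s:
        return "stale-revalidate"  # serve stale now, refresh in the background
    return "expired"               # revalidate or refetch before serving
```

With the "dynamic" policy above, a response that is 10 seconds old would still be served immediately while the edge refreshes it from the origin in the background.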

Low Level Design: Content Delivery Network (CDN)


What Is a Content Delivery Network?

A Content Delivery Network (CDN) is a geographically distributed system of proxy servers and data centers that delivers web content to users based on their physical location. The primary goal is to reduce latency, improve load times, and offload origin servers by serving cached copies of static and dynamic assets from edge nodes closest to the requesting client.

CDNs are used by virtually every high-traffic website to serve images, JavaScript, CSS, video, and API responses. Providers such as Cloudflare, Akamai, and Amazon CloudFront operate global CDN infrastructure with hundreds of points of presence (PoPs).

Data Model and Schema

A CDN requires several core data structures to track cached objects, routing rules, and health state.

-- Origin configuration
CREATE TABLE origins (
  id          BIGINT PRIMARY KEY,
  domain      VARCHAR(255) NOT NULL,
  origin_url  VARCHAR(512) NOT NULL,
  ttl_default INT DEFAULT 3600,  -- seconds
  created_at  TIMESTAMP
);

-- Edge node registry
CREATE TABLE edge_nodes (
  id          BIGINT PRIMARY KEY,
  region      VARCHAR(64),
  ip_address  VARCHAR(45),  -- long enough for a textual IPv6 address
  capacity_gb INT,
  status      VARCHAR(16)  -- 'active', 'draining', 'offline'
);

-- Cache object metadata
CREATE TABLE cache_entries (
  cache_key   VARCHAR(512) NOT NULL,
  edge_node   BIGINT NOT NULL REFERENCES edge_nodes(id),
  origin_id   BIGINT REFERENCES origins(id),
  etag        VARCHAR(128),
  expires_at  TIMESTAMP,
  size_bytes  BIGINT,
  hit_count   BIGINT DEFAULT 0,
  -- The same object can be cached on many edge nodes, so the key is per node.
  PRIMARY KEY (edge_node, cache_key)
);

In practice, fast-changing state such as edge node health and cache entry metadata usually lives in an in-memory key-value store (for example Redis or Memcached) or on the edge node itself, while persistent configuration is kept in a relational database at the control plane.

Core Algorithm and Workflow

When a client makes a request, the CDN follows this flow:

DNS Resolution: GeoDNS answers with the address of a nearby PoP, chosen by IP geolocation or latency measurements. With anycast, all PoPs announce the same IP address and BGP routing carries the request to the topologically closest one.
Cache Lookup: The edge node computes a cache key (typically method + host + path + query string + any headers the response varies on) and checks its local cache. On a hit, the cached response is returned immediately, usually tagged with a diagnostic header such as X-Cache: HIT.
Origin Fetch (cache miss): On a miss, the edge node forwards the request to the origin, stores the response in its local cache for the lifetime given by Cache-Control or the configured default TTL, and returns it to the client.
Cache Revalidation: Once an entry expires, the next request triggers a conditional GET (If-None-Match / If-Modified-Since) to the origin. A 304 Not Modified response refreshes the entry's TTL without retransmitting the body.
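The cache lookup step depends on how the key is built. Below is one possible sketch of key construction; the ignored-parameter list, the default vary header, and the exact signature are assumptions made for illustration.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical set of query parameters that never change the response.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid"}


def build_cache_key(method, url, headers, vary=("accept-encoding",)):
    """Build a normalized cache key from the request line and selected headers."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Drop tracking parameters and sort the rest so ordering does not split the cache.
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    # Only headers named in `vary` take part in the key; all others are ignored.
    lowered = {k.lower(): v.strip() for k, v in headers.items()}
    varied = [f"{name}={lowered.get(name, '')}" for name in vary]
    raw = "|".join([method.upper(), host, parts.path or "/", urlencode(query), *varied])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Two requests that differ only in parameter order or tracking parameters map to the same key, while a different Accept-Encoding value produces a separate entry.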

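def handle_request(req):
  # build_cache_key, cache, respond, and the fetch helpers are assumed to exist.
  key = build_cache_key(req)
  entry = cache.get(key)
  # Fresh hit: serve straight from the edge cache.
  if entry and not entry.expired():
    return respond(entry, headers={"X-Cache": "HIT"})
  if entry:
    # Expired entry: revalidate with a conditional GET (If-None-Match).
    origin_resp = fetch_with_revalidation(req, entry.etag)
    if origin_resp.status == 304:
      entry.extend_ttl()
      return respond(entry, headers={"X-Cache": "REVALIDATED"})
    # Any other status carries a full response; reuse it instead of fetching again.
  else:
    # No entry at all: plain origin fetch.
    origin_resp = fetch_from_origin(req)
  cache.set(key, origin_resp, ttl=compute_ttl(origin_resp))
  return respond(origin_resp, headers={"X-Cache": "MISS"})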
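The flow above leaves compute_ttl undefined. One way it might work is sketched here, taking the raw Cache-Control header value rather than a response object (an assumption made to keep the example self-contained) and falling back to a default that mirrors ttl_default in the origins table.

```python
def compute_ttl(cache_control, default_ttl=3600):
    """Derive a shared-cache TTL in seconds from a Cache-Control header value."""
    directives = {}
    for part in (cache_control or "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')
    # no-store and private responses must not be kept by a shared cache.
    if "no-store" in directives or "private" in directives:
        return 0
    # no-cache allows storage but demands revalidation on every use,
    # which this sketch models as a zero TTL.
    if "no-cache" in directives:
        return 0
    # s-maxage targets shared caches and takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name, "")
        if value.isdigit():
            return int(value)
    return default_ttl
```

A production implementation would also account for the Age and Expires headers and for heuristic freshness, which are omitted here.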

Failure Handling and Performance

Origin failover: If the origin is unreachable, the CDN can serve stale content using stale-if-error directives, or route to a secondary origin.
Circuit breakers: Edge nodes track origin error rates. When errors exceed a threshold, the circuit opens and stale or fallback content is served without hammering the origin.
Request coalescing: Multiple simultaneous cache misses for the same key collapse into a single upstream request (also called request collapsing), which prevents a thundering herd on the origin.
Health checks: The control plane continuously probes edge nodes and origin endpoints, removing unhealthy nodes from DNS rotation automatically.
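Of the mechanisms above, request coalescing is the easiest to show in a few lines. The following thread-based sketch uses hypothetical names (SingleFlight, do); real edge servers typically do this inside an event loop, but the idea is the same: the first caller for a key becomes the leader, and later callers wait for its result.

```python
import threading


class _Call:
    """State shared between the leader of a fetch and its followers."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    """Collapse concurrent calls for the same key into one upstream fetch."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> _Call for the fetch currently running

    def do(self, key, fetch):
        with self._lock:
            call = self._in_flight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._in_flight[key] = call
        if leader:
            try:
                call.result = fetch()
            except Exception as exc:
                call.error = exc
            finally:
                # Remove the entry first so later requests start a new fetch.
                with self._lock:
                    del self._in_flight[key]
                call.done.set()
        else:
            call.done.wait()  # follower: block until the leader finishes
        if call.error is not None:
            raise call.error
        return call.result
```

If the leader's fetch fails, every follower sees the same error, so a failing origin receives one request per key rather than one per client.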

Scalability Considerations

CDN scalability is achieved at multiple layers:

Horizontal edge scaling: Adding more PoPs in new regions reduces geographic latency without changes to the core system.
Tiered caching: A two-tier architecture uses regional parent caches between edge nodes and origin, dramatically reducing origin traffic for long-tail content.
Cache hit ratio optimization: Normalizing query parameters, stripping non-essential headers from cache keys, and tuning TTLs based on content type all improve the cache hit ratio.
Streaming and chunked delivery: Large files are split into range-request chunks so partial delivery can begin before the full object is cached, improving time-to-first-byte.
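For the chunked delivery point, the arithmetic of splitting an object into fixed-size ranges can be sketched as follows. The function names are hypothetical, and the chunk size would be a tuning parameter of the CDN.

```python
def chunk_ranges(total_size, chunk_size):
    """Yield inclusive (start, end) byte offsets covering total_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1


def range_header(start, end):
    """Format an HTTP Range request header value for one chunk."""
    return f"bytes={start}-{end}"


def chunks_for_request(first_byte, last_byte, chunk_size):
    """Indices of the fixed-size chunks needed to satisfy a client byte range."""
    return list(range(first_byte // chunk_size, last_byte // chunk_size + 1))
```

Because each chunk is cached under its own key, the edge can start answering a client as soon as the first chunk arrives and fetch the rest on demand.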

Summary

A CDN is a distributed caching layer that uses GeoDNS or anycast routing, edge node caches, and conditional revalidation to serve content faster and more reliably than a single origin could. Key design decisions include cache key construction, TTL strategy, request coalescing, and tiered topology. At interview time, focus on the cache miss path, failure modes (stale-if-error, circuit breakers), and how cache invalidation propagates across a global fleet.