System Design Interview: Global Content Delivery and Edge Computing

Q: What is the difference between a CDN and edge computing?

A traditional CDN primarily caches and serves static content (images, CSS, JavaScript, videos) from geographically distributed servers (PoPs) to reduce latency and origin server load. The CDN logic is simple: check if the requested content is cached; if yes, serve it; if no, fetch from origin, cache it, and serve it. CDN behavior is configured declaratively, mainly via HTTP headers (Cache-Control, Vary), and cannot execute custom business logic. Edge computing extends this model by allowing your own code to run at the CDN's locations. Cloudflare Workers, Lambda@Edge, and Fastly Compute@Edge execute JavaScript, WASM, or Python code (language support varies by platform) close to users, typically within 5-20ms. Coverage differs: Workers run in every Cloudflare PoP, while Lambda@Edge runs at CloudFront's regional edge caches rather than at every edge location. This enables: request routing and rewriting before hitting origin, authentication/authorization at the edge (validate JWT without origin round-trip), A/B testing (split traffic at the CDN level), personalization (serve different content based on cookie/geo), and API responses served entirely from the edge without touching origin servers. The performance difference: a dynamic API response that required a 200ms round-trip to a us-east-1 origin can be served in about 10ms from a PoP 50km from the user, a 20× latency reduction.
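
To make "validate JWT without origin round-trip" concrete, here is a minimal sketch of HS256 signature and expiry checking. It uses Node's built-in crypto module so it can be run locally; on an edge platform the same check would be written against that platform's crypto API. The function names (signHs256, verifyHs256) and the shared-secret setup are assumptions for illustration, not part of any platform's SDK.

```javascript
const crypto = require("node:crypto");

// Decode one base64url-encoded JWT segment into a Buffer.
function b64urlDecode(segment) {
  return Buffer.from(segment, "base64url");
}

// Build a signed HS256 token. Included only so the sketch can produce its own input.
function signHs256(payload, secret) {
  const header = Buffer.from(JSON.stringify({ alg: "HS256", typ: "JWT" })).toString("base64url");
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const sig = crypto.createHmac("sha256", secret).update(`${header}.${body}`).digest("base64url");
  return `${header}.${body}.${sig}`;
}

// Returns the decoded payload if the token is well-formed, correctly signed,
// and unexpired; otherwise returns null.
function verifyHs256(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, body, sig] = parts;

  let parsedHeader;
  try {
    parsedHeader = JSON.parse(b64urlDecode(header).toString("utf8"));
  } catch {
    return null;
  }
  // Reject "none" and any algorithm this verifier does not implement.
  if (!parsedHeader || parsedHeader.alg !== "HS256") return null;

  // Recompute the signature and compare in constant time.
  const expected = crypto.createHmac("sha256", secret).update(`${header}.${body}`).digest();
  const actual = b64urlDecode(sig);
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) return null;

  let payload;
  try {
    payload = JSON.parse(b64urlDecode(body).toString("utf8"));
  } catch {
    return null;
  }
  if (!payload || typeof payload !== "object") return null;
  // exp is in seconds since the epoch; a token at or past exp is rejected.
  if (typeof payload.exp === "number" && payload.exp <= nowSeconds) return null;
  return payload;
}
```

An edge function would call verifyHs256 on the bearer token and return 401 on null, forwarding to origin only for valid tokens. Production setups more often use asymmetric signatures (for example RS256) so the edge holds only a public key; that variant is not shown here.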

Q: How does cache invalidation work at CDN scale?

Cache invalidation at CDN scale is challenging because popular content may be cached across hundreds of PoPs globally. When content changes, all cached copies must be updated or invalidated. Approaches in order of reliability:

(1) Content hash versioning (cache-busting): embed a content hash in the filename (app.a3f8b2c1.css). The CDN caches with a 1-year TTL. When content changes, the filename changes, so the old URL is no longer referenced and its cached copies simply age out via TTL. New deployments generate new filenames. Zero invalidation API calls needed, and no stale-asset risk as long as the HTML that references the asset is itself fresh (short TTL or purged on deploy). Best approach for static assets.
(2) TTL expiration: set Cache-Control: max-age=N. Content goes stale after N seconds and is re-fetched. Simple but imprecise: content may be stale for up to N seconds. Use short TTLs (60-300s) for semi-dynamic content (product prices, inventory status).
(3) API-based purge: call the CDN's purge API to invalidate specific URLs immediately. Cloudflare reports propagating purges to all 300+ PoPs in under 150ms. Used for urgent content updates (breaking news, pricing errors). Scale limitation: purging millions of URLs one by one is slow, so use surrogate keys/cache tags instead.
(4) Surrogate key tags: tag each cached response with logical identifiers (product:12345, category:electronics). A single API call purges all responses tagged with a key. Fastly and Cloudflare support this. Used to invalidate all pages showing a specific product when its inventory changes.
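
Content hash versioning can be sketched in a few lines. The example below derives a name like app.a3f8b2c1.css from file contents and picks a Cache-Control value based on whether a name carries a hash. The helper names, the 8-character hash length, and the 300-second fallback TTL are assumptions chosen for illustration; real build tools have their own conventions.

```javascript
const crypto = require("node:crypto");
const path = require("node:path");

// Derive a versioned filename from the file's content, e.g. app.css -> app.<hash>.css.
// Only the base name is returned; any directory part of `filename` is dropped.
function hashedFilename(filename, content, hashLength = 8) {
  const digest = crypto.createHash("sha256").update(content).digest("hex").slice(0, hashLength);
  const ext = path.extname(filename);
  const base = path.basename(filename, ext);
  return `${base}.${digest}${ext}`;
}

// Hashed assets never change under the same URL, so they can be cached for a year.
// Everything else gets a short TTL (300s here is an arbitrary illustrative value).
function cacheControlFor(filename) {
  const isHashed = /\.[0-9a-f]{8}\.[a-z0-9]+$/.test(filename);
  return isHashed ? "public, max-age=31536000, immutable" : "public, max-age=300";
}
```

Because the name is a pure function of the content, an unchanged file keeps its URL (and its cache entries) across deployments, while any edit produces a new URL that no cache has seen.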

Q: How does anycast protect against DDoS attacks?

Anycast is a network routing technique where the same IP address is announced from multiple data centers simultaneously. When an attacker sends a DDoS flood to the target IP, BGP routing delivers the traffic to the nearest announcing node from the attacker's perspective. Since a global CDN like Cloudflare announces the same IPs from 300+ PoPs worldwide, a DDoS attack originating from, say, Eastern Europe is automatically routed to Cloudflare's Frankfurt and Warsaw PoPs without reaching US or Asian PoPs at all. A globally distributed botnet is split the same way, with each PoP seeing only the share of traffic that originates near it. This geographic distribution means a 1 Tbps volumetric DDoS attack is spread across the PoPs nearest the attack's sources. If those PoPs have 5 Tbps of capacity combined, the attack is absorbed. The origin servers behind the CDN never see the attack traffic. Additionally, each PoP applies filtering: rate limiting per source IP, signature-based filtering for known attack patterns (SYN floods, UDP reflection/amplification), and IP reputation blocklists. The key advantage of anycast over unicast: you cannot overwhelm a single PoP by targeting its IP, because the routing adapts. If one PoP becomes saturated, the operator can withdraw that PoP's BGP announcement, and traffic reroutes to adjacent PoPs.
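
The capacity reasoning above is simple arithmetic, sketched here with made-up numbers. The PoP names, capacities, and traffic shares are hypothetical; the point is only that per-PoP load is the attack size times the share BGP happens to route there, which can exceed one PoP's capacity even when the network as a whole has headroom.

```javascript
// Average load per PoP if attack traffic were spread perfectly evenly.
function evenSplitGbps(attackGbps, popCount) {
  return attackGbps / popCount;
}

// Load per PoP given the fraction of attack traffic routed to each one.
function attackLoad(attackGbps, pops) {
  return pops.map(({ name, capacityGbps, share }) => {
    const loadGbps = attackGbps * share;
    return { name, loadGbps, saturated: loadGbps > capacityGbps };
  });
}

// Hypothetical regional attack: all 1 Tbps lands on three nearby PoPs.
const regional = attackLoad(1000, [
  { name: "FRA", capacityGbps: 400, share: 0.5 },
  { name: "WAW", capacityGbps: 400, share: 0.25 },
  { name: "VIE", capacityGbps: 400, share: 0.25 },
]);

console.log(evenSplitGbps(1000, 300).toFixed(2)); // 3.33
console.log(regional.filter((p) => p.saturated).map((p) => p.name)); // [ 'FRA' ]
```

In the regional case one PoP is over capacity, which is the situation where withdrawing its announcement and shifting load to neighbors becomes relevant.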

Beyond Basic CDN: Edge Computing

A CDN (Content Delivery Network) has historically served static assets — images, CSS, JS — from geographically distributed servers (PoPs). Edge computing extends this by executing logic at the edge: authentication, A/B testing, request routing, personalization, and API responses — without a round-trip to the origin server. This reduces latency from 200ms+ (origin in another region) to 5-20ms (edge PoP in the user’s city). Cloudflare Workers, Lambda@Edge, and Fastly Compute@Edge are the major platforms.

CDN Architecture

Two-Tier Cache Hierarchy

Modern CDNs use a two-tier hierarchy to maximize cache efficiency:

Tier 1 (edge PoPs): 200-300 geographically distributed PoPs, each with 1-10TB of SSD cache. A miss at the edge PoP does not immediately hit the origin.
Tier 2 (origin shield / mid-tier): 10-20 regional PoPs with much larger cache (100TB+ HDD). A Tier 1 miss fetches from the nearest Tier 2 PoP. Only a Tier 2 miss hits the origin.

For popular content (CDN hit rate 95%+), the origin receives very few requests. The Tier 2 layer is the “origin shield”: it absorbs the long-tail misses and collapses concurrent requests for the same object into a single origin fetch, preventing thundering herds when a popular object expires simultaneously from many Tier 1 PoPs.
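
The thundering-herd protection described above is often called request coalescing. A minimal in-memory sketch: while one fetch for a key is in flight, later callers share its promise instead of issuing their own. The createCoalescer name and the fetchFromOrigin callback are assumptions for illustration; a real shield would also store the result in its cache.

```javascript
// Collapse concurrent fetches for the same key into a single origin call.
function createCoalescer(fetchFromOrigin) {
  const inFlight = new Map(); // key -> Promise for the pending origin response

  return function get(key) {
    // If a fetch for this key is already running, share its result.
    if (inFlight.has(key)) return inFlight.get(key);

    const pending = Promise.resolve()
      .then(() => fetchFromOrigin(key))
      // Clear the entry on success or failure so the next miss triggers a fresh fetch.
      .finally(() => inFlight.delete(key));

    inFlight.set(key, pending);
    return pending;
  };
}
```

With this in place, a thousand simultaneous misses for the same URL produce one origin request; the other 999 callers wait on the same promise.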

Cache Invalidation Strategies

After a deployment or content update, stale cached content must be invalidated or updated:

TTL-based expiration: every cached object has a max-age (Cache-Control: max-age=86400). After TTL expires, the edge refetches from origin. Simple but imprecise — content may be stale for up to TTL duration. Use short TTLs (60s) for dynamic content; long TTLs (1 year) for versioned static assets.
Purge by URL: explicitly invalidate specific URLs via CDN API after a content update. Cloudflare, Fastly, and CloudFront all support purge APIs. Latency varies by provider: Cloudflare and Fastly advertise global propagation in well under a second (around 150ms), while CloudFront invalidations typically take seconds to minutes. Use for targeted invalidation of a small number of URLs.
Cache-busting (versioned URLs): embed a content hash in the filename: app.a3f8b2c.js. Each deployment generates new filenames. Old files are never explicitly purged; they expire naturally via TTL. Browsers automatically get the new content because the URL changed. This is the most reliable approach for static assets.
Surrogate keys / cache tags: tag cached responses with logical keys (product:123, user:456). Purge all responses with a given tag in one API call — useful for purging all pages that show a specific product after its price changes.
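
Surrogate keys amount to a reverse index from tag to cached URLs. The sketch below models that with an in-memory cache; the TaggedCache class and its method names are hypothetical and only illustrate the data structure, not any CDN's actual API.

```javascript
class TaggedCache {
  constructor() {
    this.entries = new Map();   // url -> { body, tags }
    this.urlsByTag = new Map(); // tag -> Set of urls carrying that tag
  }

  put(url, body, tags) {
    this.delete(url); // drop tag links from any previous version of this entry
    this.entries.set(url, { body, tags });
    for (const tag of tags) {
      if (!this.urlsByTag.has(tag)) this.urlsByTag.set(tag, new Set());
      this.urlsByTag.get(tag).add(url);
    }
  }

  get(url) {
    const entry = this.entries.get(url);
    return entry ? entry.body : undefined;
  }

  delete(url) {
    const entry = this.entries.get(url);
    if (!entry) return false;
    for (const tag of entry.tags) {
      const urls = this.urlsByTag.get(tag);
      if (urls) {
        urls.delete(url);
        if (urls.size === 0) this.urlsByTag.delete(tag);
      }
    }
    this.entries.delete(url);
    return true;
  }

  // Remove every entry carrying the tag; returns how many were purged.
  purgeTag(tag) {
    const urls = [...(this.urlsByTag.get(tag) ?? [])];
    for (const url of urls) this.delete(url);
    return urls.length;
  }
}
```

A price change for product 123 then becomes a single purgeTag("product:123") call, regardless of how many product, category, or search pages embed that product.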

Edge Functions: Cloudflare Workers

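Cloudflare Workers run JavaScript (and languages such as Rust compiled to WASM) at 300+ PoPs globally, typically within 5-20ms of the user. A Worker intercepts each request on its route before it reaches the origin, which enables logic such as traffic splitting and custom caching: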


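// Cloudflare Worker: A/B testing at the edge
export default {
    // In the module (export default) syntax, waitUntil() lives on ctx, the third argument
    async fetch(request, env, ctx) {
        const url = new URL(request.url);

        // Route roughly 10% of users to the new checkout.
        // Assumes X-User-Id ends in two hex characters; anything else yields NaN,
        // which fails the comparison and falls through to the default path.
        const userId = request.headers.get("X-User-Id") ?? "";
        // 0-255 folded to 0-99; slightly biased: 30 of 256 values (about 11.7%) land below 10
        const bucket = parseInt(userId.slice(-2), 16) % 100;
        if (bucket < 10 && url.pathname === "/checkout") {
            url.hostname = "checkout-v2.example.com";
            return fetch(new Request(url, request));
        }

        // The Cache API only stores GET requests; pass everything else straight through
        if (request.method !== "GET") {
            return fetch(request);
        }

        // Serve from cache for static assets
        const cache = caches.default;
        let response = await cache.match(request);
        if (!response) {
            response = await fetch(request);
            // Only cache complete 200 responses; cache.put() rejects partial (206) responses
            if (response.status === 200 && url.pathname.startsWith("/static/")) {
                // Write to the cache in the background without delaying the response
                ctx.waitUntil(cache.put(request, response.clone()));
            }
        }
        return response;
    }
};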

Workers use V8 isolates (not containers), so cold starts take a few milliseconds at most rather than the hundreds of milliseconds typical of container-based functions. Each request is handled inside a sandboxed isolate; an isolate may be reused for later requests, so global variables can persist but must not be relied on for state. Workers KV provides distributed key-value storage readable from any PoP (eventually consistent; writes can take up to 60 seconds to propagate globally). Durable Objects provide strongly consistent, single-instance stateful objects at the edge, useful for rate limiting, presence, and collaborative editing.
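
Since there is no shared state to consult, edge A/B assignment has to be a pure function of the request. One common way to do that for arbitrary user IDs (not only hex-suffixed ones) is to hash the ID and take the result modulo 100. The sketch below uses 32-bit FNV-1a; the function names and the idea of salting with an experiment name are assumptions for illustration.

```javascript
// 32-bit FNV-1a over the string's UTF-16 code units (matches byte-wise FNV-1a for ASCII input).
function fnv1a(str) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 unsigned bits
  }
  return hash >>> 0;
}

// Stable bucket in 0-99. Salting with the experiment name keeps a user's
// bucket in one experiment independent of their bucket in another.
function bucketFor(userId, experiment) {
  return fnv1a(`${experiment}:${userId}`) % 100;
}

// True if the user falls inside the first `percent` buckets.
function inRollout(userId, experiment, percent) {
  return bucketFor(userId, experiment) < percent;
}
```

The same user always gets the same answer at every PoP, with no lookup, and raising `percent` from 10 to 20 only adds users to the rollout without reshuffling those already in it.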

Lambda@Edge vs Cloudflare Workers

Feature	Cloudflare Workers	Lambda@Edge
Cold start	A few ms at most (V8 isolates)	100-500ms (container-based Node.js/Python runtimes)
Execution locations	300+ PoPs	CloudFront regional edge caches (roughly a dozen), not every edge location
Max execution time	10ms CPU (free), 30s CPU (paid, default)	5s (viewer triggers), 30s (origin triggers)
Storage at edge	Workers KV, Durable Objects, R2	Limited: no native edge storage; calls regional services (SSM, DynamoDB global tables)
Pricing	$0.50/million requests	$0.60/million + Lambda duration
Programming model	JavaScript, WASM	Node.js, Python

DDoS Mitigation at the Edge

The edge is the first line of DDoS defense — volumetric attacks are absorbed at the PoP closest to the attacker, never reaching the origin. Techniques:

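Anycast absorption: traffic to the CDN’s anycast IP lands at whichever PoP is topologically nearest each source, so a globally distributed botnet is split across the network. A 1Tbps attack spread evenly across 300 PoPs averages 3-4Gbps per PoP, well within capacity; a regionally concentrated attack loads the nearby PoPs more heavily.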
Rate limiting at edge: Cloudflare Rate Limiting rules apply per IP/ASN at the PoP level. Requests exceeding threshold return 429 without hitting origin.
Bot detection: CAPTCHAs and JavaScript challenges at the edge filter automated traffic before it reaches application servers.
IP reputation: block known malicious IPs and AS ranges using real-time threat intelligence feeds. Cloudflare processes 2 trillion requests/day, building a global threat map.
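
Per-IP rate limiting at a PoP can be sketched as a fixed-window counter. This is a deliberately simple in-memory version; the class name, limit, and window size are illustrative assumptions, and it is not how any particular CDN implements its rate limiting rules.

```javascript
class FixedWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;       // max requests allowed per window, per IP
    this.windowMs = windowMs; // window length in milliseconds
    this.counters = new Map(); // ip -> { windowStart, count }
  }

  // Returns the HTTP status to send: 200 if allowed, 429 if over the limit.
  check(ip, nowMs = Date.now()) {
    const windowStart = Math.floor(nowMs / this.windowMs) * this.windowMs;
    let entry = this.counters.get(ip);
    // Start a fresh counter on first sight of this IP or when a new window begins.
    if (!entry || entry.windowStart !== windowStart) {
      entry = { windowStart, count: 0 };
      this.counters.set(ip, entry);
    }
    entry.count += 1;
    return entry.count > this.limit ? 429 : 200;
  }
}
```

Two known weaknesses of this sketch: a client can send up to twice the limit across a window boundary, and entries for idle IPs are never evicted. Sliding-window or token-bucket variants address the first; a periodic sweep addresses the second.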

Edge Caching for APIs (Dynamic Content)

Edge caching is not just for static files. JSON API responses can be cached with short TTLs:

Product catalog: Cache-Control: public, max-age=300 (5 minutes), which lets the edge serve the vast majority of product page reads
User-specific data: Cache-Control: private (no edge cache); fetch from origin with user token
Authenticated API responses: Vary: Authorization header — different cache entries per auth token (large cardinality, avoid)
Edge-side rendering: render HTML at the edge using pre-fetched API data — full page served from PoP without origin round-trip
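
The first two rules above can be expressed as a small policy function that an origin or edge function applies when setting response headers. The route prefixes (/api/products/, /api/account/) and the no-store default are hypothetical choices made for this sketch.

```javascript
// Pick a Cache-Control value for an API response.
// `authenticated` means the response was generated for a specific logged-in user.
function apiCachePolicy({ path, authenticated }) {
  if (authenticated || path.startsWith("/api/account/")) {
    return "private"; // per-user data: must not be stored in the shared edge cache
  }
  if (path.startsWith("/api/products/")) {
    return "public, max-age=300"; // catalog reads: cache for 5 minutes at the edge
  }
  return "no-store"; // anything unclassified is not cached
}
```

Defaulting to no-store is the conservative choice: an endpoint only becomes edge-cacheable once someone has decided it is safe to share between users.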

Key Interview Points

Two-tier CDN: edge PoPs → origin shield → origin; shield prevents thundering herds on cache miss
Cache-busting (versioned filenames) > URL purges for static assets; surrogate keys for content-tagged purges
Cloudflare Workers: near-zero cold start (a few ms at most) via V8 isolates; suitable for auth, routing, A/B at the edge
Lambda@Edge: slower cold start (~200ms) but deeper AWS integration; better for complex origin modification
DDoS: anycast distributes attack traffic; rate limiting at PoP prevents origin overload
Cache API responses with short TTLs (60-300s) to dramatically reduce origin load for read-heavy APIs

Companies That Ask This Question

Cloudflare Engineering Interview Guide

Meta Engineering Interview Guide

Snap Engineering Interview Guide

Twitter/X Engineering Interview Guide