A video streaming platform (like Netflix or YouTube) must ingest raw video, encode it into multiple quality levels, distribute it globally via CDN, and stream it adaptively to clients with varying bandwidth. The storage, encoding, and delivery pipeline is distinct from regular web application design — video data requires specialized infrastructure at every layer.
Video Ingestion and Encoding
Raw video uploaded by creators is large (a 1-hour 4K master can run 50-100GB). The ingestion pipeline: (1) upload the raw video to an S3-compatible object store; (2) trigger an encoding job (AWS Elemental MediaConvert, FFmpeg workers); (3) encode into multiple quality levels (360p, 480p, 720p, 1080p, 4K) using H.264 or H.265 (HEVC — better compression than H.264, but narrower device support); (4) package into an adaptive streaming format (HLS — HTTP Live Streaming — or MPEG-DASH); (5) push encoded segments to the CDN origin. Encoding is CPU-intensive and takes 1-3x the video duration on a single worker. Netflix uses parallel encoding workers across thousands of EC2 instances to encode a 2-hour movie in under 30 minutes.
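As a rough sketch of steps (3) and (4), the snippet below builds one FFmpeg command per rendition of an HLS ladder. The resolutions, bitrates, segment length, and file layout here are illustrative assumptions, not a production ladder:

```python
# Hypothetical rendition ladder: (name, width, height, video bitrate).
RENDITION_LADDER = [
    ("360p", 640, 360, "800k"),
    ("480p", 854, 480, "1400k"),
    ("720p", 1280, 720, "2800k"),
    ("1080p", 1920, 1080, "5000k"),
]

def hls_encode_command(source, out_dir, name, width, height, bitrate,
                       segment_seconds=6):
    """One H.264 rendition, packaged as HLS segments plus a per-quality
    manifest (.m3u8). Returns the argv list for one encoding worker."""
    return [
        "ffmpeg", "-i", source,
        "-vf", f"scale={width}:{height}",       # downscale to target resolution
        "-c:v", "libx264", "-b:v", bitrate,     # H.264 at the ladder bitrate
        "-c:a", "aac",
        "-hls_time", str(segment_seconds),      # segment duration
        "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/{name}_%05d.ts",
        f"{out_dir}/{name}.m3u8",
    ]

def build_encoding_jobs(source, out_dir):
    """One independent job per rendition; jobs can run on parallel workers."""
    return [hls_encode_command(source, out_dir, name, w, h, br)
            for name, w, h, br in RENDITION_LADDER]
```

Because each rendition (and, in chunked pipelines, each segment range) is an independent job, the fan-out across workers is what turns a 2-hour encode into a sub-30-minute one.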
HLS and Adaptive Bitrate Streaming
HLS splits each video into small segments (2-10 seconds each). For each quality level, a separate set of segments is encoded and stored. A master manifest (an .m3u8 file) lists all quality levels and points to their per-quality manifests. The player downloads the master manifest, selects an initial quality, downloads that quality's manifest (which lists segment URLs), and fetches segments sequentially. The adaptive bitrate (ABR) algorithm monitors download speed and buffer level: if throughput drops (network degradation), the player switches to a lower quality by fetching the same time range from the lower-quality manifest; if throughput improves, it switches back up. This prevents rebuffering while maximizing quality.
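A minimal sketch of the ABR decision described above, assuming a throughput-based selector with a buffer safety check; the safety margin and low-buffer threshold are illustrative parameters:

```python
def select_bitrate(ladder_kbps, throughput_kbps, buffer_seconds,
                   low_buffer=10.0, safety=0.8):
    """Pick the highest rendition whose bitrate fits within a safety
    fraction of measured throughput. If the buffer is nearly drained,
    drop to the lowest rung to avoid an imminent rebuffer."""
    if buffer_seconds < low_buffer:
        return min(ladder_kbps)
    budget = throughput_kbps * safety
    affordable = [b for b in sorted(ladder_kbps) if b <= budget]
    return affordable[-1] if affordable else min(ladder_kbps)
```

Production players (e.g. buffer-based algorithms like BOLA) blend throughput and buffer occupancy more carefully, but the core loop is the same: re-evaluate this choice before each segment fetch.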
CDN Architecture
Video segments are served from CDN edge nodes, not the origin. Cache hit ratio is critical — a cache miss means the edge fetches from origin, adding latency and origin load. For popular content (new Netflix episodes), CDN hit rates approach 100% — all edges pre-populate the segments. For long-tail content (old videos), hit rates are lower. CDN strategy: (1) push popular content to all edges proactively (pre-warming); (2) use cache hierarchies — edge → regional cache → origin — so a miss at the edge hits a regional cache (faster than origin); (3) geo-route: serve users from the nearest CDN PoP; (4) use video CDNs (Akamai, Fastly, CloudFront) optimized for large binary objects and high throughput.
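The cache-hierarchy idea in strategy (2) can be sketched as a read-through chain, where a miss at one tier falls through to the next and populates the tier on the way back. The class and tier names are hypothetical:

```python
class CacheTier:
    """One tier in an edge -> regional -> origin hierarchy."""

    def __init__(self, name, backing=None):
        self.name = name
        self.store = {}          # segment key -> bytes
        self.backing = backing   # next tier up, or None for the last cache tier

    def get(self, key, origin_fetch):
        """Return (value, tier_that_served_it). On a miss, fetch from the
        backing tier (or the origin) and populate this tier (read-through)."""
        if key in self.store:
            return self.store[key], self.name
        if self.backing is not None:
            value, served_by = self.backing.get(key, origin_fetch)
        else:
            value, served_by = origin_fetch(key), "origin"
        self.store[key] = value
        return value, served_by
```

The payoff: once any user behind a regional cache has fetched a segment, every other edge in that region misses only as far as the regional tier, not the origin.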
Metadata and Recommendations
Video metadata (title, description, tags, duration, view count, ratings) is stored in a relational database. The recommendation system is a separate service: offline batch jobs compute recommendation vectors (collaborative filtering, content-based filtering) and store the results in a low-latency key-value store (DynamoDB, Redis). When a user opens the home page, the recommendation service reads precomputed recommendations by user_id — an O(1) lookup, not real-time computation. Real-time signals (what the user just watched) update recommendations on a short delay (5-15 minutes) rather than immediately — a freshness vs. complexity trade-off.
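The serving path is deliberately trivial. A sketch, with a plain dict standing in for the key-value store and an assumed `recs:{user_id}` key scheme written by the offline batch job:

```python
class RecommendationService:
    """Serve precomputed home-page recommendations from a key-value store.
    The dict here stands in for DynamoDB/Redis; an offline batch job is
    assumed to write the entries."""

    def __init__(self, kv_store, fallback_popular):
        self.kv = kv_store
        self.fallback = fallback_popular  # cold-start / missing-user default

    def home_page_rows(self, user_id):
        # O(1) read of the precomputed list; no model inference at request time.
        return self.kv.get(f"recs:{user_id}", self.fallback)
```

All the expensive work (matrix factorization, candidate ranking) happens offline; the request path is a single key lookup plus a fallback for users the batch job has not seen yet.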
View Count and Analytics
View counts are updated frequently and read frequently — the classic counter scalability problem. Do not write one row per view to the database (hot-row lock contention, write amplification). Use: (1) an in-memory counter per video in Redis (`INCR views:{video_id}`) — sub-millisecond, handles on the order of 100k increments/second; (2) periodic batch flush: every 60 seconds, apply the accumulated Redis deltas to the database in a batched UPDATE. This bounds the write rate to the database while keeping the Redis counter up-to-date for reads. Aggregate analytics (watch time, retention curves, geographic breakdown) go through a separate analytics pipeline (Kafka → Spark → BigQuery) — never computed from the primary database.
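The counter-plus-flush pattern can be sketched in a few lines, with in-process dicts standing in for Redis and the database table:

```python
class ViewCounter:
    """Absorb per-view increments in memory (standing in for Redis INCR)
    and periodically flush the accumulated deltas to the database."""

    def __init__(self, db_counts):
        self.pending = {}     # video_id -> increments since last flush
        self.db = db_counts   # video_id -> persisted total (the DB table)

    def record_view(self, video_id):
        self.pending[video_id] = self.pending.get(video_id, 0) + 1

    def current_count(self, video_id):
        # Reads combine the persisted total with the unflushed delta.
        return self.db.get(video_id, 0) + self.pending.get(video_id, 0)

    def flush(self):
        # In production: one batched UPDATE on a ~60s timer,
        # not one database write per view.
        for video_id, delta in self.pending.items():
            self.db[video_id] = self.db.get(video_id, 0) + delta
        self.pending.clear()
```

The trade-off is durability: a crash between flushes loses up to 60 seconds of increments, which is acceptable for view counts but not for, say, billing counters.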