Video Streaming Platform: Low-Level Design

A video streaming platform (like Netflix or YouTube) must ingest raw video, encode it into multiple quality levels, distribute it globally via CDN, and stream it adaptively to clients with varying bandwidth. The storage, encoding, and delivery pipeline is distinct from regular web application design — video data requires specialized infrastructure at every layer.

Video Ingestion and Encoding

Raw video uploaded by creators is large (a 1-hour 4K master can run 50-100GB). The ingestion pipeline: (1) upload the raw video to an S3-compatible object store; (2) trigger an encoding job (AWS Elemental MediaConvert, FFmpeg workers); (3) encode into multiple quality levels (360p, 480p, 720p, 1080p, 4K) using H.264 or H.265 (HEVC compresses roughly 50% better than H.264, but its device support is narrower, so platforms typically encode both); (4) package into an adaptive streaming format (HLS, i.e. HTTP Live Streaming, or MPEG-DASH); (5) push the encoded segments to the CDN origin. Encoding is CPU-intensive and takes 1-3x the video duration on a single machine. Netflix uses parallel encoding workers across thousands of EC2 instances to encode a 2-hour movie in under 30 minutes.
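The fan-out in steps (3)-(4) can be sketched as building one FFmpeg job per quality level, which a worker pool then runs in parallel. The rendition ladder, bitrates, and file layout below are illustrative assumptions, not any specific platform's configuration:

```python
# Illustrative rendition ladder: (name, resolution, video bitrate).
RENDITIONS = [
    ("360p",  "640x360",   "800k"),
    ("480p",  "854x480",   "1400k"),
    ("720p",  "1280x720",  "2800k"),
    ("1080p", "1920x1080", "5000k"),
]

def build_encode_jobs(source: str, out_dir: str) -> list[list[str]]:
    """Build one FFmpeg command per rendition; a worker pool runs them in parallel."""
    jobs = []
    for name, resolution, bitrate in RENDITIONS:
        jobs.append([
            "ffmpeg", "-i", source,
            "-c:v", "libx264", "-b:v", bitrate, "-s", resolution,
            "-c:a", "aac",
            # Step (4): split into HLS segments with a per-quality playlist.
            "-hls_time", "6", "-hls_playlist_type", "vod",
            f"{out_dir}/{name}/index.m3u8",
        ])
    return jobs

jobs = build_encode_jobs("raw/movie.mp4", "encoded/movie")
print(len(jobs))  # one job per rendition
```

Because each rendition (and, in practice, each chunk of the source) is an independent job, the encoding farm parallelizes trivially, which is how a 2-hour movie finishes in minutes rather than hours.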

HLS and Adaptive Bitrate Streaming

HLS splits each video into small segments (2-10 seconds each). For each quality level, a separate set of segments is encoded and stored. A master manifest file (m3u8) lists all quality levels and their per-quality manifests. The client player downloads the master manifest, selects an initial quality, downloads the per-quality manifest (which lists segment URLs), and fetches segments sequentially. Adaptive bitrate (ABR) algorithm: the player monitors download speed and buffer level. If download speed drops (network degradation), switch to a lower quality by fetching the same time segment from the lower-quality manifest. If download speed improves, switch to higher quality. This prevents buffering while maximizing quality.

CDN Architecture

Video segments are served from CDN edge nodes, not the origin. Cache hit ratio is critical — a cache miss means the edge fetches from origin, adding latency and origin load. For popular content (new Netflix episodes), CDN hit rates approach 100% — all edges pre-populate the segments. For long-tail content (old videos), hit rates are lower. CDN strategy: (1) push popular content to all edges proactively (pre-warming); (2) use cache hierarchies — edge → regional cache → origin — so a miss at the edge hits a regional cache (faster than origin); (3) geo-route: serve users from the nearest CDN PoP; (4) use video CDNs (Akamai, Fastly, CloudFront) optimized for large binary objects and high throughput.
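The edge → regional cache → origin hierarchy in strategy (2) can be sketched as a chain of caches that fill on the way back down after a miss. The dict-backed tiers and names below are stand-ins for real CDN infrastructure:

```python
class Tier:
    """One cache tier; `parent` points toward the origin."""
    def __init__(self, name: str, parent: "Tier | None" = None):
        self.name, self.parent, self.store = name, parent, {}

    def get(self, key: str) -> tuple[bytes, str]:
        if key in self.store:
            return self.store[key], self.name       # hit at this tier
        if self.parent is None:
            raise KeyError(key)                     # origin must have it
        value, hit_tier = self.parent.get(key)      # miss: ask the parent tier
        self.store[key] = value                     # fill this tier on the way down
        return value, hit_tier

origin = Tier("origin")
regional = Tier("regional", parent=origin)
edge = Tier("edge", parent=regional)

origin.store["ep1/seg42.ts"] = b"<segment bytes>"
print(edge.get("ep1/seg42.ts")[1])  # first request falls through to origin
print(edge.get("ep1/seg42.ts")[1])  # repeat request is served from the edge
```

Note how the first miss populates both the regional and edge tiers, so a second user in the same region never touches the origin; pre-warming (strategy 1) just performs this fill proactively before the first request arrives.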

Metadata and Recommendations

Video metadata (title, description, tags, duration, view count, ratings) is stored in a relational database. The recommendations system is a separate service: offline batch jobs compute recommendation vectors (collaborative filtering, content-based filtering) and store results in a low-latency key-value store (DynamoDB, Redis). When a user opens the home page, the recommendation service reads precomputed recommendations by user_id — O(1) lookup, not real-time computation. Real-time signals (what the user just watched) update recommendations on a short delay (5-15 minutes) rather than immediately — freshness vs. complexity trade-off.
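The split between the offline batch job and the online read path can be sketched as follows. The dict stands in for the key-value store (DynamoDB, Redis), and the function and key names are illustrative assumptions:

```python
import time

rec_store: dict[str, dict] = {}  # stand-in for a low-latency key-value store

def batch_write_recommendations(user_id: str, video_ids: list[str]) -> None:
    """Offline job output: overwrite the user's precomputed recommendation list."""
    rec_store[user_id] = {"videos": video_ids, "computed_at": time.time()}

def get_home_page(user_id: str, fallback: list[str]) -> list[str]:
    """Online path: a single O(1) lookup, with a generic fallback for cold users."""
    entry = rec_store.get(user_id)
    return entry["videos"] if entry else fallback

batch_write_recommendations("u42", ["v9", "v3", "v7"])
print(get_home_page("u42", fallback=["trending1"]))      # precomputed list
print(get_home_page("new_user", fallback=["trending1"]))  # cold start: fallback
```

The fallback path matters: a brand-new user (or a batch-job failure) must still render a home page, typically from a precomputed list of globally popular titles.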

View Count and Analytics

View counts are updated frequently and read frequently: the classic counter scalability problem. Do not write one row per view to the database (hot-row contention, write amplification). Instead: (1) keep an in-memory counter per video in Redis (INCR views:<video_id>), which is sub-millisecond and sustains on the order of 100k increments/second; (2) flush in periodic batches: every 60 seconds, drain the Redis counters into the database in one batched UPDATE. This caps the write frequency to the database while the Redis counters stay current for reads. Aggregate analytics (watch time, retention curves, geographic breakdown) go through a separate analytics pipeline (Kafka → Spark → BigQuery) and are never computed from the primary database.
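The increment-then-flush pattern above can be sketched with plain dicts standing in for Redis and the relational table (in production the flush would be a scheduled job; here it is driven explicitly):

```python
redis_counters: dict[str, int] = {}  # stand-in for Redis INCR keys
db_view_counts: dict[str, int] = {}  # stand-in for the videos table

def record_view(video_id: str) -> None:
    """Hot path: one in-memory increment, no database write."""
    redis_counters[video_id] = redis_counters.get(video_id, 0) + 1

def flush_counters() -> None:
    """Periodic job (e.g. every 60 s): drain deltas into one batched update."""
    for video_id, delta in redis_counters.items():
        db_view_counts[video_id] = db_view_counts.get(video_id, 0) + delta
    redis_counters.clear()  # deltas applied; start accumulating again

for _ in range(3):
    record_view("v1")
record_view("v2")
flush_counters()
print(db_view_counts)  # {'v1': 3, 'v2': 1}
```

The trade-off is bounded staleness in the database (up to one flush interval) and, if the counter node dies before a flush, the loss of at most one interval's worth of increments, which is usually acceptable for view counts.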

