Video Streaming Platform (YouTube/Netflix) Low-Level Design

Upload Pipeline

Raw video upload → object storage (S3 raw bucket) → message published to transcoding queue (SQS/Kafka) → transcoding workers process in parallel → output multiple quality levels (360p, 720p, 1080p, 4K) in HLS/DASH format → store segments in S3 CDN bucket → update VideoMetadata with manifest URLs.

VideoUploadFlow:
1. Client: multipart upload to presigned S3 URL (raw-videos bucket)
2. S3 event triggers: publish to transcoding-jobs SQS queue
3. Transcoding workers (one per quality): pull job, run FFmpeg, push HLS segments to S3
4. On completion: UPDATE videos SET status='READY', manifest_url=? WHERE video_id=?
5. CDN invalidation (if re-encoding existing video)
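
Steps 2 and 3 above can be sketched as a small fan-out function. This is an illustration rather than the article's implementation: the message fields (`video_id`, `source`, `quality`, `output_prefix`) and the key layout `raw/<video_id>.<ext>` are assumptions. Only the shape of the S3 event notification (`Records[].s3.bucket.name` and `Records[].s3.object.key`) follows the documented format.

```python
import json
from urllib.parse import unquote_plus

QUALITIES = ["360p", "720p", "1080p", "4K"]  # quality ladder from the article


def jobs_from_s3_event(event: dict) -> list[dict]:
    """Turn one S3 ObjectCreated notification into one job per quality level."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Assumed key layout: raw/<video_id>.<ext>
        video_id = key.rsplit("/", 1)[-1].rsplit(".", 1)[0]
        for quality in QUALITIES:
            jobs.append({
                "video_id": video_id,
                "source": f"s3://{bucket}/{key}",
                "quality": quality,
                "output_prefix": f"hls/{video_id}/{quality}/",  # hypothetical layout
            })
    return jobs


def to_queue_bodies(jobs: list[dict]) -> list[str]:
    """Serialize jobs as JSON strings, ready to be sent as queue message bodies."""
    return [json.dumps(job, sort_keys=True) for job in jobs]
```

Emitting one message per quality level lets each rendition be retried on its own if a worker fails.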

Adaptive Bitrate Streaming (ABR)

Video is split into 4-second segments. The player downloads a manifest (M3U8 for HLS, MPD for DASH) listing all quality levels and segment URLs. The player monitors download speed and buffer level, switching to a lower quality when bandwidth drops and back to a higher one when it recovers, which avoids most rebuffering stalls.
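
The switching logic described above can be illustrated with a simple throughput-based rule. Real players use more elaborate heuristics, so treat this as a minimal sketch: the bitrate ladder, the `safety` factor, and the `low_buffer_sec` threshold are all assumed values, not figures from this design.

```python
# Assumed bitrate ladder in kbps; real ladders depend on the encoder settings.
LADDER_KBPS = {"360p": 800, "720p": 2800, "1080p": 5000, "4K": 16000}


def pick_quality(throughput_kbps: float, buffer_sec: float,
                 safety: float = 0.8, low_buffer_sec: float = 8.0) -> str:
    """Pick the highest rendition whose bitrate fits the usable bandwidth."""
    # Be more conservative when the buffer is nearly empty.
    factor = safety / 2 if buffer_sec < low_buffer_sec else safety
    budget = throughput_kbps * factor
    ordered = sorted(LADDER_KBPS.items(), key=lambda kv: kv[1])
    choice = ordered[0][0]  # always fall back to the lowest rendition
    for name, kbps in ordered:
        if kbps <= budget:
            choice = name
    return choice
```

The player would call `pick_quality` before requesting each segment, passing its latest throughput estimate and current buffer level.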

  • HLS (Apple): .m3u8 master playlist + per-quality playlists + .ts segments
  • DASH (standard): .mpd manifest + .mp4 segments (fMP4)
  • Video is playable at 360p as soon as that quality finishes transcoding — don’t wait for 4K
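
One way to make a video playable before every quality is finished is to regenerate the HLS master playlist from whichever renditions are ready. The sketch below assumes a hypothetical `Rendition` record and relative playlist URIs. The `#EXTM3U` and `#EXT-X-STREAM-INF` tags with `BANDWIDTH` and `RESOLUTION` attributes are standard HLS.

```python
from dataclasses import dataclass


@dataclass
class Rendition:
    quality: str
    bandwidth_bps: int   # peak bitrate advertised to the player
    resolution: str      # e.g. "1280x720"
    playlist_uri: str    # per-quality media playlist
    ready: bool          # True once transcoding for this quality finished


def build_master_playlist(renditions: list[Rendition]) -> str:
    """Return an HLS master playlist that lists only finished renditions."""
    lines = ["#EXTM3U"]
    for r in sorted(renditions, key=lambda r: r.bandwidth_bps):
        if not r.ready:
            continue  # skip qualities that are still transcoding
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth_bps},RESOLUTION={r.resolution}"
        )
        lines.append(r.playlist_uri)
    return "\n".join(lines) + "\n"
```

Rewriting the master playlist each time a rendition completes is the reason manifests need a short cache TTL or versioned URLs.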

Data Model

Video(video_id, uploader_id, title, description, tags[], duration_sec,
      status ENUM(PROCESSING,READY,FAILED), manifest_url, thumbnail_url,
      view_count, created_at)

VideoStream(stream_id, video_id, quality ENUM(360p,720p,1080p,4K),
            manifest_url, segment_count, bitrate_kbps, status)

VideoView(view_id, video_id, viewer_id, watched_seconds, device_type, created_at)
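
For illustration, the first two records could map onto Python dataclasses as below. The field types and defaults are assumptions, as is reusing the video status enum for `VideoStream.status`, which the schema leaves unspecified. `manifest_url` and `view_count` are included on `Video` because the UPDATE statements elsewhere in this article write to them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class VideoStatus(Enum):
    PROCESSING = "PROCESSING"
    READY = "READY"
    FAILED = "FAILED"


class Quality(Enum):
    P360 = "360p"
    P720 = "720p"
    P1080 = "1080p"
    P4K = "4K"


@dataclass
class Video:
    video_id: str
    uploader_id: str
    title: str
    description: str = ""
    tags: list[str] = field(default_factory=list)
    duration_sec: int = 0
    status: VideoStatus = VideoStatus.PROCESSING
    manifest_url: Optional[str] = None
    thumbnail_url: Optional[str] = None
    view_count: int = 0
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class VideoStream:
    stream_id: str
    video_id: str
    quality: Quality
    manifest_url: str
    segment_count: int
    bitrate_kbps: int
    status: VideoStatus = VideoStatus.PROCESSING  # assumed to share the video enum


def is_playable(streams: list[VideoStream]) -> bool:
    """A video can start playing once at least one rendition is READY."""
    return any(s.status is VideoStatus.READY for s in streams)
```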

Transcoding Architecture

Each transcoding job fans out into parallel workers — one per quality level. Each worker:

  1. Downloads raw video from S3 to local disk
  2. Runs FFmpeg: ffmpeg -i input.mp4 -vf scale=-2:720 -c:v libx264 -c:a aac -hls_time 4 -hls_playlist_type vod output_720p.m3u8
  3. Uploads segments and manifest to S3 (CDN bucket)
  4. Marks VideoStream record as READY
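
Step 2 can be parameterized per quality level. The helper below only builds the argument list, so it can be inspected without FFmpeg installed. The height mapping, output file names, and the choice of `libx264` and `aac` are assumptions. `-force_key_frames` is added so that keyframes line up with the 4-second segment boundaries, since the HLS muxer can only cut on keyframes.

```python
import shlex

# Assumed target heights for each quality label.
HEIGHTS = {"360p": 360, "720p": 720, "1080p": 1080, "4K": 2160}


def ffmpeg_hls_args(src: str, quality: str, out_dir: str,
                    segment_sec: int = 4) -> list[str]:
    """Build the FFmpeg argument list for one HLS rendition."""
    height = HEIGHTS[quality]
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=-2:{height}",          # keep aspect ratio, even width
        "-c:v", "libx264", "-c:a", "aac",
        # Force a keyframe every segment_sec seconds so segments cut cleanly.
        "-force_key_frames", f"expr:gte(t,n_forced*{segment_sec})",
        "-hls_time", str(segment_sec),
        "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/seg_%05d.ts",
        f"{out_dir}/index.m3u8",
    ]


def as_shell(args: list[str]) -> str:
    """Render the argument list as a shell-safe string for logging."""
    return shlex.join(args)
```

A worker would pass the list to `subprocess.run(args, check=True)` and treat a non-zero exit code as a failed job.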

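Workers are stateless EC2 spot instances, and the worker count auto-scales on queue depth. Set the SQS visibility timeout to at least the maximum transcoding time (30 minutes) so a job is not redelivered while a worker is still processing it. SQS standard queues deliver at least once and spot instances can be reclaimed mid-job, so workers should also be idempotent: re-running a job must safely overwrite the same output.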
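
The article does not spell out the scaling rule, so the following is only one plausible sketch: size the fleet to the backlog, clamped between a floor and a ceiling. All parameter names and default limits are hypothetical.

```python
import math


def desired_workers(queue_depth: int, in_flight: int, jobs_per_worker: int = 1,
                    min_workers: int = 1, max_workers: int = 200) -> int:
    """Return a target worker count for the current transcoding backlog."""
    # Backlog = messages waiting in the queue plus messages being processed.
    backlog = queue_depth + in_flight
    wanted = math.ceil(backlog / jobs_per_worker)
    # Clamp so the fleet never drops to zero or grows without bound.
    return max(min_workers, min(max_workers, wanted))
```

For example, 37 waiting jobs plus 3 in flight yields a target of 40 workers when each worker handles one job at a time.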

View Count at Scale

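Do not write to the DB on every video view: viral videos get millions of views per hour. Use Redis instead: INCR view_count:{video_id} on each view event. Every 60 seconds, a batch job reads the dirty counters and flushes each delta to the DB: UPDATE videos SET view_count = view_count + delta WHERE video_id = X. Read and reset each counter in one atomic step (for example GETSET to 0, or GETDEL) so increments that arrive during the flush are not lost. The DB remains the source of truth: if Redis restarts, rebuild any cached totals from the DB and accept that unflushed deltas from the last interval (up to 60 seconds) are lost. The same pattern works for like counts.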
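
The batching pattern can be demonstrated without a Redis server. In the sketch below, `CounterStore` is an in-memory stand-in for Redis (its `incr` plays the role of INCR and `pop_all` the role of an atomic read-and-reset), and a plain dict stands in for the videos table. Both are assumptions made to keep the example self-contained.

```python
import threading
from collections import defaultdict


class CounterStore:
    """In-memory stand-in for Redis view counters."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: dict[str, int] = defaultdict(int)

    def incr(self, video_id: str) -> None:
        """Record one view (the role INCR plays in Redis)."""
        with self._lock:
            self._counts[video_id] += 1

    def pop_all(self) -> dict[str, int]:
        """Atomically take and reset every pending delta."""
        with self._lock:
            deltas = dict(self._counts)
            self._counts.clear()
        return deltas


def flush(store: CounterStore, db_counts: dict[str, int]) -> int:
    """Apply pending deltas to the DB stand-in; return the number of rows touched."""
    deltas = store.pop_all()
    for video_id, delta in deltas.items():
        # Equivalent to: UPDATE videos SET view_count = view_count + delta
        db_counts[video_id] = db_counts.get(video_id, 0) + delta
    return len(deltas)
```

Because `pop_all` takes and clears the deltas under one lock, a view recorded during a flush lands in the next batch instead of being dropped.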

CDN Delivery

  • Video segments are immutable — serve with Cache-Control: max-age=31536000 (1 year)
  • Manifest files may update (as qualities become available) — short TTL (60s) or versioned URLs
  • Byte-range requests: clients request specific byte ranges for seeking; CDN must support range request pass-through
  • Hot content cached at edge PoPs globally; cold/long-tail content served from origin storage
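
The first two caching rules above could be applied at upload time by choosing a Cache-Control header from the file extension. This is a minimal sketch: the extension sets, the `immutable` directive, and the `no-cache` fallback for unknown files are assumptions layered on top of the TTLs given above.

```python
import posixpath

SEGMENT_EXTS = {".ts", ".m4s", ".mp4"}   # media segments never change once written
MANIFEST_EXTS = {".m3u8", ".mpd"}        # manifests may gain new qualities


def cache_control_for(path: str) -> str:
    """Return the Cache-Control header value for an object path."""
    ext = posixpath.splitext(path)[1].lower()
    if ext in SEGMENT_EXTS:
        return "public, max-age=31536000, immutable"  # 1 year
    if ext in MANIFEST_EXTS:
        return "public, max-age=60"                   # short TTL
    return "no-cache"  # assumed conservative default for anything else
```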

Recommendations

Collaborative filtering runs offline nightly: users who watched video A also watched videos B, C, D. The results are stored as precomputed lists (recommendations:{video_id} → sorted list of related video IDs) and served from a Redis cache at playback.

Personalized recommendations use matrix factorization on the user-video interaction matrix, with implicit feedback such as watch percentage, replays, and likes. These are also computed offline, and the top-N results per user are stored in a recommendations table.
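
The "users who watched A also watched B" lists can be approximated with plain co-occurrence counting. The sketch below is the simplest form of item-to-item collaborative filtering, not the matrix factorization used for personalized results, and the input shape (a dict of user ID to watched video IDs) is an assumption.

```python
from collections import Counter, defaultdict
from itertools import permutations


def co_watch_recommendations(histories: dict[str, set[str]],
                             top_n: int = 3) -> dict[str, list[str]]:
    """Map each video to the videos most often watched by the same users."""
    co_counts: dict[str, Counter] = defaultdict(Counter)
    for watched in histories.values():
        # Count every ordered pair (a, b) of distinct videos in one history.
        for a, b in permutations(sorted(watched), 2):
            co_counts[a][b] += 1
    return {
        video: [
            other for other, _ in
            # Highest co-watch count first; ties broken by video ID.
            sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        ]
        for video, counts in co_counts.items()
    }
```

Each resulting list would be written to the cache under its recommendations:{video_id} key by the nightly job.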

Search

Elasticsearch index fields: video_id, title (analyzed), description (analyzed), tags, uploader_id, view_count, published_at. The query is a multi-match on title (weighted 3x), description, and tags, boosted by view_count on a log scale and filtered to status=READY. The index is kept in sync by a Kafka consumer that listens for video status changes.
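
The query described above might translate into the Elasticsearch query DSL roughly as follows. The body is built as a plain Python dict so no client library is needed. It assumes `status` is indexed as a keyword field, and the `log1p` modifier and `sum` boost mode are illustrative choices rather than settings from this design.

```python
def build_search_query(text: str, size: int = 20) -> dict:
    """Build a search request body: weighted text match, popularity boost, READY only."""
    return {
        "size": size,
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        "must": {
                            "multi_match": {
                                "query": text,
                                # title counts three times as much as the other fields
                                "fields": ["title^3", "description", "tags"],
                            }
                        },
                        # Filters do not affect scoring and can be cached.
                        "filter": [{"term": {"status": "READY"}}],
                    }
                },
                # Log-scale popularity boost: log10(1 + view_count).
                "field_value_factor": {
                    "field": "view_count",
                    "modifier": "log1p",
                    "missing": 0,
                },
                # Add the boost so videos with zero views keep their text score.
                "boost_mode": "sum",
            }
        },
    }
```

Adding the boost keeps relevance as the main signal, while multiplying by it would zero out the score of any video with no views.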

Key Design Decisions

  • HLS/DASH ABR: never force a fixed quality — let the player adapt to network conditions
  • Fan-out transcoding: process all quality levels in parallel, surface lowest quality first
  • Redis view counts: never write per-view to the DB; batch flush every 60 seconds
  • Immutable segments: content-addressed URLs with max-age=1yr enable aggressive CDN caching

