Ingestion Pipeline
Raw video files uploaded by content creators or production teams enter a processing pipeline before they can be streamed:
- The client uploads the raw video to S3 via a presigned URL (multipart upload for large files)
- An S3 event notification invokes a Lambda function or produces a Kafka message that starts the transcode job
- A transcode worker pulls the raw file, validates its integrity, and begins encoding
Raw files may be 4K ProRes or H.264 at very high bitrates — unsuitable for direct streaming. Transcoding creates delivery-optimized renditions.
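The hand-off from upload to transcode in the steps above can be sketched as a small handler that turns an S3 event notification into job messages. This is a minimal illustration, not a production handler: the `ALLOWED_EXTENSIONS` set, the job message fields, and the sample bucket name are assumptions; only the `Records[].s3.bucket.name` and `Records[].s3.object.key` layout comes from the S3 notification format.

```python
import json
import os
from urllib.parse import unquote_plus

# Hypothetical allow-list of source containers; adjust to the real pipeline.
ALLOWED_EXTENSIONS = {".mov", ".mp4", ".mxf", ".mkv"}


def jobs_from_s3_event(event: dict) -> list:
    """Turn an S3 ObjectCreated notification into transcode job messages."""
    jobs = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        size = record["s3"]["object"].get("size", 0)
        extension = os.path.splitext(key)[1].lower()
        if size == 0 or extension not in ALLOWED_EXTENSIONS:
            continue  # skip empty or unsupported uploads
        # Hypothetical job message shape consumed by the transcode workers.
        jobs.append({"source": f"s3://{bucket}/{key}", "size_bytes": size})
    return jobs


sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:CompleteMultipartUpload",
            "s3": {
                "bucket": {"name": "raw-uploads"},
                "object": {"key": "uploads/my+movie.mov", "size": 1024},
            },
        }
    ]
}
print(json.dumps(jobs_from_s3_event(sample_event)))
```

In a real deployment the returned messages would be published to the job queue or Kafka topic; that step is left out here because it depends on the client library in use.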
Transcode Pipeline and Rendition Ladder
FFmpeg generates multiple renditions (bitrate ladder) from the source file:
- 2160p (4K): ~25 Mbps
- 1080p: ~8 Mbps
- 720p: ~4 Mbps
- 480p: ~2 Mbps
- 360p: ~900 kbps
Codec selection:
- H.264 (AVC): universal compatibility — all devices and browsers support it
- H.265 (HEVC): ~50% smaller files than H.264 at equivalent quality; browser support is limited, and patent licensing fees apply
- VP9: royalty-free, good compression, supported natively in Chrome and Firefox
- AV1: best compression ratio, royalty-free, growing hardware decode support
Transcode jobs run on GPU-accelerated instances (NVIDIA with NVENC) to reduce wall-clock time. For a 2-hour movie, GPU transcoding can finish in minutes, versus hours on CPU-only instances.
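As a rough sketch of how the ladder above maps to FFmpeg invocations, the snippet below builds one argument list per rendition. The output file names, the 128k AAC audio setting, and the 2x `-bufsize` choice are assumptions made for illustration; the encoder defaults to `libx264` and could be swapped for `h264_nvenc` on NVENC-capable instances.

```python
# Ladder from the text: (height in pixels, target video bitrate in kbps).
LADDER = [(2160, 25000), (1080, 8000), (720, 4000), (480, 2000), (360, 900)]


def ffmpeg_args(source: str, height: int, kbps: int, encoder: str = "libx264") -> list:
    """Build an FFmpeg argument list for a single rendition."""
    return [
        "ffmpeg", "-y", "-i", source,
        "-vf", f"scale=-2:{height}",   # keep aspect ratio, force an even width
        "-c:v", encoder,               # e.g. "h264_nvenc" on NVENC-capable GPUs
        "-b:v", f"{kbps}k",
        "-maxrate", f"{kbps}k",
        "-bufsize", f"{2 * kbps}k",    # assumed rate-control buffer of 2x the bitrate
        "-c:a", "aac", "-b:a", "128k", # assumed audio settings
        f"out_{height}p.mp4",          # hypothetical output name
    ]


commands = [ffmpeg_args("source.mov", height, kbps) for height, kbps in LADDER]
for command in commands:
    print(" ".join(command))
```

Each list can be passed to `subprocess.run` on a worker; running the renditions as separate jobs also lets them be scheduled in parallel.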
HLS Packaging
HLS (HTTP Live Streaming) is Apple's adaptive bitrate protocol, now the dominant streaming format:
- Each rendition is segmented into 6-second .ts (MPEG-2 Transport Stream) segments
- A media playlist (.m3u8) per rendition lists all segment URLs with durations
- A master playlist references all renditions with their bandwidth and resolution attributes
The player downloads the master playlist first, selects an appropriate rendition based on initial conditions, then downloads media segments sequentially.
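To make the master playlist concrete, here is a minimal sketch that generates one for the ladder described earlier. The pixel widths and the `<name>/playlist.m3u8` paths are assumptions, and `BANDWIDTH` is set to the video bitrate alone for simplicity, whereas a real packager would also account for audio and peak rates.

```python
# (name, width, height, bandwidth in bits per second); widths are assumed 16:9 values.
RENDITIONS = [
    ("2160p", 3840, 2160, 25_000_000),
    ("1080p", 1920, 1080, 8_000_000),
    ("720p", 1280, 720, 4_000_000),
    ("480p", 854, 480, 2_000_000),
    ("360p", 640, 360, 900_000),
]


def master_playlist(renditions) -> str:
    """Render an HLS master playlist with one variant stream per rendition."""
    lines = ["#EXTM3U"]
    for name, width, height, bandwidth in renditions:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={width}x{height}"
        )
        lines.append(f"{name}/playlist.m3u8")  # hypothetical media playlist path
    return "\n".join(lines) + "\n"


playlist = master_playlist(RENDITIONS)
print(playlist)
```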
DASH and CMAF
DASH (Dynamic Adaptive Streaming over HTTP) is the ISO standard, preferred in non-Apple environments:
- MPD (Media Presentation Description) XML manifest references media segments
- Segments are fragmented .mp4 (fMP4) rather than .ts
CMAF (Common Media Application Format) unifies HLS and DASH: fMP4 segments work with both HLS (using an updated playlist format) and DASH. A single set of segments can serve both protocols, roughly halving segment storage and CDN cache footprint compared with keeping separate .ts and fMP4 copies.
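A back-of-the-envelope calculation shows what a single segment set saves. The sketch below sums the video bitrates of the ladder from the transcode section for a 2-hour title; it ignores audio tracks and container overhead, so the numbers are only indicative.

```python
# Video bitrates of the ladder in kbps (audio and container overhead ignored).
LADDER_KBPS = [25000, 8000, 4000, 2000, 900]


def ladder_storage_gb(duration_seconds: int, ladder_kbps=LADDER_KBPS) -> float:
    """Approximate storage for one full set of renditions, in decimal gigabytes."""
    total_bits = sum(ladder_kbps) * 1000 * duration_seconds
    return total_bits / 8 / 1e9


one_set = ladder_storage_gb(2 * 3600)  # CMAF: one fMP4 set shared by HLS and DASH
two_sets = 2 * one_set                 # separate .ts and fMP4 copies
print(f"single CMAF set: {one_set:.1f} GB, separate HLS + DASH sets: {two_sets:.1f} GB")
```

Under these assumptions a 2-hour title needs about 35.9 GB for one set of renditions and about 71.8 GB when HLS and DASH each keep their own copy.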
Adaptive Bitrate Player Behavior
The ABR player implements a throughput-based algorithm:
- Startup: begin at the lowest rendition to minimize time-to-first-frame; ramp up as buffer fills
- Steady state: monitor segment download throughput; if throughput > current rendition's bitrate * 1.5, switch up; if throughput < bitrate * 0.8, switch down
- Buffer health: maintain a target buffer (e.g., 30 seconds ahead); if buffer drops below 10 seconds, force a quality reduction
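The steady-state and buffer rules above can be sketched as a single decision function. This is a simplified illustration using the thresholds from the text; moving only one rung at a time and the function name are assumptions, and real players typically smooth throughput over several segments before deciding.

```python
# Ladder bitrates in kbps, lowest first; index 0 is the startup rendition.
LADDER_KBPS = [900, 2000, 4000, 8000, 25000]

LOW_BUFFER_SECONDS = 10  # below this, force a quality reduction


def next_rendition(current: int, throughput_kbps: float, buffer_seconds: float) -> int:
    """Return the ladder index to use for the next segment."""
    bitrate = LADDER_KBPS[current]
    # Switch down when the buffer is unhealthy or throughput falls short.
    if buffer_seconds < LOW_BUFFER_SECONDS or throughput_kbps < 0.8 * bitrate:
        return max(current - 1, 0)
    # Switch up when there is at least 50% headroom over the current bitrate.
    if throughput_kbps > 1.5 * bitrate and current + 1 < len(LADDER_KBPS):
        return current + 1
    return current


# Playing 720p (4000 kbps) with a healthy 30-second buffer:
print(next_rendition(2, 7000, 30))  # enough headroom, moves up one rung
print(next_rendition(2, 3000, 30))  # below 80% of the bitrate, moves down
print(next_rendition(2, 7000, 5))   # buffer under 10 seconds, moves down
```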
DRM (Digital Rights Management)
DRM encrypts content so only authorized, authenticated users can decrypt and play it:
- Widevine: Google's DRM — used by Android, Chrome, Firefox
- FairPlay: Apple's DRM — used by iOS, Safari, tvOS
- PlayReady: Microsoft's DRM — used by Edge, Xbox, smart TVs
Multi-DRM platforms (Irdeto, EZDRM, PallyCon) provide a single integration point that handles all three DRM systems. The content key is the same; each DRM system wraps it differently in its license.
DRM Key Management
Content encryption and key delivery flow:
- A content encryption key (CEK) is generated per asset during packaging
- The CEK is used to encrypt each segment with AES-128 under Common Encryption (the cenc scheme uses AES-CTR; the cbcs scheme uses AES-CBC with pattern encryption)
- The CEK is stored in a Key Management System (KMS) — never in plaintext on disk
- At playback, the player detects DRM initialization data in the manifest and requests a license from the license server
- The license server authenticates the user (valid subscription, valid session token) and returns a DRM license containing the CEK, wrapped in the DRM system's format
- The DRM trusted execution environment (TEE) on the device decrypts the license and uses the CEK to decrypt segments — the key never leaves the TEE in plaintext
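The server side of this flow can be illustrated with a deliberately simplified simulation. Everything here is hypothetical: the in-memory `KEY_STORE` stands in for a KMS, the session fields are invented for the example, and the returned dictionary is not any DRM system's license format. A real license server would wrap the key for the requesting device using the Widevine, FairPlay, or PlayReady SDK rather than return it directly.

```python
import secrets
import time
import uuid

# Stand-in for a KMS, for illustration only; real keys never sit in process memory like this.
KEY_STORE = {}


def create_content_key() -> str:
    """Generate a 128-bit content encryption key and return its key ID."""
    key_id = str(uuid.uuid4())
    KEY_STORE[key_id] = secrets.token_bytes(16)
    return key_id


def issue_license(key_id: str, session: dict, now: float = None) -> dict:
    """Authorize the session, then release the key for wrapping into a license."""
    now = time.time() if now is None else now
    if not session.get("subscription_active"):
        raise PermissionError("no active subscription")
    if session.get("token_expires_at", 0) <= now:
        raise PermissionError("session token expired")
    # A real server would wrap this key in the DRM-specific license format here.
    return {"key_id": key_id, "content_key": KEY_STORE[key_id], "issued_at": now}


key_id = create_content_key()
valid_session = {"subscription_active": True, "token_expires_at": time.time() + 3600}
license_response = issue_license(key_id, valid_session)
print(license_response["key_id"], len(license_response["content_key"]), "byte key")
```

The point of the sketch is the ordering: authorization checks run before the key is ever read from the store.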
CDN Optimization
Video segments are static files after packaging — ideal CDN objects:
- Set long cache TTLs (e.g., 1 year) on segment files, which are immutable: a given segment URL always returns the same bytes
- Set short TTLs on manifest files (m3u8/MPD) — these change as new content is added to live streams
- Deploy edge PoPs in viewers' regions to minimize segment download latency
- Origin shield (mid-tier CDN cache) sits between edge and origin S3 — popular segments are served from the shield, protecting origin from thundering herd on popular releases
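The TTL policy above could be expressed as a small helper that picks a `Cache-Control` header from the file extension. The 5-second manifest TTL, the extension sets, and the `no-store` fallback are assumptions chosen for illustration; the one-year value matches the example in the list.

```python
import os

MANIFEST_EXTENSIONS = {".m3u8", ".mpd"}
SEGMENT_EXTENSIONS = {".ts", ".m4s", ".mp4"}


def cache_control(path: str) -> str:
    """Choose a Cache-Control header value for an object served through the CDN."""
    extension = os.path.splitext(path)[1].lower()
    if extension in MANIFEST_EXTENSIONS:
        return "public, max-age=5"  # assumed short TTL for manifests
    if extension in SEGMENT_EXTENSIONS:
        return "public, max-age=31536000, immutable"  # one year, never revalidated
    return "no-store"  # assumed conservative default for anything else


for example in ("movie/720p/seg_00042.m4s", "movie/master.m3u8", "movie/manifest.mpd"):
    print(example, "->", cache_control(example))
```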
Thumbnail Sprites, Offline Download, and Analytics
Thumbnail sprites: during transcoding, extract one frame per 10 seconds at low resolution. Pack frames into a sprite sheet image. Generate a WebVTT file mapping time codes to x/y positions within the sprite. The player uses this for scrubbing preview thumbnails without downloading video segments.
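A minimal sketch of the WebVTT generation step follows. It assumes thumbnails of 160x90 pixels packed 10 per row into a file named `sprite.jpg`, and uses the common `#xywh=` media-fragment convention to address each tile; these values are illustrative, not taken from the pipeline described.

```python
def vtt_timestamp(seconds: int) -> str:
    """Format whole seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.000"


def sprite_vtt(duration: int, interval: int = 10, columns: int = 10,
               thumb_w: int = 160, thumb_h: int = 90,
               sprite_url: str = "sprite.jpg") -> str:
    """Map each time interval to the x/y position of its thumbnail in the sprite."""
    lines = ["WEBVTT", ""]
    for index, start in enumerate(range(0, duration, interval)):
        end = min(start + interval, duration)
        x = (index % columns) * thumb_w   # column within the sprite sheet
        y = (index // columns) * thumb_h  # row within the sprite sheet
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(f"{sprite_url}#xywh={x},{y},{thumb_w},{thumb_h}")
        lines.append("")
    return "\n".join(lines)


vtt = sprite_vtt(125)
print(vtt)
```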
Offline download: DRM-protected download stores encrypted segments locally. The license is bound to the specific device's DRM identity and carries an expiry time (e.g., 30 days, or 48 hours after first play). This prevents license sharing across devices.
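The expiry rule in the example (30 days to start watching, 48 hours once playback begins) can be sketched as follows. The function and field names are hypothetical, and on a real device this check is enforced inside the DRM client rather than in application code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STORAGE_WINDOW = timedelta(days=30)    # time allowed to start watching
PLAYBACK_WINDOW = timedelta(hours=48)  # time allowed after first play


def license_expiry(downloaded_at: datetime,
                   first_played_at: Optional[datetime] = None) -> datetime:
    """Return the moment the offline license stops being valid."""
    expiry = downloaded_at + STORAGE_WINDOW
    if first_played_at is not None:
        # Whichever limit is reached first applies.
        expiry = min(expiry, first_played_at + PLAYBACK_WINDOW)
    return expiry


def can_play(license_device_id: str, device_id: str, downloaded_at: datetime,
             first_played_at: Optional[datetime], now: datetime) -> bool:
    """Allow playback only on the bound device and before the license expires."""
    if license_device_id != device_id:
        return False
    return now < license_expiry(downloaded_at, first_played_at)


downloaded = datetime(2025, 1, 1, tzinfo=timezone.utc)
first_play = datetime(2025, 1, 10, tzinfo=timezone.utc)
print(license_expiry(downloaded))              # 30 days after download
print(license_expiry(downloaded, first_play))  # 48 hours after first play
```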
Playback analytics (QoE monitoring): the player SDK emits telemetry events — startup time, buffering events, bitrate switches, playback errors — to a data pipeline. These are aggregated to compute Quality of Experience (QoE) scores per CDN, per ISP, per device type, and per region, driving CDN routing and infrastructure decisions.
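As an illustration of the aggregation step, the sketch below groups per-session telemetry by one dimension and computes average startup time and rebuffer ratio. The session field names and sample values are made up for the example, and these two metrics are only possible inputs to a QoE score, not a definition of one.

```python
from collections import defaultdict


def aggregate_by(sessions: list, dimension: str) -> dict:
    """Aggregate session telemetry by a dimension such as 'cdn', 'isp', or 'device'."""
    totals = defaultdict(lambda: {"sessions": 0, "startup_ms": 0.0,
                                  "rebuffer_s": 0.0, "watch_s": 0.0})
    for session in sessions:
        bucket = totals[session[dimension]]
        bucket["sessions"] += 1
        bucket["startup_ms"] += session["startup_ms"]
        bucket["rebuffer_s"] += session["rebuffer_s"]
        bucket["watch_s"] += session["watch_s"]
    return {
        key: {
            "avg_startup_ms": bucket["startup_ms"] / bucket["sessions"],
            # Share of watch time spent stalled; 0 when nothing was watched.
            "rebuffer_ratio": (bucket["rebuffer_s"] / bucket["watch_s"]
                               if bucket["watch_s"] else 0.0),
        }
        for key, bucket in totals.items()
    }


# Hypothetical telemetry records emitted by the player SDK.
sample_sessions = [
    {"cdn": "cdn-a", "startup_ms": 800, "rebuffer_s": 2.0, "watch_s": 100.0},
    {"cdn": "cdn-a", "startup_ms": 1200, "rebuffer_s": 0.0, "watch_s": 300.0},
    {"cdn": "cdn-b", "startup_ms": 2000, "rebuffer_s": 10.0, "watch_s": 100.0},
]
by_cdn = aggregate_by(sample_sessions, "cdn")
print(by_cdn)
```

The same function can be called with a different dimension key to compare ISPs, device types, or regions.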