Video Transcoding Pipeline — Low-Level Design
A video transcoding pipeline converts uploaded video files into multiple formats and resolutions for adaptive bitrate streaming. It must handle large file uploads, parallel encoding, and serving at scale. This design comes up in interviews at YouTube, Netflix, TikTok, and any other video platform.
End-to-End Pipeline
Upload → S3 (raw) → Transcoding Job Queue → Workers (FFmpeg)
→ S3 (HLS segments) → CDN → Player (adaptive bitrate)
Upload and Job Creation
TranscodingJob
id BIGSERIAL PK
user_id BIGINT NOT NULL
title TEXT NOT NULL
raw_s3_key TEXT NOT NULL -- uploaded original
status TEXT DEFAULT 'queued' -- queued, processing, complete, failed
progress_pct INT DEFAULT 0
created_at TIMESTAMPTZ
completed_at TIMESTAMPTZ
TranscodingVariant
id BIGSERIAL PK
job_id BIGINT FK NOT NULL
resolution TEXT NOT NULL -- '360p', '720p', '1080p', '4k'
height INT NOT NULL -- 360, 720, 1080, 2160; feeds the FFmpeg scale filter
bitrate_kbps INT NOT NULL
s3_key_prefix TEXT -- where HLS segments are stored
status TEXT DEFAULT 'pending'
def create_transcoding_job(user_id, title, raw_s3_key):
    job = db.insert(TranscodingJob, {
        'user_id': user_id,
        'title': title,
        'raw_s3_key': raw_s3_key,
    })
    # Queue variants based on source resolution
    variants = determine_variants(raw_s3_key)
    for v in variants:
        db.insert(TranscodingVariant, {'job_id': job.id, **v})
    sqs.send_message(QueueUrl=TRANSCODE_QUEUE, MessageBody=json.dumps({
        'job_id': job.id,
    }))
    return job
def determine_variants(s3_key):
    # Probe source resolution with ffprobe
    probe = ffprobe(s3_key)
    source_height = probe['height']
    # Only generate variants at or below source resolution
    all_variants = [
        {'resolution': '360p',  'bitrate_kbps': 800,   'height': 360},
        {'resolution': '720p',  'bitrate_kbps': 2500,  'height': 720},
        {'resolution': '1080p', 'bitrate_kbps': 5000,  'height': 1080},
        {'resolution': '4k',    'bitrate_kbps': 15000, 'height': 2160},
    ]
    return [v for v in all_variants if v['height'] <= source_height]
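determine_variants assumes an ffprobe helper that returns the source resolution. One way to sketch it, with the caveat that the S3 key must first be turned into something ffprobe can read (a presigned URL or a downloaded file; that step and the parse_probe_output split are assumptions, not part of the original design), is to shell out to ffprobe and parse its JSON output:

```python
import json
import subprocess

def parse_probe_output(stdout):
    """Pull width/height of the first video stream out of ffprobe's JSON output."""
    stream = json.loads(stdout)['streams'][0]
    return {'width': stream['width'], 'height': stream['height']}

def ffprobe(url):
    """Probe a readable video URL (e.g. a presigned S3 URL) for its resolution."""
    result = subprocess.run(
        ['ffprobe', '-v', 'error',
         '-select_streams', 'v:0',                  # first video stream only
         '-show_entries', 'stream=width,height',
         '-of', 'json', url],
        capture_output=True, text=True, check=True,
    )
    return parse_probe_output(result.stdout)
```

Probing over a presigned URL avoids downloading the whole file just to read its header.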
Transcoding Worker (FFmpeg)
def transcode_job(job_id):
    job = db.get(TranscodingJob, job_id)
    variants = db.query("SELECT * FROM TranscodingVariant WHERE job_id=%(id)s", {'id': job_id})
    db.execute("UPDATE TranscodingJob SET status='processing' WHERE id=%(id)s", {'id': job_id})

    # Download source from S3 to a local temp file; keep the file open so it
    # survives until every variant has been encoded
    with tempfile.NamedTemporaryFile(suffix='.mp4') as src_file:
        s3.download_fileobj('media-bucket', job.raw_s3_key, src_file)
        src_file.flush()
        src_path = src_file.name

        for variant in variants:
            output_dir = f'/tmp/{job_id}/{variant.resolution}'
            os.makedirs(output_dir, exist_ok=True)

            # FFmpeg: transcode to HLS
            subprocess.run([
                'ffmpeg', '-i', src_path,
                '-vf', f'scale=-2:{variant.height}',  # -2 keeps width even
                '-c:v', 'libx264', '-preset', 'fast',
                '-b:v', f'{variant.bitrate_kbps}k',
                '-c:a', 'aac', '-b:a', '128k',
                '-hls_time', '6',                     # 6-second segments
                '-hls_playlist_type', 'vod',
                '-hls_segment_filename', f'{output_dir}/seg%03d.ts',
                f'{output_dir}/playlist.m3u8'
            ], check=True)

            # Upload segments to S3
            prefix = f'videos/{job_id}/{variant.resolution}/'
            upload_directory(output_dir, 'media-bucket', prefix)
            db.execute("""
                UPDATE TranscodingVariant SET status='complete', s3_key_prefix=%(pfx)s
                WHERE id=%(id)s
            """, {'pfx': prefix, 'id': variant.id})

    # Generate master playlist
    generate_master_playlist(job_id, variants)
    db.execute("UPDATE TranscodingJob SET status='complete', completed_at=NOW(), progress_pct=100 WHERE id=%(id)s", {'id': job_id})
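The schema's progress_pct column can be fed by FFmpeg itself: adding `-progress pipe:1` to the command makes FFmpeg emit key=value progress lines on stdout. A minimal parsing sketch (parse_progress_line is a hypothetical helper; note that out_time_ms is, despite the name, reported in microseconds):

```python
def parse_progress_line(line, total_duration_sec):
    """Turn one FFmpeg '-progress' key=value line into a percentage, or None.

    FFmpeg emits lines like 'out_time_ms=300000000' where the value is in
    microseconds despite the '_ms' suffix.
    """
    if '=' not in line:
        return None
    key, _, value = line.strip().partition('=')
    if key != 'out_time_ms':
        return None
    elapsed_sec = int(value) / 1_000_000
    return min(100, int(elapsed_sec / total_duration_sec * 100))
```

The worker would read the pipe line by line and periodically write the result to TranscodingJob.progress_pct so clients polling the job see a live percentage.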
Master HLS Playlist
# master.m3u8 tells the player which quality variants are available
def generate_master_playlist(job_id, variants):
    lines = ['#EXTM3U', '#EXT-X-VERSION:3']
    for v in variants:
        cdn_prefix = f'https://cdn.example.com/videos/{job_id}/{v.resolution}/'
        lines.append(
            f'#EXT-X-STREAM-INF:BANDWIDTH={v.bitrate_kbps * 1000},'
            f'RESOLUTION={get_resolution_string(v.resolution)}'
        )
        lines.append(f'{cdn_prefix}playlist.m3u8')
    master_content = '\n'.join(lines) + '\n'
    s3.put_object(
        Bucket='media-bucket',
        Key=f'videos/{job_id}/master.m3u8',
        Body=master_content.encode(),
        ContentType='application/vnd.apple.mpegurl',
    )
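generate_master_playlist relies on a get_resolution_string helper that isn't shown. A minimal version, mapping the variant names used above to the WIDTHxHEIGHT strings the RESOLUTION attribute expects, might look like this (the widths assume 16:9 content; a real pipeline would compute width from the probed aspect ratio):

```python
# Assumes 16:9 sources; derive width from the probed aspect ratio in production.
RESOLUTION_STRINGS = {
    '360p': '640x360',
    '720p': '1280x720',
    '1080p': '1920x1080',
    '4k': '3840x2160',
}

def get_resolution_string(resolution):
    return RESOLUTION_STRINGS[resolution]
```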
Adaptive Bitrate Playback
# Client player (HLS.js, Video.js, native iOS/Android player):
# 1. Fetches master.m3u8 — learns available quality levels
# 2. Starts with the lowest quality for fast start
# 3. Measures download speed of each 6-second segment
# 4. Switches to higher quality if bandwidth allows; drops quality on congestion
# 5. Never buffers more than 30 seconds ahead
# Player URL served to client:
def get_player_url(job_id):
    return f'https://cdn.example.com/videos/{job_id}/master.m3u8'
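The switching behavior in step 4 above can be sketched as picking the highest-bitrate variant that fits within a safety margin of the measured throughput. This is a simplification: the 0.8 factor is an assumption, and real players (HLS.js, AVPlayer) use smoothed bandwidth estimators and buffer-level heuristics.

```python
def select_variant(variants, measured_kbps, safety=0.8):
    """Pick the highest-bitrate variant affordable at the measured bandwidth."""
    affordable = [v for v in variants if v['bitrate_kbps'] <= measured_kbps * safety]
    if not affordable:
        # Nothing fits: fall back to the lowest quality rather than stalling
        return min(variants, key=lambda v: v['bitrate_kbps'])
    return max(affordable, key=lambda v: v['bitrate_kbps'])
```

With the bitrate ladder from determine_variants, a measured 4000 kbps selects 720p: 4000 x 0.8 = 3200 kbps covers the 2500 kbps variant but not the 5000 kbps one.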
Key Interview Points
- HLS segments enable adaptive bitrate: Splitting video into 6-second .ts segments allows the player to switch quality mid-stream between segments. A 2-hour movie is ~1200 segments per resolution level.
- Parallel variant encoding: Encode all variants in parallel across multiple workers. A single 1080p encode takes ~2-5 minutes for a 10-minute video on a modern CPU; encoding 360p, 720p, and 1080p in parallel takes the same wall time as the slowest variant.
- CDN is mandatory for video: A 1080p 10-minute video at a 5 Mbps bitrate is ~375 MB. Serving even 100 concurrent viewers from origin requires a sustained 500 Mbps of egress; at 100,000 concurrent viewers that becomes ~500 Gbps, which no origin cluster serves economically. CDNs offload >99% of this traffic to edge nodes.
- Store raw video indefinitely: Transcoding codecs improve over time (AV1 is roughly 30-50% more bitrate-efficient than H.264 at the same quality). Keep the original upload so you can re-transcode later without asking users to re-upload.
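The parallel-encoding point above can be sketched with a thread pool dispatching one FFmpeg invocation per variant; threads suffice because each call just blocks on an external FFmpeg process. Here encode_fn is a stand-in for a wrapper around the FFmpeg command shown earlier, not a function from the original design:

```python
from concurrent.futures import ThreadPoolExecutor

def encode_variants_in_parallel(encode_fn, src_path, variants, max_workers=4):
    """Run encode_fn(src_path, variant) for every variant concurrently.

    Wall time approaches that of the slowest variant rather than the sum,
    at the cost of one CPU-heavy FFmpeg process per in-flight variant.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(encode_fn, src_path, v) for v in variants]
        return [f.result() for f in futures]  # preserves submission order
```

Bounding max_workers matters: launching one FFmpeg process per variant for many jobs at once is exactly the OOM scenario the job queue exists to prevent.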
Frequently Asked Questions

Why must video processing be asynchronous and queue-based rather than synchronous?
Transcoding a 1-hour 4K video to multiple HLS variants takes 15-45 minutes of CPU time. A synchronous HTTP request would time out (typical limit: 30-60 seconds). The upload endpoint must return immediately with a job_id (202 Accepted), and the client polls or uses a webhook to check status. The job queue also provides backpressure: if 100 videos are uploaded simultaneously and you have 10 worker instances, the queue absorbs the spike; workers process 10 videos at a time while the other 90 wait. Without a queue, a traffic spike would either crash the workers (OOM from 100 concurrent FFmpeg processes) or force over-provisioning for peak load.

What is HLS adaptive bitrate streaming and why is it the standard format?
HLS (HTTP Live Streaming) splits a video into small segments (2-6 seconds each) encoded at multiple bitrates, e.g. 1080p/8Mbps, 720p/4Mbps, 480p/2Mbps, 360p/1Mbps. A master playlist (.m3u8) lists all available variants; each variant has its own playlist listing its segments. The player starts at a low bitrate, monitors download speed, and automatically switches to a higher bitrate when bandwidth allows. This eliminates buffering: if a user's connection drops from 10 Mbps to 2 Mbps, the player switches to the 480p variant seamlessly. HLS is supported natively by all modern browsers, iOS, Android, and smart TVs. Alternative: DASH (similar concept, different format); HLS has broader native support.

How do you generate multiple quality variants in parallel?
A single FFmpeg command can produce multiple outputs from one input, e.g. ffmpeg -i input.mp4 -c:v libx264 -b:v 4000k output_720p.mp4 -c:v libx264 -b:v 1000k output_360p.mp4, which decodes the input once while encoding both renditions. For very large files, split the job: one worker handles 1080p and 720p (CPU-intensive, GPU-accelerated if available), another handles 480p and 360p. Hardware-accelerated encoders (e.g. -c:v h264_nvenc on NVIDIA GPUs) can reduce transcoding time by 5-10x. Distribute variant jobs to a worker pool; track completion per variant in the DB and write the master playlist only when all variants complete.

How do you track transcoding progress and notify the client?
FFmpeg writes progress to stderr in the form frame=1234 fps=24 bitrate=4000kb/s time=00:01:23. Parse this in real time by reading the subprocess stderr line by line, and after each progress line update the job row: UPDATE TranscodingJob SET progress_pct=45 WHERE id=:id. For client notification: (1) Polling: the client fetches GET /jobs/{id} every 5 seconds and checks the progress field; simple, works for any client. (2) WebSocket: the client subscribes to job updates and the server pushes progress events as they arrive; better UX for upload flows. (3) Webhook: on completion, POST to a pre-registered callback URL; best for server-to-server integrations. Implement polling first; add WebSocket only if the UX requires real-time progress bars.

How do you handle transcoding failures and partial outputs?
Mark the job as failed with an error message built from FFmpeg's exit code and stderr. Implement retry logic with a limit (max 3 attempts) and exponential backoff. Before retrying, check whether the source file is still in S3 (it may have been garbage-collected). For partial outputs (transcoder crashed mid-job): detect them by checking whether all expected variant playlists exist in S3 and all segments are present, and clean up partial S3 objects before retrying to avoid serving corrupt HLS playlists. Add a dead-letter queue: after 3 failed attempts, move the job to a DLQ for manual inspection. Alert on DLQ depth > 0, since transcoding failures are often caused by corrupt source files that need human review.
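The retry policy described in the last answer can be sketched as follows; MAX_ATTEMPTS, the base delay, and the run_with_retries name are assumptions for illustration:

```python
import time

MAX_ATTEMPTS = 3  # assumed limit; tune per workload

def run_with_retries(job_fn, job_id, base_delay_sec=30):
    """Retry a transcoding job with exponential backoff (30s, 60s, ...).

    On the final failure the exception propagates so the caller can move
    the job to the dead-letter queue for manual inspection.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return job_fn(job_id)
        except Exception:
            if attempt == MAX_ATTEMPTS:
                raise
            time.sleep(base_delay_sec * 2 ** (attempt - 1))
```

Before each retry a real worker would also re-check that the raw S3 object still exists and delete any partial variant output from the previous attempt.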