Video Transcoding Pipeline Low-Level Design

A video transcoding pipeline converts uploaded video files into multiple formats and resolutions for adaptive bitrate streaming. It must handle large file uploads, parallel encoding, and delivery at scale. This design question comes up in interviews at YouTube, Netflix, TikTok, and other video platforms.

End-to-End Pipeline

Upload → S3 (raw) → Transcoding Job Queue → Workers (FFmpeg)
       → S3 (HLS segments) → CDN → Player (adaptive bitrate)

Upload and Job Creation

TranscodingJob
  id              BIGSERIAL PK
  user_id         BIGINT NOT NULL
  title           TEXT NOT NULL
  raw_s3_key      TEXT NOT NULL      -- uploaded original
  status          TEXT DEFAULT 'queued'  -- queued, processing, complete, failed
  progress_pct    INT DEFAULT 0
  created_at      TIMESTAMPTZ
  completed_at    TIMESTAMPTZ

TranscodingVariant
  id              BIGSERIAL PK
  job_id          BIGINT FK NOT NULL
  resolution      TEXT NOT NULL      -- '360p', '720p', '1080p', '4k'
  height          INT NOT NULL       -- target frame height in pixels, read by the worker's scale filter
  bitrate_kbps    INT NOT NULL
  s3_key_prefix   TEXT               -- where HLS segments are stored
  status          TEXT DEFAULT 'pending'

import json

def create_transcoding_job(user_id, title, raw_s3_key):
    job = db.insert(TranscodingJob, {
        'user_id': user_id,
        'title': title,
        'raw_s3_key': raw_s3_key,
    })

    # Queue variants based on source resolution
    variants = determine_variants(raw_s3_key)
    for v in variants:
        db.insert(TranscodingVariant, {'job_id': job.id, **v})

    sqs.send_message(QueueUrl=TRANSCODE_QUEUE, MessageBody=json.dumps({
        'job_id': job.id
    }))
    return job

def determine_variants(s3_key):
    # Probe source resolution; ffprobe() is a helper (not shown) that wraps the ffprobe CLI
    probe = ffprobe(s3_key)
    source_height = probe['height']
    # Only generate variants at or below source resolution
    all_variants = [
        {'resolution': '360p',  'bitrate_kbps': 800,  'height': 360},
        {'resolution': '720p',  'bitrate_kbps': 2500, 'height': 720},
        {'resolution': '1080p', 'bitrate_kbps': 5000, 'height': 1080},
        {'resolution': '4k',    'bitrate_kbps': 15000,'height': 2160},
    ]
    return [v for v in all_variants if v['height'] <= source_height]
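
The ffprobe helper called above is not defined in this design. A minimal sketch of what it might look like follows, assuming the source is available at a local path (or a URL that ffprobe can read) and that the ffprobe binary is on PATH. The names probe_video and parse_probe_output are hypothetical.

```python
import json
import subprocess


def parse_probe_output(raw_json):
    """Extract width/height of the first video stream from ffprobe JSON output."""
    data = json.loads(raw_json)
    streams = data.get('streams', [])
    if not streams:
        raise ValueError('no video stream found in source')
    stream = streams[0]
    return {'width': int(stream['width']), 'height': int(stream['height'])}


def probe_video(path):
    """Run ffprobe on a file (assumes ffprobe is installed and on PATH)."""
    result = subprocess.run(
        [
            'ffprobe', '-v', 'error',
            '-select_streams', 'v:0',               # first video stream only
            '-show_entries', 'stream=width,height',
            '-of', 'json',
            path,
        ],
        capture_output=True, text=True, check=True,
    )
    return parse_probe_output(result.stdout)
```

Keeping the JSON parsing separate from the subprocess call makes the parsing testable without FFmpeg installed.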

Transcoding Worker (FFmpeg)

import os
import subprocess
import tempfile

def transcode_job(job_id):
    job = db.get(TranscodingJob, job_id)
    variants = db.query("SELECT * FROM TranscodingVariant WHERE job_id=%(id)s", {'id': job_id})

    db.execute("UPDATE TranscodingJob SET status='processing' WHERE id=%(id)s", {'id': job_id})

    # Download source from S3 to local temp file
    with tempfile.NamedTemporaryFile(suffix='.mp4') as src_file:
        s3.download_fileobj('media-bucket', job.raw_s3_key, src_file)
        src_file.flush()  # ensure all bytes are on disk before FFmpeg opens the path
        src_path = src_file.name

        for variant in variants:
            output_dir = f'/tmp/{job_id}/{variant.resolution}'
            os.makedirs(output_dir, exist_ok=True)

            # FFmpeg: transcode to HLS
            subprocess.run([
                'ffmpeg', '-i', src_path,
                '-vf', f'scale=-2:{variant.height}',
                '-c:v', 'libx264', '-preset', 'fast',
                '-b:v', f'{variant.bitrate_kbps}k',
                '-c:a', 'aac', '-b:a', '128k',
                '-hls_time', '6',           # 6-second segments
                '-hls_playlist_type', 'vod',
                '-hls_segment_filename', f'{output_dir}/seg%03d.ts',
                f'{output_dir}/playlist.m3u8'
            ], check=True)

            # Upload segments to S3
            prefix = f'videos/{job_id}/{variant.resolution}/'
            upload_directory(output_dir, 'media-bucket', prefix)

            db.execute("""
                UPDATE TranscodingVariant SET status='complete', s3_key_prefix=%(pfx)s
                WHERE id=%(id)s
            """, {'pfx': prefix, 'id': variant.id})

    # Generate master playlist
    generate_master_playlist(job_id, variants)
    db.execute("UPDATE TranscodingJob SET status='complete', completed_at=NOW(), progress_pct=100 WHERE id=%(id)s", {'id': job_id})
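
The worker loop above encodes variants one after another. One way to run them concurrently on a single machine is sketched below, under the assumption that each encode is an external FFmpeg process, so threads are enough to overlap them. Here encode_variant is a hypothetical callable that wraps the FFmpeg call and upload for one variant, and max_workers is an arbitrary default.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def encode_all_variants(variants, encode_variant, max_workers=4):
    """Run encode_variant(variant) for every variant concurrently.

    Returns (completed, failed): completed lists the resolutions that
    succeeded; failed maps resolution -> exception for those that did not.
    """
    completed, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(encode_variant, v): v for v in variants}
        for future in as_completed(futures):
            variant = futures[future]
            try:
                future.result()
                completed.append(variant['resolution'])
            except Exception as exc:  # record the failure, let other encodes finish
                failed[variant['resolution']] = exc
    return completed, failed
```

With this shape, one failed variant does not abort the others, and the caller can decide whether to retry it or mark the job as failed. To spread work across machines instead, a common alternative is to enqueue one message per variant rather than one per job.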

Master HLS Playlist

# master.m3u8 tells the player which quality variants are available
def generate_master_playlist(job_id, variants):
    lines = ['#EXTM3U', '#EXT-X-VERSION:3']
    for v in variants:
        cdn_prefix = f'https://cdn.example.com/videos/{job_id}/{v.resolution}/'
        lines.append(
            f'#EXT-X-STREAM-INF:BANDWIDTH={v.bitrate_kbps * 1000},'
            f'RESOLUTION={get_resolution_string(v.resolution)}'
        )
        lines.append(f'{cdn_prefix}playlist.m3u8')

    master_content = '\n'.join(lines) + '\n'
    s3.put_object(
        Bucket='media-bucket',
        Key=f'videos/{job_id}/master.m3u8',
        Body=master_content.encode(),
        ContentType='application/vnd.apple.mpegurl',
    )
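
The get_resolution_string helper is referenced above but not defined. A possible sketch follows, assuming every variant is encoded at a 16:9 frame size. The mapping table and the build_master_playlist function are illustrative, not part of the original design. Sources with other aspect ratios would need the actual encoded width, for example from ffprobe.

```python
# Assumed 16:9 frame sizes for each label; not taken from the original design.
RESOLUTION_STRINGS = {
    '360p': '640x360',
    '720p': '1280x720',
    '1080p': '1920x1080',
    '4k': '3840x2160',
}


def get_resolution_string(label):
    """Map a variant label such as '720p' to the WIDTHxHEIGHT form used in HLS."""
    try:
        return RESOLUTION_STRINGS[label]
    except KeyError:
        raise ValueError(f'unknown resolution label: {label!r}')


def build_master_playlist(variants, base_url):
    """Return master.m3u8 text for (label, bitrate_kbps) pairs, lowest bitrate first."""
    lines = ['#EXTM3U', '#EXT-X-VERSION:3']
    for label, bitrate_kbps in sorted(variants, key=lambda v: v[1]):
        lines.append(
            f'#EXT-X-STREAM-INF:BANDWIDTH={bitrate_kbps * 1000},'
            f'RESOLUTION={get_resolution_string(label)}'
        )
        lines.append(f'{base_url}{label}/playlist.m3u8')
    return '\n'.join(lines) + '\n'
```

Building the playlist as a pure string function keeps it separate from the S3 upload, so the output can be inspected or unit-tested before anything is written to the bucket.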

Adaptive Bitrate Playback

# Client player (HLS.js, Video.js, native iOS/Android player):
# 1. Fetches master.m3u8 — learns available quality levels
# 2. Typically starts at a low quality for fast start (the exact start level is player-specific)
# 3. Measures download speed of each 6-second segment
# 4. Switches to higher quality if bandwidth allows; drops quality on congestion
# 5. Caps the forward buffer (about 30 seconds is a common default; configurable per player)

# Player URL served to client:
def get_player_url(job_id):
    return f'https://cdn.example.com/videos/{job_id}/master.m3u8'
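
Steps 3 and 4 can be illustrated with a simplified throughput-based rule. This is only a sketch: real players such as HLS.js use more elaborate heuristics that also consider buffer level. The function names and the 0.8 safety factor are assumptions.

```python
def throughput_bps(segment_bytes, download_seconds):
    """Estimate throughput in bits per second from one segment download."""
    return segment_bytes * 8 / download_seconds


def pick_variant(bandwidths_bps, measured_bps, safety_factor=0.8):
    """Pick the highest variant bandwidth that fits the measured throughput.

    bandwidths_bps: BANDWIDTH values from master.m3u8, in bits per second.
    measured_bps: recent segment download throughput, in bits per second.
    Falls back to the lowest variant when nothing fits the budget.
    """
    if not bandwidths_bps:
        raise ValueError('no variants available')
    budget = measured_bps * safety_factor  # leave headroom for throughput dips
    affordable = [b for b in bandwidths_bps if b <= budget]
    return max(affordable) if affordable else min(bandwidths_bps)
```

With the 800 / 2500 / 5000 kbps ladder from this design, a measured 4 Mbps connection gives a budget of 3.2 Mbps, so the rule selects the 2500 kbps (720p) variant.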

Key Interview Points

  • HLS segments enable adaptive bitrate: Splitting video into 6-second .ts segments allows the player to switch quality mid-stream between segments. A 2-hour movie is ~1200 segments per resolution level.
  • Parallel variant encoding: Encode all variants in parallel across multiple workers. A single 1080p encode takes ~2-5 minutes for a 10-minute video on a modern CPU; encoding 360p, 720p, and 1080p in parallel takes the same wall time as the slowest variant.
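  • CDN is mandatory for video: A 10-minute 1080p video at 5 Mbps is about 375 MB. Streaming it to 100 concurrent viewers takes roughly 500 Mbps of sustained egress (100 × 5 Mbps) and 37.5 GB of total transfer if every viewer watches to the end. Popular videos reach far larger audiences, so a CDN serves nearly all of this traffic from edge caches instead of the origin.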
  • Store raw video indefinitely: Codecs improve over time (AV1 can be roughly 40% more bitrate-efficient than H.264, depending on content and encoder settings). Keep the original upload so you can re-transcode later without asking users to re-upload.
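
The arithmetic behind these points can be checked with a few lines of code. This is a back-of-the-envelope sketch that assumes constant bitrate and decimal units (1 MB = 10^6 bytes); the function names are hypothetical.

```python
def segment_count(duration_s, segment_s=6):
    """Number of HLS segments for a video, rounding a partial last segment up."""
    return -(-duration_s // segment_s)  # ceiling division on integers


def file_size_mb(bitrate_mbps, duration_s):
    """Approximate size in megabytes of a stream at a constant bitrate."""
    return bitrate_mbps * duration_s / 8  # megabits -> megabytes


def aggregate_egress_mbps(bitrate_mbps, viewers):
    """Sustained bandwidth needed to serve `viewers` concurrent streams."""
    return bitrate_mbps * viewers


print(segment_count(2 * 60 * 60))              # 1200 segments for a 2-hour movie
print(file_size_mb(5, 10 * 60))                # 375.0 MB for 10 minutes at 5 Mbps
print(aggregate_egress_mbps(5, 100))           # 500 Mbps for 100 concurrent viewers
print(file_size_mb(5, 10 * 60) * 100 / 1000)   # 37.5 GB total for 100 full views
```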
