Low Level Design: File Sharing Service

Overview

A file sharing service lets users upload files, generate shareable links, control access, and manage storage quotas. The design must handle large files reliably, enforce security at the link and permission layers, and keep storage costs predictable through quota enforcement.

File Metadata Schema

The File table tracks metadata; actual bytes live in object storage (S3-compatible). Key fields:

  • id — UUID primary key
  • owner_id — foreign key to users
  • filename — original filename as uploaded
  • content_type — MIME type (e.g. application/pdf)
  • size — file size in bytes
  • storage_path — object storage key (e.g. uploads/2026/04/uuid)
  • checksum — SHA-256 of file content for integrity verification
  • scan_status — ENUM('pending','clean','infected','error')
  • quota_charged_at — timestamp when the file size was counted against the owner quota
  • deleted_at — soft delete; file moves to trash, storage freed after retention period
  • created_at / updated_at
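The fields above can be sketched as a Python dataclass; this is a minimal in-memory model of the File row, not an ORM mapping, and the class and field names simply mirror the schema:

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional

class ScanStatus(str, Enum):
    PENDING = "pending"
    CLEAN = "clean"
    INFECTED = "infected"
    ERROR = "error"

@dataclass
class FileRecord:
    owner_id: str
    filename: str          # original filename as uploaded
    content_type: str      # MIME type, e.g. "application/pdf"
    size: int              # file size in bytes
    storage_path: str      # object storage key
    checksum: str          # hex-encoded SHA-256 of file content
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    scan_status: ScanStatus = ScanStatus.PENDING
    quota_charged_at: Optional[datetime] = None
    deleted_at: Optional[datetime] = None   # soft delete marker
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
```

New uploads default to scan_status = pending, matching the virus-scanning flow below.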

Access Link Generation

Generate a signed share link with these components:

  • Token: a random 128-bit URL-safe token stored in a ShareLink table (token, file_id, created_by, expires_at, password_hash, download_limit, download_count)
  • HMAC signature: optionally sign the URL so the server can verify integrity without a DB lookup for high-traffic links
  • Expiry: expires_at enforced server-side; expired links return 410 Gone
  • Password protection: store bcrypt hash of the optional link password; require the password in a pre-download form

Link format example: https://share.example.com/f/{token}. On access, resolve token to file_id, check expiry, check password if set, check scan_status (block infected files), then redirect to a short-lived pre-signed object storage URL.
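A minimal sketch of token generation and HMAC verification, assuming a server-side secret loaded from configuration (the constant below is a placeholder) and Unix-timestamp expiries:

```python
import hashlib
import hmac
import secrets
import time

SECRET_KEY = b"server-side-secret"  # assumption: injected from config in practice

def new_share_token() -> str:
    # 128-bit URL-safe random token, stored in the ShareLink table
    return secrets.token_urlsafe(16)

def sign_link(token: str, expires_at: int) -> str:
    # HMAC over token + expiry lets high-traffic links be verified
    # without a DB lookup
    msg = f"{token}.{expires_at}".encode()
    sig = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return f"https://share.example.com/f/{token}?e={expires_at}&sig={sig}"

def verify_link(token: str, expires_at: int, sig: str) -> bool:
    msg = f"{token}.{expires_at}".encode()
    expected = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    # constant-time comparison, then server-side expiry check
    return hmac.compare_digest(expected, sig) and time.time() < expires_at
```

Note the use of `hmac.compare_digest` rather than `==` to avoid timing side channels on signature comparison.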

Permission Model

Three access levels:

  • private — only the owner can access
  • anyone-with-link — valid share link token grants access
  • specific-users — a FilePermission table (file_id, user_id, role) where role is ENUM('viewer','editor')

Editors can upload new versions; viewers can only download. The owner can change permissions at any time. Revoking a share link marks it as revoked in the ShareLink table — the token becomes invalid immediately without needing to rotate URLs.
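The access decision can be sketched as a single check; the dicts below are hypothetical in-memory stand-ins for the File, FilePermission, and ShareLink tables:

```python
from typing import Optional

def can_download(file: dict, user_id: Optional[str],
                 permissions: dict, links: dict,
                 token: Optional[str] = None) -> bool:
    # Owner always has access, regardless of access level
    if user_id is not None and user_id == file["owner_id"]:
        return True
    access = file["access_level"]
    if access == "anyone-with-link":
        link = links.get(token)
        # Revoked links are invalid immediately; no URL rotation needed
        return link is not None and not link["revoked"]
    if access == "specific-users":
        role = permissions.get((file["id"], user_id))
        return role in ("viewer", "editor")  # both roles may download
    return False  # private: owner only
```

An analogous `can_upload_version` check would accept only the owner and users with the editor role.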

Download Tracking and Analytics

Record each download in a DownloadEvent table: id, file_id, share_link_id (nullable), user_id (nullable for anonymous), ip_address, user_agent, downloaded_at, bytes_transferred. This powers:

  • Per-link download counts (enforce download_limit on share links)
  • Owner dashboards showing total downloads, unique visitors, geographic breakdown
  • Abuse detection (anomalous download spikes from a single IP)

For high-volume services, write download events to a queue (Kafka/SQS) and batch-insert to avoid hot row contention on the file record.
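The queue-then-batch pattern can be sketched with an in-process queue standing in for Kafka/SQS; a consumer would INSERT each flushed batch in one statement:

```python
import queue
import time

event_queue: "queue.Queue[dict]" = queue.Queue()

def record_download(file_id: str, share_link_id=None, user_id=None,
                    ip_address="", user_agent="", bytes_transferred=0):
    # Enqueue on the hot path instead of writing a row synchronously
    event_queue.put({
        "file_id": file_id, "share_link_id": share_link_id,
        "user_id": user_id, "ip_address": ip_address,
        "user_agent": user_agent, "downloaded_at": time.time(),
        "bytes_transferred": bytes_transferred,
    })

def flush_batch(max_batch: int = 500) -> list:
    # Drain up to max_batch events for a single bulk INSERT
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(event_queue.get_nowait())
        except queue.Empty:
            break
    return batch
```

Per-link download_count can then be incremented per batch rather than per event, avoiding hot-row contention on popular links.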

Virus Scanning

Never serve a file before it has been scanned. Flow:

  • Upload completes; file is stored with scan_status = pending
  • Upload triggers an async job (queue message) to the scanner worker
  • Scanner pulls the file from object storage, runs ClamAV (or cloud AV API), updates scan_status to clean or infected
  • Download requests check scan_status: pending returns 202 Accepted with a Retry-After header; infected returns 403 Forbidden

For infected files: notify the owner, quarantine the object in a separate storage bucket, and log the event for compliance.
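The download-time gate on scan_status can be sketched as a small mapping from status to HTTP response; status codes follow the flow above:

```python
def download_response(scan_status: str, presigned_url: str) -> tuple:
    """Map scan_status to (status_code, headers) for a download request."""
    if scan_status == "clean":
        # Redirect to a short-lived pre-signed object storage URL
        return 302, {"Location": presigned_url}
    if scan_status == "pending":
        # Not yet scanned: ask the client to retry shortly
        return 202, {"Retry-After": "30"}
    # infected or error: never serve the bytes
    return 403, {}
```

Keeping this check in one function makes it easy to guarantee no code path serves an unscanned or infected file.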

Large File Handling

Chunked upload: split files larger than ~5 MB into chunks client-side. Use the S3 multipart upload API (or equivalent): initiate upload → upload parts in parallel → complete upload. Store an UploadSession record tracking which parts have been received. If the session is interrupted, the client can resume by fetching the list of completed parts and uploading only the missing ones.
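The resume step reduces to computing which parts are still missing from the UploadSession record; a minimal sketch, assuming fixed-size parts numbered from 1 as in the S3 multipart API:

```python
import math

CHUNK_SIZE = 5 * 1024 * 1024  # ~5 MB parts

def missing_parts(total_size: int, completed: set) -> list:
    """Part numbers the client must still upload to resume a session."""
    n_parts = math.ceil(total_size / CHUNK_SIZE)
    return [p for p in range(1, n_parts + 1) if p not in completed]
```

On resume, the client fetches the completed-part set from the server, calls this, and uploads only the returned parts in parallel before issuing the complete-upload call.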

Resumable download: support HTTP Range requests. The object storage pre-signed URL can include a byte range. For large files, the client issues Range: bytes=0-4999999 and resumes with the next range after interruption. Return 206 Partial Content with Content-Range header. This also enables video streaming and in-browser PDF preview without downloading the full file.
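Range handling can be sketched as a parser plus the 206 response headers; this covers the `bytes=start-end`, `bytes=start-`, and suffix `bytes=-N` forms:

```python
import re

def parse_range(header: str, file_size: int):
    """Parse a Range header into (start, end), or None if unsatisfiable."""
    m = re.fullmatch(r"bytes=(\d*)-(\d*)", header or "")
    if not m or (m.group(1) == "" and m.group(2) == ""):
        return None
    if m.group(1) == "":
        # Suffix form bytes=-N: the last N bytes of the file
        start = max(0, file_size - int(m.group(2)))
        end = file_size - 1
    else:
        start = int(m.group(1))
        end = int(m.group(2)) if m.group(2) else file_size - 1
    if start >= file_size or start > end:
        return None  # caller should respond 416 Range Not Satisfiable
    return start, min(end, file_size - 1)

def range_headers(start: int, end: int, file_size: int) -> dict:
    """Headers for a 206 Partial Content response."""
    return {"Content-Range": f"bytes {start}-{end}/{file_size}",
            "Content-Length": str(end - start + 1)}
```

A None result maps to 416 Range Not Satisfiable; a valid pair maps to 206 with the headers above, or to a pre-signed URL scoped to that byte range.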

Storage Quota Enforcement

Each user has a quota record: user_id, used_bytes, soft_limit_bytes, hard_limit_bytes.

  • Hard limit: reject uploads that would exceed it. Check before accepting the upload; return 507 Insufficient Storage.
  • Soft limit: allow the upload but send a warning notification and surface an upgrade prompt in the UI.
  • Quota accounting: increment used_bytes atomically when upload completes (set quota_charged_at on the File row). Use a database transaction to update quota and mark the file as fully uploaded together.
  • Trash and recovery: files moved to trash are still counted against quota until permanently deleted. After a configurable retention period (e.g. 30 days), a background job hard-deletes the file, removes the object from storage, and decrements used_bytes.
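The hard-limit check and the quota increment must happen atomically so concurrent uploads cannot both slip under the limit. A sketch using SQLite for illustration (any SQL database with transactional UPDATE works the same way; table and column names follow the quota record above):

```python
import sqlite3

def try_charge_quota(conn: sqlite3.Connection,
                     user_id: str, size: int) -> bool:
    """Atomically charge `size` bytes; reject uploads past the hard limit."""
    with conn:  # one transaction: the check and the increment are inseparable
        cur = conn.execute(
            "UPDATE quotas SET used_bytes = used_bytes + ? "
            "WHERE user_id = ? AND used_bytes + ? <= hard_limit_bytes",
            (size, user_id, size))
        # 0 rows updated means the limit would be exceeded:
        # respond 507 Insufficient Storage
        return cur.rowcount == 1
```

Pushing the limit check into the UPDATE's WHERE clause makes the row lock serialize concurrent uploads, so two simultaneous requests cannot both pass a stale read of used_bytes.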

To prevent quota drift, run a nightly reconciliation job that sums file sizes for each owner and compares against the quota record, alerting on discrepancies greater than a threshold.

FAQ: File Sharing Service

What is a file sharing service in system design?

A file sharing service allows users to upload, store, and distribute files to other users or the public via links. Core components include an upload ingestion layer, blob storage backend (e.g., S3), metadata database, access control system, link generation service, and a CDN for download acceleration. The service must handle files ranging from kilobytes to gigabytes, enforce per-user storage quotas, support granular permissions, and generate time-limited or password-protected share links.

How do you generate secure access links for shared files?

Secure share links are generated by creating a token that encodes the file ID, expiry timestamp, and an HMAC signature using a server-side secret key. When a request arrives, the server verifies the HMAC to ensure the token was not tampered with, checks the expiry, and optionally validates an IP or password constraint before issuing a redirect to a pre-signed object storage URL. Pre-signed URLs from S3 or GCS can be used as the final delivery mechanism, giving time-limited direct access to the blob without proxying traffic through the application server.

How do you implement storage quota enforcement per user?

Each user has a quota record storing their limit and current usage. On upload initiation, the service checks whether the incoming file size would exceed the remaining quota before accepting any data, returning 507 Insufficient Storage (or 413) if the quota would be breached. Usage is updated atomically (using a database transaction or atomic counter) when the upload completes and rolled back if it fails. Deleted files decrement usage only after the blob is confirmed removed from storage. Periodic reconciliation jobs verify counter accuracy against actual stored bytes to correct any drift from edge cases.

How do you handle large file uploads and resumable downloads?

Large file uploads use multipart upload protocols (e.g., S3 multipart upload or TUS protocol) that split the file into chunks, upload each independently, and assemble them server-side. The client can resume an interrupted upload by re-uploading only the missing chunks identified by their offset and checksum. For downloads, the server supports HTTP Range requests, allowing clients to fetch specific byte ranges. Download managers and browsers use this to resume interrupted downloads or parallelize chunk fetching. Chunk checksums (MD5 or SHA-256) validate integrity for each part before final assembly.
