What Is a Chunked Upload Service?
A chunked upload service splits a large file into fixed-size pieces and transfers them independently. This enables parallel uploads, reduces the impact of network failures (only the failed chunk needs to be retried), and allows progress tracking at fine granularity. It is the foundation of protocols like S3 Multipart Upload and the TUS open standard. The low-level design must handle chunk coordination, ordering, and final assembly.
Data Model
Two tables are required: one for the upload session and one for individual chunk state.
CREATE TABLE upload_sessions (
  session_id UUID PRIMARY KEY,
  user_id BIGINT NOT NULL,
  filename VARCHAR(512) NOT NULL,
  total_size BIGINT NOT NULL,
  chunk_size INT NOT NULL DEFAULT 5242880, -- 5 MB
  total_chunks INT NOT NULL,
  storage_key VARCHAR(1024),
  multipart_id VARCHAR(256), -- provider multipart upload ID
  status VARCHAR(16) NOT NULL DEFAULT 'created'
    CHECK (status IN ('created', 'uploading', 'assembling', 'complete', 'failed')),
  created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE upload_chunks (
  session_id UUID REFERENCES upload_sessions(session_id),
  chunk_index INT NOT NULL,
  size_bytes INT NOT NULL,
  etag VARCHAR(256), -- returned by storage on chunk PUT
  status VARCHAR(16) NOT NULL DEFAULT 'pending'
    CHECK (status IN ('pending', 'uploaded', 'verified')),
  uploaded_at TIMESTAMP,
  PRIMARY KEY (session_id, chunk_index)
);
The etag field stores the opaque token returned by the storage layer for each part. It is required for the final multipart completion call.
Core Algorithm: Multipart Upload Workflow
- Initiate session: Client POSTs file metadata. Server computes total_chunks = ceil(total_size / chunk_size), creates the session record, and calls the storage API to start a multipart upload, storing the returned multipart_id.
- Issue presigned URLs per chunk: Server generates a presigned URL for each part number (1-indexed). These can be issued all at once or on demand.
- Client uploads chunks: Client PUTs each chunk to its presigned URL. The storage layer returns an ETag for each successful part. Client reports the ETag back to the server via PATCH /api/sessions/{id}/chunks/{index}.
- Server marks chunk uploaded: Server updates upload_chunks.status = 'uploaded' and stores the ETag.
- Complete multipart: Once all chunks show status 'uploaded', the server calls the storage CompleteMultipartUpload API with the ordered list of (part_number, ETag) pairs. Storage assembles the final object.
- Session finalized: Server sets upload_sessions.status = 'complete' and publishes a completion event.
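The coordination logic on the server side reduces to two pure steps: planning the chunk layout at session creation, and building the ordered part list for the completion call. A minimal sketch (function names and the chunk-state dict shape are illustrative, not from the original design):

```python
import math


def plan_chunks(total_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return (part_number, size_bytes) per chunk; parts are 1-indexed.

    The last chunk carries the remainder, so it may be smaller than chunk_size.
    """
    total_chunks = math.ceil(total_size / chunk_size)
    return [
        (i + 1, min(chunk_size, total_size - i * chunk_size))
        for i in range(total_chunks)
    ]


def parts_for_completion(chunks: dict[int, dict]) -> list[dict]:
    """Build the ordered (PartNumber, ETag) list for CompleteMultipartUpload.

    Raises if any chunk has not yet reported an ETag, mirroring the rule
    that completion only runs once every chunk shows status 'uploaded'.
    """
    parts = []
    for index in sorted(chunks):
        chunk = chunks[index]
        if chunk.get("status") != "uploaded" or not chunk.get("etag"):
            raise ValueError(f"chunk {index} not uploaded yet")
        parts.append({"PartNumber": index, "ETag": chunk["etag"]})
    return parts
```

On S3, the list returned by parts_for_completion is the shape expected by the CompleteMultipartUpload request's part manifest; other providers use an equivalent ordered list.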
Chunks can be uploaded in parallel (typically 4–8 concurrent threads) to maximize throughput. S3 requires a minimum part size of 5 MB except for the last part.
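A client-side parallel uploader can be sketched with a thread pool. Here put_chunk is a hypothetical stand-in for the HTTP PUT to a chunk's presigned URL; the real call would issue the request and return the ETag response header:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_all(chunks: dict[int, bytes], put_chunk, max_workers: int = 8) -> dict[int, str]:
    """Upload chunks in parallel; put_chunk(index, data) -> etag.

    put_chunk stands in for an HTTP PUT to the chunk's presigned URL.
    Returns {index: etag} for every chunk; any failed upload re-raises
    its exception so the caller can retry just that chunk.
    """
    etags: dict[int, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(put_chunk, i, data): i for i, data in chunks.items()}
        for future in as_completed(futures):
            etags[futures[future]] = future.result()  # re-raises upload errors
    return etags
```

Bounding max_workers keeps memory and connection counts predictable; 4-8 workers is usually enough to saturate a typical uplink.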
Failure Handling
- Chunk upload failure: The client retries only the failed chunk, requesting a fresh presigned URL if the original expired. The upload_chunks row remains in 'pending' state until success.
- Server crash mid-session: On restart, the server can query all 'uploading' sessions and re-issue presigned URLs for any chunks still in 'pending' state, enabling the client to resume.
- Stale multipart uploads: Configure a storage lifecycle rule to automatically abort incomplete multipart uploads after 7 days to avoid storage cost accumulation from abandoned sessions.
- ETag mismatch / corruption: Clients should compute an MD5 or CRC32c of each chunk before upload and verify it against the returned ETag where the storage provider exposes a checksum.
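The integrity check can be sketched as follows. This assumes the provider returns a plain hex-MD5 ETag for individually uploaded parts (true of S3's UploadPart; whole-object ETags for multipart objects are not MD5s, so verify per part):

```python
import hashlib


def chunk_md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a chunk, computed client-side before upload."""
    return hashlib.md5(data).hexdigest()


def verify_etag(data: bytes, etag: str) -> bool:
    """Compare a chunk's local MD5 to the ETag returned by storage.

    ETags are typically quoted in HTTP responses, so quotes are stripped
    before comparing. A mismatch means the chunk should be re-uploaded.
    """
    return chunk_md5_hex(data) == etag.strip('"')
```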
Scalability Considerations
- Stateless coordination servers: Because all chunk state is in the database and all bytes go directly to object storage, any application server instance can handle any request for a session. No sticky sessions needed.
- Presigned URL caching: For large files, generate all presigned URLs in bulk on session creation and cache them in Redis keyed by (session_id, chunk_index) with TTL matching the URL expiry.
- Parallel assembly: For very large files where final assembly is slow, some systems write chunks to separate keys and use a server-side copy-compose operation to merge them without re-transferring data.
- Throughput targets: Parallelism keeps the link saturated but cannot exceed its capacity. A 1 GB file split into 200 x 5 MB chunks needs roughly 80 seconds of pure transfer time over a 100 Mbps connection (1 GB is about 8,000 megabits, divided by 100 Mbps), or about 8 seconds at 1 Gbps.
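The transfer-time arithmetic can be sanity-checked with a toy helper (link speed in Mbps, using decimal megabits):

```python
def transfer_seconds(size_bytes: int, link_mbps: float) -> float:
    """Pure transfer time: bytes converted to megabits, divided by link speed.

    Ignores protocol overhead and per-request latency, so real uploads
    take somewhat longer; parallelism cannot beat this floor.
    """
    return (size_bytes * 8 / 1_000_000) / link_mbps
```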
Summary
Chunked upload is essential for any file larger than a few megabytes. By splitting uploads into independently tracked parts, the service gains resilience against transient failures, supports parallel transfer, and enables fine-grained progress reporting. The server acts as a lightweight coordinator between the client and the storage provider's native multipart API, keeping complexity low while delivering production-grade reliability.
Frequently Asked Questions

What is chunked upload and why is it used?
Chunked upload is a technique where a large file is divided into smaller fixed-size pieces (chunks) on the client side, and each chunk is uploaded independently to the server. It is used to improve reliability (failed chunks can be retried without re-uploading the entire file), enable parallel uploads for higher throughput, and avoid HTTP timeouts on slow or unstable connections.

How do you determine the optimal chunk size for a chunked upload system?
Optimal chunk size balances upload reliability and overhead. Smaller chunks (e.g., 1–5 MB) reduce the cost of retrying a failed chunk but increase the number of HTTP requests and metadata overhead. Larger chunks (e.g., 10–100 MB) reduce request overhead but make retries more expensive. A common approach is to use 5–10 MB chunks as a default and allow the client to negotiate chunk size based on detected network speed.

How does the server reassemble chunks after all parts are uploaded?
The server tracks which chunks have been received via a metadata store keyed by an upload session ID. Once all chunks are confirmed, a reassembly job (triggered by the client's complete request or a background worker) concatenates the chunks in order, either by streaming them into final object storage or by using a multipart complete API (like S3's CompleteMultipartUpload). The metadata record is then updated to mark the upload as finalized.

How do you handle duplicate or out-of-order chunk uploads?
Each chunk is identified by its upload session ID and a zero-based chunk index. The server uses idempotent write logic: if a chunk with the same session ID and index already exists in storage, the duplicate is discarded. Out-of-order arrivals are fine because chunks are stored by index and only assembled once all are present. A distributed lock or atomic counter on the session record prevents race conditions when multiple chunk uploads arrive simultaneously.