What Is a Distributed Build System?
A distributed build system parallelizes the compilation and linking of large codebases across a fleet of worker machines, dramatically cutting build times. Instead of one developer machine running every compiler invocation sequentially, work is split into a DAG of tasks — each task compiling one translation unit or linking one library — and scheduled across many workers. Examples include Bazel with Remote Execution API, Buck2, and Distcc. The design problem is fundamentally about cache-efficient DAG scheduling at scale.
Data Model
-- Build Requests
builds (
id BIGINT PRIMARY KEY,
repo_id BIGINT NOT NULL,
commit_sha CHAR(40),
targets TEXT, -- JSON array of top-level targets
status ENUM('queued','running','success','failed','cancelled'),
requested_by BIGINT,
created_at TIMESTAMP,
finished_at TIMESTAMP
);
-- Build Actions (individual compiler/linker invocations)
actions (
id BIGINT PRIMARY KEY,
build_id BIGINT REFERENCES builds(id),
action_digest CHAR(64), -- SHA-256 of (command + input tree digest)
command TEXT, -- compiler invocation
input_digest CHAR(64), -- Merkle root of input files
status ENUM('pending','cache_hit','running','success','failed'),
worker_id BIGINT,
started_at TIMESTAMP,
finished_at TIMESTAMP
);
-- Action Cache
action_cache (
action_digest CHAR(64) PRIMARY KEY,
output_digest CHAR(64), -- Merkle root of output files
exit_code INT,
stored_at TIMESTAMP,
last_hit_at TIMESTAMP
);
-- Content-Addressable Storage entries
cas_blobs (
digest CHAR(64) PRIMARY KEY,
size_bytes BIGINT,
storage_key TEXT,
ref_count INT DEFAULT 0,
created_at TIMESTAMP
);
-- Workers
workers (
id BIGINT PRIMARY KEY,
hostname VARCHAR(255),
pool VARCHAR(100), -- e.g., linux-x86, linux-arm, mac-arm
cores INT,
status ENUM('idle','busy','draining','offline'),
last_heartbeat TIMESTAMP
);
Core Algorithm / Workflow
- DAG Construction: The client (Bazel, Buck2) analyzes BUILD files and constructs a DAG of actions. Each action is identified by its action_digest: a hash of the command line plus the Merkle digest of all input files. This deterministic key enables caching across users and commits.
- Cache Check: Before scheduling, the scheduler batch-queries the action_cache table. Cache hits return the output digest immediately; the client fetches outputs from CAS without running any compiler. Hit rates of 70-90% are common on mature repos.
- Scheduling: Cache misses are enqueued. The scheduler assigns actions to workers based on: worker pool compatibility (architecture), current load, and data locality (prefer workers that already hold the input CAS blobs in local disk cache). Critical-path actions (blocking the most downstream work) are prioritized.
- Execution: Workers pull inputs from CAS, execute the command in a hermetic sandbox (namespaces + seccomp on Linux), upload outputs to CAS, and report results including stdout/stderr. The action_cache is updated on success.
- Output Assembly: Once all actions for a target complete, the client downloads the final outputs (binaries, libraries) from CAS to the local workspace.
Failure Handling
- Worker failure mid-action: Heartbeat timeout triggers reassignment. Because actions are hermetic and idempotent (same inputs → same outputs), re-execution on another worker is safe.
- CAS write failure: Outputs are uploaded before the action is marked success. If upload fails, the action is retried. Partial uploads are detected by digest mismatch and discarded.
- Cache poisoning: Validate output digests on upload. Workers with anomalous failure rates are quarantined automatically. Support an admin API to invalidate specific action_cache entries.
- Stragglers: Speculative execution — launch a duplicate of the slowest-running action on a second worker after it exceeds 2x the median for that action type. Use the first result that arrives; cancel the other.
Scalability Considerations
- CAS at scale: CAS is a content-addressable blob store. Back it with object storage (S3-compatible). Maintain a metadata index (cas_blobs table) in the database for ref-counting and GC; do not store blobs in the DB itself.
- Action cache sharding: Shard the action_cache by the first N bits of action_digest for horizontal scaling. Use consistent hashing so adding shards minimizes remapping.
- Worker autoscaling: Track queue depth per pool. Scale up workers when depth > workers * cores * 2; scale down when idle workers exceed 20% of the pool for > 5 minutes.
- Network locality: Co-locate CAS storage and worker fleet in the same availability zone to keep input download latency under 50ms for typical translation units. Large monorepos with multi-GB input trees require tiered local disk caches on workers.
Summary
A distributed build system is a cache-first DAG scheduler. The action_digest / content-addressable design is the foundational insight: deterministic hashing of inputs collapses equivalent work across all users and time. The scheduler's job is to maximize cache hit rate and minimize critical-path latency. Hermeticity (sandboxed execution) is non-negotiable — without it, cache correctness cannot be guaranteed and the whole system degrades into an unreliable speedup rather than a correctness tool.
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Meta Interview Guide 2026: Facebook, Instagram, WhatsApp Engineering