Low Level Design: Distributed Build System

What Is a Distributed Build System?

A distributed build system parallelizes the compilation and linking of large codebases across a fleet of worker machines, dramatically cutting build times. Instead of one developer machine running every compiler invocation sequentially, work is split into a DAG of tasks — each task compiling one translation unit or linking one library — and scheduled across many workers. Examples include Bazel with Remote Execution API, Buck2, and Distcc. The design problem is fundamentally about cache-efficient DAG scheduling at scale.

Data Model


-- Build Requests
builds (
  id            BIGINT PRIMARY KEY,
  repo_id       BIGINT NOT NULL,
  commit_sha    CHAR(40),
  targets       TEXT,           -- JSON array of top-level targets
  status        ENUM('queued','running','success','failed','cancelled'),
  requested_by  BIGINT,
  created_at    TIMESTAMP,
  finished_at   TIMESTAMP
);

-- Build Actions (individual compiler/linker invocations)
actions (
  id            BIGINT PRIMARY KEY,
  build_id      BIGINT REFERENCES builds(id),
  action_digest CHAR(64),       -- SHA-256 of (command + input tree digest)
  command       TEXT,           -- compiler invocation
  input_digest  CHAR(64),       -- Merkle root of input files
  status        ENUM('pending','cache_hit','running','success','failed'),
  worker_id     BIGINT,
  started_at    TIMESTAMP,
  finished_at   TIMESTAMP
);

-- Action Cache
action_cache (
  action_digest CHAR(64) PRIMARY KEY,
  output_digest CHAR(64),       -- Merkle root of output files
  exit_code     INT,
  stored_at     TIMESTAMP,
  last_hit_at   TIMESTAMP
);

-- Content-Addressable Storage entries
cas_blobs (
  digest        CHAR(64) PRIMARY KEY,
  size_bytes    BIGINT,
  storage_key   TEXT,
  ref_count     INT DEFAULT 0,
  created_at    TIMESTAMP
);

-- Workers
workers (
  id            BIGINT PRIMARY KEY,
  hostname      VARCHAR(255),
  pool          VARCHAR(100),   -- e.g., linux-x86, linux-arm, mac-arm
  cores         INT,
  status        ENUM('idle','busy','draining','offline'),
  last_heartbeat TIMESTAMP
);

Core Algorithm / Workflow

  1. DAG Construction: The client (Bazel, Buck2) analyzes BUILD files and constructs a DAG of actions. Each action is identified by its action_digest: a hash of the command line plus the Merkle digest of all input files. This deterministic key enables caching across users and commits.
  2. Cache Check: Before scheduling, the scheduler batch-queries the action_cache table. Cache hits return the output digest immediately; the client fetches outputs from CAS without running any compiler. Hit rates of 70-90% are common on mature repos.
  3. Scheduling: Cache misses are enqueued. The scheduler assigns actions to workers based on: worker pool compatibility (architecture), current load, and data locality (prefer workers that already hold the input CAS blobs in local disk cache). Critical-path actions (blocking the most downstream work) are prioritized.
  4. Execution: Workers pull inputs from CAS, execute the command in a hermetic sandbox (namespaces + seccomp on Linux), upload outputs to CAS, and report results including stdout/stderr. The action_cache is updated on success.
  5. Output Assembly: Once all actions for a target complete, the client downloads the final outputs (binaries, libraries) from CAS to the local workspace.

Failure Handling

  • Worker failure mid-action: Heartbeat timeout triggers reassignment. Because actions are hermetic and idempotent (same inputs → same outputs), re-execution on another worker is safe.
  • CAS write failure: Outputs are uploaded before the action is marked success. If upload fails, the action is retried. Partial uploads are detected by digest mismatch and discarded.
  • Cache poisoning: Validate output digests on upload. Workers with anomalous failure rates are quarantined automatically. Support an admin API to invalidate specific action_cache entries.
  • Stragglers: Speculative execution — launch a duplicate of the slowest-running action on a second worker after it exceeds 2x the median for that action type. Use the first result that arrives; cancel the other.

Scalability Considerations

  • CAS at scale: CAS is a content-addressable blob store. Back it with object storage (S3-compatible). Maintain a metadata index (cas_blobs table) in the database for ref-counting and GC; do not store blobs in the DB itself.
  • Action cache sharding: Shard the action_cache by the first N bits of action_digest for horizontal scaling. Use consistent hashing so adding shards minimizes remapping.
  • Worker autoscaling: Track queue depth per pool. Scale up workers when depth > workers * cores * 2; scale down when idle workers exceed 20% of the pool for > 5 minutes.
  • Network locality: Co-locate CAS storage and worker fleet in the same availability zone to keep input download latency under 50ms for typical translation units. Large monorepos with multi-GB input trees require tiered local disk caches on workers.

Summary

A distributed build system is a cache-first DAG scheduler. The action_digest / content-addressable design is the foundational insight: deterministic hashing of inputs collapses equivalent work across all users and time. The scheduler's job is to maximize cache hit rate and minimize critical-path latency. Hermeticity (sandboxed execution) is non-negotiable — without it, cache correctness cannot be guaranteed and the whole system degrades into an unreliable speedup rather than a correctness tool.

See also: Netflix Interview Guide 2026: Streaming Architecture, Recommendation Systems, and Engineering Excellence

See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering

See also: Meta Interview Guide 2026: Facebook, Instagram, WhatsApp Engineering

Scroll to Top