What is the low-level design of a build system?

A low-level design of a build system covers the data structures and algorithms used to model a dependency graph of build targets, schedule compilation tasks, cache results, and produce artifacts. Core abstractions include Target (a named buildable unit), Rule (the recipe to build it), Action (a concrete command to execute), and Cache (content-addressed storage of prior outputs).

How does Google's Bazel build system work at a low level?

Bazel models the build as a directed acyclic graph (DAG) of targets defined in BUILD files. At a low level it uses a query engine to resolve dependencies, a skyframe parallel evaluation framework to compute and cache node values, a local and remote execution layer to run actions in sandboxed environments, and a content-addressed action cache keyed on input hashes to skip redundant work.

What data models are needed to design a build system?

Essential data models include: BuildTarget (id, name, type, srcs, deps, outputs), BuildRule (id, target_id, command_template, env_vars), BuildAction (id, rule_id, input_hashes, output_paths, status), CacheEntry (action_key, output_digest, stored_at), and BuildEvent (id, build_id, target_id, event_type, timestamp). The dependency graph is stored as an adjacency list on BuildTarget.

How do build systems like Amazon's and Meta's handle incremental builds?

Incremental build systems detect changes by hashing source file contents rather than relying on timestamps, then walk the dependency DAG to identify which targets are invalidated. Amazon's internal build tooling and Meta's Buck2 both use remote caching: each action is keyed by a Merkle-tree hash of all inputs, so identical actions across engineers or CI runs share cached outputs and avoid redundant compilation.

Low Level Design: Distributed Build System

⏱ 5 min read

What Is a Distributed Build System?

A distributed build system parallelizes the compilation and linking of large codebases across a fleet of worker machines, dramatically cutting build times. Instead of one developer machine running every compiler invocation sequentially, work is split into a DAG of tasks — each task compiling one translation unit or linking one library — and scheduled across many workers. Examples include Bazel with Remote Execution API, Buck2, and Distcc. The design problem is fundamentally about cache-efficient DAG scheduling at scale.

Data Model


-- Build Requests
builds (
  id            BIGINT PRIMARY KEY,
  repo_id       BIGINT NOT NULL,
  commit_sha    CHAR(40),
  targets       TEXT,           -- JSON array of top-level targets
  status        ENUM('queued','running','success','failed','cancelled'),
  requested_by  BIGINT,
  created_at    TIMESTAMP,
  finished_at   TIMESTAMP
);

-- Build Actions (individual compiler/linker invocations)
actions (
  id            BIGINT PRIMARY KEY,
  build_id      BIGINT REFERENCES builds(id),
  action_digest CHAR(64),       -- SHA-256 of (command + input tree digest)
  command       TEXT,           -- compiler invocation
  input_digest  CHAR(64),       -- Merkle root of input files
  status        ENUM('pending','cache_hit','running','success','failed'),
  worker_id     BIGINT,
  started_at    TIMESTAMP,
  finished_at   TIMESTAMP
);

-- Action Cache
action_cache (
  action_digest CHAR(64) PRIMARY KEY,
  output_digest CHAR(64),       -- Merkle root of output files
  exit_code     INT,
  stored_at     TIMESTAMP,
  last_hit_at   TIMESTAMP
);

-- Content-Addressable Storage entries
cas_blobs (
  digest        CHAR(64) PRIMARY KEY,
  size_bytes    BIGINT,
  storage_key   TEXT,
  ref_count     INT DEFAULT 0,
  created_at    TIMESTAMP
);

-- Workers
workers (
  id            BIGINT PRIMARY KEY,
  hostname      VARCHAR(255),
  pool          VARCHAR(100),   -- e.g., linux-x86, linux-arm, mac-arm
  cores         INT,
  status        ENUM('idle','busy','draining','offline'),
  last_heartbeat TIMESTAMP
);

Core Algorithm / Workflow

DAG Construction: The client (Bazel, Buck2) analyzes BUILD files and constructs a DAG of actions. Each action is identified by its action_digest: a hash of the command line plus the Merkle digest of all input files. This deterministic key enables caching across users and commits.
Cache Check: Before scheduling, the scheduler batch-queries the action_cache table. Cache hits return the output digest immediately; the client fetches outputs from CAS without running any compiler. Hit rates of 70-90% are common on mature repos.
Scheduling: Cache misses are enqueued. The scheduler assigns actions to workers based on: worker pool compatibility (architecture), current load, and data locality (prefer workers that already hold the input CAS blobs in local disk cache). Critical-path actions (blocking the most downstream work) are prioritized.
Execution: Workers pull inputs from CAS, execute the command in a hermetic sandbox (namespaces + seccomp on Linux), upload outputs to CAS, and report results including stdout/stderr. The action_cache is updated on success.
Output Assembly: Once all actions for a target complete, the client downloads the final outputs (binaries, libraries) from CAS to the local workspace.

Failure Handling

Worker failure mid-action: Heartbeat timeout triggers reassignment. Because actions are hermetic and idempotent (same inputs → same outputs), re-execution on another worker is safe.
CAS write failure: Outputs are uploaded before the action is marked success. If upload fails, the action is retried. Partial uploads are detected by digest mismatch and discarded.
Cache poisoning: Validate output digests on upload. Workers with anomalous failure rates are quarantined automatically. Support an admin API to invalidate specific action_cache entries.
Stragglers: Speculative execution — launch a duplicate of the slowest-running action on a second worker after it exceeds 2x the median for that action type. Use the first result that arrives; cancel the other.

Scalability Considerations

CAS at scale: CAS is a content-addressable blob store. Back it with object storage (S3-compatible). Maintain a metadata index (cas_blobs table) in the database for ref-counting and GC; do not store blobs in the DB itself.
Action cache sharding: Shard the action_cache by the first N bits of action_digest for horizontal scaling. Use consistent hashing so adding shards minimizes remapping.
Worker autoscaling: Track queue depth per pool. Scale up workers when depth > workers * cores * 2; scale down when idle workers exceed 20% of the pool for > 5 minutes.
Network locality: Co-locate CAS storage and worker fleet in the same availability zone to keep input download latency under 50ms for typical translation units. Large monorepos with multi-GB input trees require tiered local disk caches on workers.

Summary

A distributed build system is a cache-first DAG scheduler. The action_digest / content-addressable design is the foundational insight: deterministic hashing of inputs collapses equivalent work across all users and time. The scheduler's job is to maximize cache hit rate and minimize critical-path latency. Hermeticity (sandboxed execution) is non-negotiable — without it, cache correctness cannot be guaranteed and the whole system degrades into an unreliable speedup rather than a correctness tool.