Low Level Design: Garbage Collector Internals

Garbage collection is the automatic memory management mechanism used by runtimes like the JVM, .NET CLR, and Go. Understanding its internals is essential for writing low-latency services, diagnosing pauses, and tuning production systems. This post digs into the algorithms, trade-offs, and modern GC implementations.

Reachability and GC Roots

A garbage collector does not track memory deallocations — it tracks reachability. An object is live if it can be reached by following references from a set of GC roots. GC roots are the starting points of the object graph and include: local variables and method parameters on the JVM thread stacks, static fields of loaded classes, active JNI (Java Native Interface) references held by native code, and references in monitor/synchronization objects.

Any object not reachable from any GC root is, by definition, garbage and eligible for collection. The collector never needs to ask "who allocated this?" — it only asks "can anything still reach this?"

Mark-and-Sweep

The classic algorithm has two phases. In the mark phase, the collector traverses the object graph starting from GC roots and marks every reachable object. In the sweep phase, it scans the entire heap and reclaims memory occupied by unmarked (unreachable) objects.

The core problem with naive mark-and-sweep is heap fragmentation. After repeated collection cycles, the free memory consists of many small non-contiguous chunks. Allocation of large objects may fail even when enough total free bytes exist, and the allocator must maintain complex free-lists. This is why mark-and-sweep alone is rarely used in production runtimes.

Mark-Compact

Mark-compact extends mark-and-sweep by adding a compaction phase: after marking, surviving objects are slid toward one end of the heap, eliminating gaps. All pointers to moved objects must be updated. The result is a dense heap with a single contiguous free region, enabling fast bump-pointer allocation.

The cost is high: compaction requires multiple passes over the heap (to compute new addresses, update references, and copy data) and is inherently a stop-the-world operation. It is used in Java’s Serial and Parallel GC for old-generation collections, and is acceptable when throughput matters more than latency.

Copying GC (Semi-Space)

Copying collectors divide the heap into two equal halves: from-space and to-space. At collection time, live objects are copied from from-space into to-space. When copying is complete, from-space is entirely free — no sweep needed, no fragmentation. The roles of the two halves flip for the next cycle.

Allocation is trivially fast: a bump pointer advances through to-space. The major drawback is a 2x memory overhead — half the heap is always empty. This trade-off is acceptable for short-lived objects (young generation), which is exactly where copying GC is universally deployed.

Generational Hypothesis and Generational GC

The generational hypothesis is the empirical observation that most objects die young. Studies across many workloads show that the vast majority of allocated objects become unreachable within milliseconds or within a single function call. This insight drives the design of all modern production GCs.

A generational GC divides the heap into a young generation (new allocations, collected frequently via copying GC — called a minor GC) and an old generation (objects that survived several minor GCs, collected less frequently — called a major or full GC). Objects are promoted from young to old after surviving a configurable number of collections. Minor GCs are fast because they only scan the young generation; the old generation is treated as a root source via remembered sets tracking cross-generational references.

Tri-Color Marking

Tri-color marking is the foundation for concurrent and incremental GC algorithms. Objects are assigned one of three colors:

  • White: not yet visited — either garbage or not yet reached.
  • Gray: discovered (reachable from roots) but not yet fully scanned — its references have not been followed.
  • Black: fully scanned — all references from this object have been followed.

The collector starts by graying all GC roots, then repeatedly picks a gray object, marks it black, and grays any white objects it references. When the gray set is empty, all white objects are unreachable and can be reclaimed.

The challenge with concurrent marking is that the application (mutator) can modify the object graph while the GC is running. A mutator could create a new reference from a black object to a white object and then delete the only gray-to-white reference — making the white object look like garbage even though it’s reachable. Write barriers prevent this: on every reference store, the runtime either re-grays the source object (Dijkstra barrier) or records the overwritten reference (Yuasa barrier), maintaining the tri-color invariant.

Stop-the-World vs. Concurrent GC

A stop-the-world (STW) pause freezes all application threads while the GC runs. This simplifies the algorithm — the object graph cannot change during collection — but causes visible latency spikes. The Serial and Parallel GCs in the JVM are fully STW.

Concurrent marking (used in CMS and G1) runs the mark phase alongside the application. This dramatically reduces pause times but requires write barriers and a final STW remark phase to handle mutations that occurred during concurrent marking. Incremental GC breaks work into small slices interleaved with application execution, bounding individual pauses.

G1GC

G1 (Garbage First), the default JVM GC since Java 9, divides the heap into many equal-sized regions (typically 1–32 MB each). Regions are dynamically assigned as young, old, or humongous (for large objects). This regionalized layout allows G1 to collect the regions with the most garbage first (hence the name), maximizing reclaimed bytes per pause.

G1 performs concurrent marking to build a live-data map, then schedules mixed collections that collect all young regions plus a selection of the most profitable old regions. The key feature is a configurable pause time goal (-XX:MaxGCPauseMillis, default 200ms) — G1 uses pause prediction models to select how many regions to collect within the target. It does not guarantee the goal but attempts to meet it.

ZGC and Shenandoah

Both ZGC (Java 15+, production) and Shenandoah target sub-millisecond pauses regardless of heap size, achieving this via concurrent relocation — objects are moved while the application is running.

ZGC uses colored pointers: metadata bits embedded in the 64-bit pointer itself encode the GC state of the referenced object. Load barriers — code injected at every object reference load — check these bits and perform any needed fixup (forwarding to the new address) on the fly. ZGC’s STW pauses are limited to root scanning and are typically under 1ms even on terabyte heaps.

Shenandoah achieves concurrent relocation differently: it installs a Brooks pointer — an extra forwarding word prepended to every object. During relocation, the Brooks pointer is updated to point to the new copy. Load barriers check this indirection. Both approaches pay a small per-access overhead to eliminate long pauses.

GC Tuning Parameters and Log Analysis

Key JVM GC tuning levers include:

  • -Xms / -Xmx: initial and maximum heap size. Setting them equal avoids resize pauses.
  • -XX:NewRatio: ratio of old to young generation size (default 2 means young = 1/3 of heap).
  • -XX:SurvivorRatio: ratio of eden to each survivor space within young gen.
  • -XX:MaxGCPauseMillis: pause time goal for G1.
  • -XX:GCTimeRatio: target ratio of GC time to application time.
  • -XX:+UseZGC / -XX:+UseShenandoahGC: select the collector.

Enable GC logging with -Xlog:gc*:file=gc.log:time,uptime,level,tags. Key signals to watch: pause duration, allocation rate, promotion rate (young-to-old), and humongous allocation frequency. A rising old-gen after each cycle indicates a memory leak or insufficient heap. Frequent full GCs signal that minor GC promotion is overwhelming the old gen — tune survivor spaces or increase heap.

{ “@context”: “https://schema.org”, “@type”: “FAQPage”, “mainEntity”: [ { “@type”: “Question”, “name”: “What is the generational hypothesis in garbage collection?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “The generational hypothesis states that most objects die young. GC exploits this by dividing the heap into generations: a young generation (Eden + survivor spaces) and an old generation (tenured). Minor GC collects the young generation frequently using a fast copying algorithm. Objects that survive N minor GC cycles are promoted to the old generation. Major (full) GC collects the old generation infrequently. This dramatically reduces GC overhead compared to collecting the entire heap every time.” } }, { “@type”: “Question”, “name”: “What is tri-color marking and why is it needed for concurrent GC?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “Tri-color marking classifies objects as white (not yet visited), gray (discovered but children not yet scanned), and black (fully scanned). The invariant: no black object may reference a white object. This invariant allows concurrent marking: the mutator (application) can continue running while GC marks objects. Write barriers maintain the invariant by intercepting pointer stores — if a black object stores a reference to a white object, the white object is grayed.” } }, { “@type”: “Question”, “name”: “What is the difference between G1GC and ZGC?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “G1GC divides the heap into equal-sized regions and performs incremental mixed collections targeting a pause goal (default 200ms). It is concurrent for marking but stop-the-world for evacuation (copying live objects). ZGC performs concurrent relocation using colored pointer bits and load barriers — the GC can move objects while the application runs, achieving sub-millisecond pauses regardless of heap size. ZGC trades higher throughput overhead (load barriers on every reference read) for much lower latency.” } }, { “@type”: “Question”, “name”: “What causes GC pressure and how do you reduce it?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “GC pressure occurs when the application allocates objects faster than the GC can collect them. Causes: excessive temporary object creation (string concatenation in loops, boxing primitives), large object allocation triggering premature promotion. Reductions: object pooling (reuse expensive objects), avoid autoboxing (use primitive collections like Eclipse Collections or Koloboke), reduce allocation rate (allocate fewer, larger objects), increase heap size, tune survivor space ratios to reduce premature promotion.” } }, { “@type”: “Question”, “name”: “How does a reference queue work for weak/soft/phantom references?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “Java provides four reference strengths: strong (normal), soft (collected under memory pressure), weak (collected at next GC), phantom (post-finalization cleanup). When the GC collects a soft/weak/phantom referent, it enqueues the Reference object into a ReferenceQueue. A background thread processes the queue to perform cleanup (close resources, remove from caches). This enables cache implementations that release memory under pressure without memory leaks.” } } ] }

See also: Netflix Interview Guide 2026: Streaming Architecture, Recommendation Systems, and Engineering Excellence

See also: Meta Interview Guide 2026: Facebook, Instagram, WhatsApp Engineering

See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering

See also: Anthropic Interview Guide 2026: Process, Questions, and AI Safety

Scroll to Top