How does generational garbage collection work?

Generational GC exploits the fact that most objects die young (the generational hypothesis). The heap is divided into: Young Generation (newly allocated objects, collected frequently via minor GC -- fast because most objects are already dead) and Old Generation (objects that survived multiple minor GCs, collected infrequently via major GC -- slower because more live objects). JVM layout: Eden (new allocations) + two Survivor spaces = Young Gen. After surviving ~15 minor GCs, objects are promoted to Old Gen. Minor GC typically takes 5-50ms. Major GC can take 100ms to seconds. Tuning: size the young generation large enough that short-lived objects (request-scoped) die before promotion, reducing old generation pressure and major GC frequency.

Which JVM garbage collector should you use?

G1 GC (default since Java 9): general-purpose, targets configurable pause times (default 200ms). Best for moderate heap sizes (4-32GB) and typical web applications. ZGC: sub-1ms pauses regardless of heap size (up to 16TB). Best for latency-sensitive services (real-time bidding, trading, interactive APIs). Available since Java 15. Shenandoah: similar to ZGC with sub-millisecond pauses. Available in OpenJDK. Parallel GC: maximizes throughput using multiple GC threads. Higher pauses but better total throughput. Best for batch processing. Serial GC: single-threaded. Best for small heaps under 100MB. For most production web services, start with G1. Switch to ZGC if P99 latency spikes from GC pauses are unacceptable.

How do you detect memory leaks in garbage-collected languages?

Memory leaks in GC languages occur when objects are reachable but never used. Common causes: unbounded caches (maps that grow forever), unremoved event listeners, static collections, ThreadLocal values in thread pools, and unclosed connections. Detection: (1) Monitor heap usage over time -- a steadily increasing trend after GC indicates a leak. (2) Take a heap dump: jmap -dump:live,format=b,file=heap.hprof. (3) Analyze with Eclipse MAT or VisualVM -- look at the dominator tree for large objects retaining many other objects. (4) Compare two heap dumps taken at different times -- objects that grew significantly are likely the leak. GC logs: enable with -Xlog:gc*. Red flags: frequent full GCs, increasing heap occupancy after GC, and promotion rate exceeding expectations.

How does Go garbage collection differ from JVM GC?

Go uses a concurrent tri-color mark-and-sweep collector with no generational separation. Design philosophy: simplicity over throughput. Go GC pauses are typically under 1ms because most work is concurrent (running alongside the application). The GC triggers when the heap grows to 2x the live data size (controlled by GOGC=100 environment variable). Go has minimal tuning knobs: GOGC (target heap growth ratio) and GOMEMLIMIT (soft memory limit since Go 1.19). No generational separation means Go scans the entire heap every cycle, but the concurrent design keeps pauses short. JVM difference: JVM offers many collectors (G1, ZGC, Parallel) with extensive tuning parameters. JVM generational GC is more throughput-efficient for typical workloads but requires more configuration. Go trades some throughput for consistently low latency and zero-configuration simplicity.

System Design: Garbage Collection and Memory Management — JVM GC, Mark-Sweep, Generational GC, Tuning, Memory Leaks

⏱ 8 min read

Garbage collection determines when and how unused memory is reclaimed. Understanding GC internals helps you debug memory leaks, tune application performance, and answer system design questions about latency-sensitive services. This guide covers GC algorithms, JVM garbage collectors, GC tuning strategies, and memory leak detection — essential for senior backend engineering and performance engineering interviews.

Why Garbage Collection Matters for System Design

GC pauses directly impact application latency. A GC pause of 200ms in a service with a 100ms P99 latency target causes a latency spike that violates the SLO. In distributed systems, GC pauses can trigger cascading failures: a service pauses for GC, misses heartbeats, is marked as unhealthy by the load balancer, traffic shifts to remaining instances, overloading them. In consensus systems (Raft, ZooKeeper), a GC pause on the leader can cause a leader election, disrupting the entire cluster. Understanding GC behavior is critical for: choosing the right GC collector for your workload, sizing heap memory appropriately, diagnosing latency spikes, and designing systems that tolerate GC pauses. Languages with GC: Java (JVM), Go, Python, C#, JavaScript (V8). Languages without GC: C, C++ (manual memory management), Rust (ownership system, no runtime GC).

Mark-and-Sweep Algorithm

The foundational GC algorithm. Two phases: (1) Mark — starting from root references (stack variables, static fields, CPU registers), traverse all reachable objects and mark them as alive. Any object not reachable from a root is garbage. (2) Sweep — scan the entire heap and free memory occupied by unmarked (dead) objects. The mark phase uses graph traversal (DFS or BFS) to follow references from live objects to other live objects. Time: O(L) where L is the number of live objects (not total objects — dead objects are not traversed). The sweep phase is O(H) where H is the heap size (must scan every allocated block). Problem: stop-the-world — the application must be paused during marking to prevent reference graph changes. If an object is moved or a reference is updated during marking, the GC may incorrectly classify a live object as dead. This pause scales with heap size and number of live objects. A 10GB heap with millions of live objects can pause for seconds.

Generational Garbage Collection

The generational hypothesis: most objects die young. A web request creates temporary objects (request parsing, response building) that become garbage when the request completes. Long-lived objects (caches, connection pools) are a minority. Generational GC exploits this by dividing the heap into generations: Young Generation — newly allocated objects. Collected frequently (minor GC). Most objects die here, so minor GC is fast (small heap region, mostly garbage). Old Generation (Tenured) — objects that survived multiple minor GCs. Collected infrequently (major GC). Major GC is slower (larger region, more live objects). JVM default heap layout: Eden space (where new objects are allocated) + two Survivor spaces (objects that survive one minor GC are copied here) = Young Generation. After surviving N minor GCs (default N=15), objects are promoted to the Old Generation. Minor GC (young generation) typically takes 5-50ms. Major GC (old generation) can take 100ms-several seconds. Tuning strategy: size the young generation large enough that most short-lived objects die before promotion. This reduces pressure on the old generation and minimizes major GC frequency.

JVM Garbage Collectors

The JVM offers multiple collectors for different workloads: (1) G1 GC (Garbage-First) — the default since Java 9. Divides the heap into equal-sized regions (1-32MB). Collects regions with the most garbage first (hence “garbage-first”). Targets a configurable pause time (default 200ms). Mixes young and old generation collection incrementally. Best for: general-purpose applications with moderate heap sizes (4-32GB). (2) ZGC (Z Garbage Collector) — designed for ultra-low latency. Pause times under 1ms regardless of heap size (tested up to 16TB). Uses colored pointers and load barriers for concurrent marking and compaction. Best for: latency-sensitive applications (real-time bidding, trading systems, interactive services). Available since Java 15. (3) Shenandoah — similar goals to ZGC (sub-millisecond pauses). Uses Brooks pointers for concurrent compaction. Available in OpenJDK. (4) Parallel GC — maximizes throughput by using multiple threads for GC. Higher pause times but better total throughput. Best for: batch processing where throughput matters more than latency. (5) Serial GC — single-threaded. Best for: small heaps (<100MB) and single-core machines.

GC Tuning Strategies

Tuning goals: minimize GC pause frequency and duration while maintaining throughput. Key parameters (G1 GC): (1) Heap size (-Xms, -Xmx) — set min and max equal to avoid resizing overhead. Size: too small causes frequent GC, too large causes long pauses. Start with 2-4x the live data set size. (2) Pause time target (-XX:MaxGCPauseMillis=200) — G1 adjusts its behavior to meet this target. Lower values mean more frequent, shorter collections. (3) Region size (-XX:G1HeapRegionSize) — affects allocation and collection granularity. Auto-calculated based on heap size. (4) Young generation size — G1 auto-tunes this to meet the pause target. Override with -XX:NewRatio if needed. Monitoring: enable GC logging (-Xlog:gc*:file=gc.log). Analyze with GCViewer or GCEasy. Key metrics: GC pause duration (P99), GC frequency, allocation rate, promotion rate, and heap occupancy after GC. Red flags: frequent full GCs (heap too small or memory leak), increasing heap occupancy after GC (memory leak — objects accumulate but are never freed), and long pauses (heap too large, wrong collector choice).

Memory Leaks in Garbage-Collected Languages

GC does not prevent memory leaks — it prevents dangling pointers. A memory leak in a GC language occurs when objects are kept reachable but never used. Common causes: (1) Growing collections — a cache map that adds entries but never removes them. Over time, the map consumes the entire heap. Fix: use a bounded cache (LRU eviction) or weak references. (2) Listener/callback registration — registering an event listener without unregistering it when the object is no longer needed. The event system holds a reference, preventing GC. (3) Static fields — static collections that accumulate data over the application lifetime. (4) ThreadLocal variables — values stored in ThreadLocal are not GC-eligible until the thread ends. In thread pools (web servers), threads live forever, so ThreadLocal values accumulate. Fix: call ThreadLocal.remove() after use. (5) Connection leaks — database connections or HTTP connections that are opened but not closed. The connection object remains in the pool, and the pool grows. Detection: take a heap dump (jmap -dump:live,format=b,file=heap.hprof) and analyze with Eclipse MAT or VisualVM. Look for dominator trees: large objects that retain many other objects. Track heap usage over time — a steadily increasing trend after GC indicates a leak.

Go and Python GC Compared

Go GC: a concurrent, tri-color mark-and-sweep collector with no generational separation. Go trades throughput for low latency: GC pauses are typically under 1ms because most work is done concurrently while the application runs. The GC triggers based on the GOGC environment variable (default 100): GC runs when the heap grows to 2x the live data size. Go has no tuning knobs beyond GOGC — simplicity over configurability. Since Go 1.19, soft memory limits (GOMEMLIMIT) provide better control. Python GC: uses reference counting as the primary mechanism (each object has a counter tracking references to it; when the count drops to zero, the object is freed immediately). A cyclic garbage collector handles reference cycles (two objects referencing each other with no external references). The cyclic GC runs periodically based on allocation thresholds. Python GC pauses are generally short but unpredictable. The GIL (Global Interpreter Lock) means only one thread executes Python code at a time, limiting true parallelism but simplifying GC. For CPU-bound workloads, Python applications use multiprocessing (separate processes with separate heaps and GCs) to achieve parallelism.