System.gc() Semantics
The System.gc() and Runtime.getRuntime().gc() methods serve as requests for garbage collection. By default, invoking these methods explicitly triggers a Full GC that attempts to reclaim memory from both the young and old generations by collecting discarded objects.
It's crucial to understand that a call to System.gc() comes with a disclaimer: the JVM is not obligated to execute the garbage collector immediately. The JVM implementation decides how to respond to this request. In practice, garbage collection is best left to the JVM's automatic mechanisms and should rarely be triggered manually, except in specific scenarios such as performance benchmarking where you need a clean heap state between test runs.
Example: Manual GC Invocation
public class SystemGCTest {
    public static void main(String[] args) {
        new SystemGCTest();
        System.gc(); // Requests GC; immediate execution is not guaranteed
        // Equivalent to Runtime.getRuntime().gc();
        // System.runFinalization(); // Forces invocation of finalize() on objects with pending finalization
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("SystemGCTest finalize() called");
    }
}
The output is nondeterministic; finalize() may or may not print:
SystemGCTest finalize() called
or there may be no output.
Understanding Reachability with Manual GC
Consider the following class with various local variable scopes to see how unreachable objects behave with System.gc().
// Run with: -XX:+PrintGCDetails
public class LocalVarGC {
    public void localvarGC1() {
        byte[] buffer = new byte[10 * 1024 * 1024]; // 10MB
        System.gc();
    }

    public void localvarGC2() {
        byte[] buffer = new byte[10 * 1024 * 1024];
        buffer = null;
        System.gc();
    }

    public void localvarGC3() {
        {
            byte[] buffer = new byte[10 * 1024 * 1024];
        }
        System.gc();
    }

    public void localvarGC4() {
        {
            byte[] buffer = new byte[10 * 1024 * 1024];
        }
        int value = 10;
        System.gc();
    }

    public void localvarGC5() {
        localvarGC1();
        System.gc();
    }

    public static void main(String[] args) {
        LocalVarGC local = new LocalVarGC();
        local.localvarGC1();
    }
}
With JVM flags -Xms256m -Xmx256m -XX:+PrintGCDetails -XX:PretenureSizeThreshold=15m, observe the GC logs (details vary by environment):
- localvarGC1(): The byte[] buffer survives the GC and is promoted to the old generation because the variable buffer is still in scope (in the local variable table) when System.gc() is called.
- localvarGC2(): By setting buffer = null, the array becomes unreachable and is reclaimed during GC.
- localvarGC3(): Despite exiting the block scope, the array remains referenced because the local variable table slot for buffer is still occupied. The bytecodes reveal that the variable is not cleared until the method returns.
- localvarGC4(): After the block, the variable value is declared, which reuses the same slot in the local variable table that was previously used by buffer. This effectively makes the array unreachable, so it is collected.
- localvarGC5(): After localvarGC1() returns, the buffer variable is completely out of scope, so the array is fully eligible for collection.
Memory Leaks and OutOfMemoryError
OutOfMemoryError (OOM)
OOM occurs when there is insufficient free memory and the garbage collector cannot reclaim enough space. Common causes:
- Insufficient heap size: The heap may be too small for the application's needs (-Xms, -Xmx).
- Uncollectable large objects: Many large objects with active references accumulate.
- Permanent generation / Metaspace exhaustion: In older JDKs, the permanent generation could fill with classes and interned strings. With Metaspace, similar OOM errors occur (java.lang.OutOfMemoryError: Metaspace).
- Direct memory exhaustion: Off-heap memory can also cause OOM.
Before throwing OOM, the JVM makes a best-effort attempt to reclaim memory, for example by running a garbage collection and clearing soft references. However, an allocation that can never fit (such as a single object larger than the maximum heap) triggers OOM immediately, without a GC attempt.
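To relate the heap-size cause above to something observable, here is a minimal sketch that reads the heap limits through the standard Runtime API. The class name HeapHeadroom and the method availableBytes() are made up for this illustration, and the numbers are approximations that change from moment to moment.

```java
class HeapHeadroom {
    /** Approximate bytes still available before the heap cap is reached. */
    static long availableBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory(); // occupied part of the committed heap
        return rt.maxMemory() - used;                   // maxMemory() reflects -Xmx or the default cap
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        System.out.println("max heap (cf. -Xmx): " + rt.maxMemory() / mb + " MB");
        System.out.println("committed heap:      " + rt.totalMemory() / mb + " MB");
        System.out.println("approx. available:   " + availableBytes() / mb + " MB");
    }
}
```

Running it with different -Xms and -Xmx values shows how the cap changes. A value that keeps shrinking toward zero under steady load is a hint to look at the causes listed above.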
Memory Leaks
A strict memory leak occurs when objects are no longer used by the application but cannot be reclaimed by the GC. More broadly, any bad practice that extends object lifetimes unnecessarily can lead to OOM and is often called a memory leak.
Common sources:
- Singletons: They persist for the application's lifetime and can hold references to external objects, preventing their collection.
- Unclosed resources: Database connections, sockets, and I/O streams must be closed explicitly via close(). Otherwise, the associated objects remain in memory.
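Both sources listed above can be sketched in a few lines. The names below (LeakSketch, REGISTRY, TrackedResource) are hypothetical and exist only for this illustration: a static collection stands in for a long-lived singleton, and a tiny AutoCloseable stands in for a connection or stream.

```java
import java.util.ArrayList;
import java.util.List;

class LeakSketch {
    // Hypothetical long-lived registry: everything added here stays strongly reachable.
    static final List<byte[]> REGISTRY = new ArrayList<>();

    /** Leaky: the buffer outlives the request because the static list still references it. */
    static void handleRequestLeaky() {
        byte[] buffer = new byte[1024];
        REGISTRY.add(buffer); // never removed, so the GC cannot reclaim it
    }

    /** Hypothetical resource that records whether close() was called. */
    static class TrackedResource implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** try-with-resources calls close() automatically, even if the body throws. */
    static boolean useAndClose() {
        TrackedResource resource = new TrackedResource();
        try (TrackedResource r = resource) {
            // work with r here
        }
        return resource.closed;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            handleRequestLeaky();
        }
        System.out.println("buffers still retained: " + REGISTRY.size()); // prints 100
        System.out.println("resource closed: " + useAndClose());          // prints true
    }
}
```

The usual remedies are to remove entries from long-lived collections when they are no longer needed and to manage closeable resources with try-with-resources.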
Stop-the-World (STW)
Stop-the-World refers to the pause in application threads during a GC event. During this pause, all application threads are suspended, causing a temporary freeze.
STW is required for root enumeration in reachability analysis. To produce a consistent snapshot of the object graph, the execution must appear frozen; otherwise, changing reference relationships would invalidate the analysis. This pause happens in all garbage collectors, although modern collectors like G1 minimize its duration.
STW is automatically initiated and completed by the JVM. Developers should avoid calling System.gc() because it triggers a Full GC and consequently an STW pause.
Example: Observing STW
import java.util.ArrayList;
import java.util.List;

public class StopTheWorldDemo {
    public static class WorkThread extends Thread {
        List<byte[]> list = new ArrayList<>();

        @Override
        public void run() {
            try {
                while (true) {
                    for (int i = 0; i < 1000; i++) {
                        byte[] buffer = new byte[1024];
                        list.add(buffer);
                    }
                    if (list.size() > 10000) {
                        list.clear();
                        System.gc(); // Triggers Full GC and STW
                    }
                }
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

    public static class PrintThread extends Thread {
        public final long startTime = System.currentTimeMillis();

        @Override
        public void run() {
            try {
                while (true) {
                    long t = System.currentTimeMillis() - startTime;
                    // Zero-pad the milliseconds so that 1005 ms prints as 1.005, not 1.5
                    System.out.printf("%d.%03d%n", t / 1000, t % 1000);
                    Thread.sleep(1000);
                }
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {
        WorkThread w = new WorkThread();
        PrintThread p = new PrintThread();
        w.start();
        p.start();
    }
}
Without the WorkThread, the PrintThread prints roughly every 1 second. With the WorkThread running, the intervals become irregular (e.g., 1.4 seconds), demonstrating the STW pause.
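Irregular print intervals are only indirect evidence. As a rough complement, the standard java.lang.management API exposes per-collector counters. The sketch below (the class name GcPauseStats is made up) sums the accumulated collection time before and after an explicit request. Note that this figure is an approximate elapsed time reported by each collector and, for concurrent collectors, is not identical to pure STW time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcPauseStats {
    /** Sums the accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcMillis();
        System.gc(); // a request; it can be disabled, e.g. with -XX:+DisableExplicitGC
        long after = totalGcMillis();
        System.out.println("GC time around the explicit request: " + (after - before) + " ms");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

The collector names printed at the end depend on which collector the JVM selected, so the output differs between environments.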
Concurrency and Parallelism in GC
- Concurrency: Multiple tasks make progress within the same time period, potentially interleaving on a single CPU. In the GC context, concurrent collection means GC threads do their work while user threads keep running, either interleaved on one core or truly in parallel on several.
- Parallelism: Multiple tasks execute simultaneously on different CPU cores. In GC, parallel collection uses multiple GC threads but still pauses user threads (e.g., ParNew, Parallel Scavenge).
- Serial: Single-threaded GC (e.g., Serial, Serial Old).
- Concurrent Collectors: CMS and G1 run parts of the collection cycle (such as marking) concurrently with user threads, although they still need short STW pauses for other phases.
HotSpot Algorithm Implementation Details
Root Enumeration
All garbage collectors must pause user threads during root enumeration to ensure a consistent snapshot. HotSpot uses OopMaps to avoid scanning the entire method area and execution context. When a class is loaded, HotSpot computes the offsets of reference fields within its objects; during JIT compilation, it also records which stack slots and registers hold references at specific code positions. This allows the GC to quickly locate GC Roots without a full scan.
Safepoints and Safe Regions
Safepoints are specific points in program execution where GC can safely pause threads. Choosing the right safepoints is critical: too few cause long GC wait times; too many degrade performance. Common safepoints include method calls, loop back edges, and exception throws.
Two approaches to bring all threads to a safepoint:
- Preemptive: Interrupt all threads; if a thread is not at a safepoint, resume it until it reaches one. (No longer used.)
- Voluntary / Polling: Set a global flag; threads periodically check the flag and suspend themselves when at a safepoint.
Safe Regions extend safepoint logic to threads that are not executing, such as sleeping or blocked threads. A thread entering a safe region signals the JVM that it is safe for GC to proceed. When leaving the safe region, the thread checks if GC has finished enumerating roots; if not, it waits.
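The polling approach can be imitated in plain Java to make the handshake concrete. This is a simulation under stated assumptions, not how HotSpot implements it: the flag safepointRequested, the poll() method, and the class SafepointPollingSketch are all invented for the illustration, and a monitor stands in for the JVM's internal thread suspension.

```java
import java.util.concurrent.atomic.AtomicLong;

class SafepointPollingSketch {
    static final Object LOCK = new Object();
    static volatile boolean safepointRequested; // stand-in for the JVM's global flag
    static volatile boolean running;
    static int parked;                          // guarded by LOCK
    static final AtomicLong work = new AtomicLong();

    /** Poll placed at a "safepoint" (here: each loop back edge). */
    static void poll() throws InterruptedException {
        if (!safepointRequested) return;        // fast path: a single volatile read
        synchronized (LOCK) {
            parked++;
            LOCK.notifyAll();                   // tell the requester this thread has stopped
            while (safepointRequested) LOCK.wait();
            parked--;
        }
    }

    /** Parks all workers, checks that no work happens meanwhile, then resumes them. */
    static boolean stopTheWorldOnce(int workers) {
        running = true;
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    while (running) {
                        work.incrementAndGet(); // simulated application work
                        poll();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].setDaemon(true);
            threads[i].start();
        }
        try {
            boolean frozen;
            synchronized (LOCK) {
                safepointRequested = true;
                while (parked < workers) LOCK.wait(); // wait until every worker has parked
                long snapshot = work.get();
                Thread.sleep(20);                     // a collector would inspect the heap here
                frozen = (snapshot == work.get());
                safepointRequested = false;
                LOCK.notifyAll();                     // resume the workers
            }
            running = false;
            for (Thread t : threads) t.join();
            return frozen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("workers frozen at safepoint: " + stopTheWorldOnce(4)); // prints true
    }
}
```

The requester cannot proceed until the slowest worker reaches its next poll, which is why sparse safepoints lengthen the wait before a collection can start.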
Remembered Sets and Card Tables
Generational collectors need to handle cross-generation references (e.g., an old-gen object referencing a young-gen object). To avoid scanning the entire old generation, the collector maintains a Remembered Set, an abstract data structure that records pointers from non-collected regions into collected regions.
A common implementation of a remembered set is the Card Table. It maps the heap into fixed-size cards (typically 512 bytes) and uses a byte array to track which cards might contain cross-generation pointers. During a young collection, the collector only scans memory regions indicated by dirty cards, greatly reducing the overhead of root scanning.
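The card-marking idea can be sketched with a byte array and a shift. Everything here is illustrative: the class CardTableSketch, the CLEAN/DIRTY encoding, and the plain long "addresses" are assumptions chosen for readability, and HotSpot's real encoding and write barrier differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

class CardTableSketch {
    static final int CARD_SHIFT = 9;         // 2^9 = 512-byte cards
    static final byte CLEAN = 0, DIRTY = 1;  // encoding chosen for readability only

    final long heapBase;
    final byte[] cards;

    CardTableSketch(long heapBase, long heapSize) {
        this.heapBase = heapBase;
        this.cards = new byte[(int) ((heapSize + 511) >>> CARD_SHIFT)]; // all CLEAN initially
    }

    /** Write-barrier stand-in: called after a reference field at fieldAddress is updated. */
    void markDirty(long fieldAddress) {
        cards[(int) ((fieldAddress - heapBase) >>> CARD_SHIFT)] = DIRTY;
    }

    /** Start addresses of the cards a young collection would have to scan. */
    List<Long> dirtyCardStarts() {
        List<Long> result = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == DIRTY) {
                result.add(heapBase + ((long) i << CARD_SHIFT));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(0x1000, 4096); // 8 cards
        table.markDirty(0x1000 + 10);   // card 0
        table.markDirty(0x1000 + 500);  // still card 0
        table.markDirty(0x1000 + 1030); // card 2
        System.out.println(table.dirtyCardStarts()); // prints [4096, 5120]
    }
}
```

Three writes dirty only two cards, so the collector scans 1 KB instead of the whole 4 KB region. The precision lost by tracking cards rather than individual pointers is the price for a table this small.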
Reference Types
Java extends the concept of references beyond strong references to include soft, weak, and phantom references, each with different reachability and GC behavior.
Strong References
- The default reference type (Object obj = new Object()).
- As long as a strong reference exists, the object is never reclaimed, even if an OOM occurs.
- The primary cause of memory leaks.
public class StrongReferenceTest {
    public static void main(String[] args) {
        StringBuffer str = new StringBuffer("Hello");
        StringBuffer str1 = str;
        str = null;
        System.gc();
        System.out.println(str1); // Prints "Hello"
    }
}
Soft References
- Collected when memory is insufficient (before OOM).
- Suitable for memory-sensitive caches.
- Created via SoftReference<T>.
SoftReference<User> userSoftRef = new SoftReference<>(new User(1, "songhk"));
Example with JVM flag -Xms10m -Xmx10m:
- While memory is sufficient, the soft-referenced object survives GC.
- When memory runs low, the soft reference is cleared before throwing OOM.
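A memory-sensitive cache built on this behavior could look like the sketch below. The class SoftCacheSketch and its loader function are hypothetical; the point is only the pattern of checking get() for null and reloading. The sketch is not thread-safe and never removes cleared entries from the map.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class SoftCacheSketch<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader; // hypothetical: recomputes a value on a miss

    SoftCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if it was never cached or the GC cleared it. */
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref != null) ? ref.get() : null; // get() returns null once the referent is cleared
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value; // the caller now holds a strong reference
    }

    public static void main(String[] args) {
        SoftCacheSketch<Integer, String> cache = new SoftCacheSketch<>(id -> "user-" + id);
        System.out.println(cache.get(1)); // prints user-1
    }
}
```

Copying the result of get() into a local variable before testing it matters: checking ref.get() != null and then calling ref.get() again leaves a window in which the GC may clear the reference.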
Weak References
- Collected at the next GC cycle regardless of available memory.
- Created via WeakReference<T>.
- Suitable for canonicalized mappings (e.g., WeakHashMap).
WeakReference<User> userWeakRef = new WeakReference<>(new User(1, "songhk"));
System.gc();
System.out.println(userWeakRef.get()); // Typically prints null once the GC has run
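For the WeakHashMap case mentioned above, a small sketch (class and method names are made up) shows the two sides of the behavior: an entry stays as long as its key is strongly reachable elsewhere, and it becomes eligible for removal once the key is not. Only the first half is deterministic, since System.gc() is a request.

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakMapSketch {
    /** True if the entry is still present after a GC request while the key is strongly held. */
    static boolean entrySurvivesWhileKeyIsHeld() {
        Map<Object, String> metadata = new WeakHashMap<>();
        Object key = new Object();
        metadata.put(key, "metadata for key");
        System.gc();
        return metadata.containsKey(key); // key is still strongly reachable through the local variable
    }

    public static void main(String[] args) {
        System.out.println("entry kept while key is held: " + entrySurvivesWhileKeyIsHeld());

        Map<Object, String> metadata = new WeakHashMap<>();
        Object key = new Object();
        metadata.put(key, "metadata for key");
        key = null;  // the map's own reference to the key is weak
        System.gc(); // a request only
        System.out.println("entries after dropping the key: " + metadata.size()); // usually 0, not guaranteed
    }
}
```

Values are held strongly by the map, so a value that refers back to its own key keeps the entry alive indefinitely.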
Phantom References
- The weakest reference type. The get() method always returns null.
- Must be used with a ReferenceQueue.
- Purpose: receive a notification when an object is about to be reclaimed.
PhantomReference<Object> phantomRef = new PhantomReference<>(obj, queue);
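The one-liner above leaves obj and queue undefined. A fuller sketch, with the class name PhantomSketch and the one-second timeout chosen arbitrarily for the illustration, might look like this. Whether the notification arrives within the timeout depends on the collector.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

class PhantomSketch {
    /** get() on a phantom reference returns null even while the referent is strongly reachable. */
    static boolean getReturnsNull() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object obj = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(obj, queue);
        return ref.get() == null && obj != null;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object obj = new Object();
        PhantomReference<Object> phantomRef = new PhantomReference<>(obj, queue);

        System.out.println(phantomRef.get()); // prints null, although obj is still alive

        obj = null;  // drop the only strong reference
        System.gc(); // a request only; the collector decides when to act

        // Wait up to one second for the notification instead of blocking forever.
        Reference<?> enqueued = queue.remove(1000);
        System.out.println(enqueued == phantomRef
                ? "phantom reference enqueued"
                : "no notification within 1s");
    }
}
```

The PhantomReference object itself must stay strongly reachable (here through the local variable); if it is collected first, nothing is ever enqueued.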
Finalizer References
- Implement finalize() behavior.
- A Finalizer thread processes the reference queue, invoking finalize(). The object is not reclaimed until a second GC cycle.
- Not recommended for explicit use.