name: build-jvm-analysis description: "JVM runtime analysis. Use when: profiling JVM performance, tuning GC, debugging memory leaks, finding dead code with production data. Not for general Java development or syntax questions."
JVM Analysis
Patterns for analyzing, optimizing, and debugging JVM applications — both static and runtime.
Philosophy
- Measure before tuning — profile first, optimize based on evidence
- Static + runtime complement — static finds structural issues, runtime finds behavioral issues
- Production-safe tooling — low overhead profilers that won't crash or slow production
- Understand tool limitations — safepoint bias, entry point coverage, what tools can't see
- Right tool for the job — no single tool does everything
Static Analysis
Dead Code Detection
Finding unused code?
├── Have production traffic → Runtime analysis (Scavenger)
├── Need static-only analysis →
│ ├── Simple → ProGuard -printusage
│ └── Custom analysis → SootUp or ProGuard Core
└── Just bug patterns → SpotBugs, Error Prone
| Tool | Type | What It Does | When to Use |
|---|---|---|---|
| Scavenger | Runtime | Tracks actual usage in production | Have production data |
| ProGuard -printusage | Static | Lists unreachable from entry points | Know your entry points |
| SootUp | Library | Call graphs, data flow analysis | Building custom analysis |
| ProGuard Core | Library | Bytecode analysis primitives | Building custom tools |
| Dead Code Agent | Runtime | Tracks class loading | Quick prototype |
Key insight: No turnkey "run this, get dead code" CLI exists. You either:
- Deploy runtime agents (needs production) — Scavenger
- Configure static analysis (needs entry points) — ProGuard
- Build custom tooling (needs engineering) — SootUp
SootUp for Code Analysis
SootUp provides call graph construction and analysis primitives. Useful for:
- Building reachability analysis (find what's called from entry points)
- Data flow analysis (track how values propagate)
- Dependency mapping (what calls what)
(Karakaya et al., TACAS 2024)
Gotcha: SootUp is a library, not a tool. You build analysis on top of it.
Bug Detection (Not Dead Code)
| Tool | Focus | Integration |
|---|---|---|
| SpotBugs | Bug patterns (400+) | Gradle/Maven, CI |
| Error Prone | Compile-time checks | javac plugin |
| NullAway | Null safety | Error Prone plugin |
Runtime Analysis
Profiler Selection
Need production profiling?
├── YES → CPU or memory?
│ ├── CPU → Need flame graphs?
│ │ ├── YES → async-profiler
│ │ └── NO → JFR (built-in, zero config)
│ └── Memory → JFR (allocation profiling)
└── NO (development only) → VisualVM or IntelliJ Profiler
| Profiler | Overhead | Safepoint-Free | Output | Tradeoff | Best For |
|---|---|---|---|---|---|
| async-profiler | ~2% CPU | Yes | Flame graphs, JFR | Requires native agent attachment | Production CPU/allocation |
| JFR + JMC | ~1-2% | Partial (improved Java 16+) | Binary events | Less granular CPU data | Continuous monitoring |
| VisualVM | 5-10% | No | Various | Safepoint bias distorts results | Development only |
| IntelliJ Profiler | ~2% | Yes (uses async-profiler) | Flame graphs | IDE dependency | IDE-integrated |
(InfoQ 2025)
Why safepoint-free matters: JVM can only safely inspect threads at safepoints. JVMTI-based profilers (VisualVM, hprof) miss code between safepoints, skewing flame graphs toward safepoint-heavy code. async-profiler uses AsyncGetCallTrace to sample anytime (Wakart 2016).
Garbage Collector Selection
Heap size?
├── < 4 GB → G1 (default since JDK 9)
│ WHY: ZGC/Shenandoah overhead not justified; G1's region-based collection efficient at this scale
├── 4-32 GB → Latency-sensitive?
│ ├── YES → ZGC or Shenandoah
│ │ WHY: Concurrent marking/compaction keeps pauses <10ms regardless of heap size
│ └── NO → G1
│ WHY: G1's mixed collections handle this range well; simpler tuning
└── > 32 GB → ZGC (generational, JDK 21+)
WHY: ZGC's concurrent compaction scales linearly; G1 pauses grow with heap
| Collector | Pause Target | Heap Size | Tradeoff | JDK | Best For |
|---|---|---|---|---|---|
| G1GC | 200ms (tunable) | Any | Pauses scale with heap | 9+ default | General workloads |
| ZGC | <1ms | Large (100GB+) | ~15% throughput cost vs G1 | 15+ prod, 21+ gen | Latency-critical |
| Shenandoah | <10ms | Large | Higher CPU for barriers | 12+ (Red Hat) | Low-latency, older JDKs |
| Parallel | Max throughput | Medium | Stop-the-world only | All | Batch processing |
Why the thresholds:
- <4GB: G1's region-based approach (2048 regions default) works well. ZGC's colored pointers and load barriers add overhead not justified at small scale (Oracle GC Tuning Guide).
- 4-32GB: The "compressed OOPs" boundary. Above 32GB, object pointers expand from 4 to 8 bytes, increasing memory footprint ~20%. ZGC handles this better (Shipilev 2019).
- >32GB: G1's pause times grow with live set size during mixed collections. ZGC's concurrent compaction maintains <1ms regardless (ZGC wiki, Oracle).
Key insight: ZGC generational (JDK 21+) closes the throughput gap — concurrent minor collections reduce allocation pressure (JEP 439).
(Oracle GC Tuning Guide, Shipilev JVM Anatomy Quarks, JEP 439)
Heap Dump Analysis
OOM or suspected leak?
├── Capture dump → -XX:+HeapDumpOnOutOfMemoryError
├── Analyze → Eclipse MAT or HeapHero
│ ├── Run "Leak Suspects" report
│ ├── Check retained heap (not just shallow)
│ └── Path to GC Roots (exclude weak refs)
└── Fix → Collections holding references, static fields, caches without eviction
Thread Dump Analysis
Application hanging or slow?
├── Capture → jstack -l <pid> (or jcmd <pid> Thread.print)
├── Take 3-5 dumps seconds apart
├── Analyze:
│ ├── BLOCKED threads → lock contention
│ ├── WAITING on same monitor → bottleneck
│ └── Same stack across dumps → stuck thread
└── Tools: FastThread.io, TDA, or manual grep
Production Gotchas
Safepoint Bias
- Trap: Traditional profilers (VisualVM, hprof) only sample at safepoints
- Impact: Misleading flame graphs — hot spots skewed to safepoint-heavy code
- Detection: Compare async-profiler output vs traditional profiler
- Fix: Use async-profiler or JFR (Java 16+ with JEP 376)
Why it happens: JVM can only safely inspect thread state at safepoints — points where all threads are known to be in a consistent state. JVMTI's GetStackTrace requires this. Safepoints occur at method returns, loop back-edges, and allocation. Tight loops without allocations may run millions of cycles between safepoints, becoming invisible (Wakart 2015).
(Wakart 2015, async-profiler docs)
DebugNonSafepoints Flag
- Trap: Even async-profiler needs
-XX:+DebugNonSafepointsfor accurate frame resolution - Impact: Inlined methods may not appear in profiles
- Fix: Start JVM with flag, or attach agent early
# At JVM start (recommended)
java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -agentpath:/path/to/libasyncProfiler.so ...
# Late attach works but misses already-compiled methods
Container Memory Limits (OOMKilled)
- Trap: JVM uses more memory than
-Xmx(native memory, metaspace, stacks, codecache) - Impact: Kubernetes kills pod with OOMKilled even when heap looks fine
- Detection:
kubectl describe podshows OOMKilled; native memory tracking shows usage - Fix: Budget 25-30% of container memory for non-heap
# Good: percentage-based, container-aware
java -XX:MaxRAMPercentage=75.0 -XX:+UseContainerSupport ...
# Budget breakdown for 2GB container:
# Heap: ~1.5GB (75%)
# Metaspace: ~100MB (default MaxMetaspaceSize unbounded, set explicitly)
# Thread stacks: ~100MB (100 threads × 1MB default Linux stack)
# CodeCache: ~50MB (240MB reserved, typically uses ~50MB)
# Native/JNI: ~150MB buffer (JDBC drivers, compression libs, etc.)
Why 25-30%: Empirical guidance from production incidents. Exact overhead depends on workload — NMT (Native Memory Tracking) gives precise breakdown for your app (Schatzl, Oracle GC team). Spring Boot apps with web frameworks often need closer to 30%; minimal services can use 20% (Datadog 2024).
(JEP 345, Datadog JVM Container Best Practices)
Heap Dump Performance Impact
- Trap: Heap dumps pause the JVM during capture
- Impact: Pause duration depends on heap size and live objects — observed 1-3 seconds per GB in production (SAP Memory Analyzer docs, Eclipse MAT wiki)
- Fix: Use continuous profiling (JFR) for allocation tracking; reserve dumps for post-mortem or during maintenance windows
Why the pause: Full GC + heap traversal + I/O. The JVM must walk all live objects to write the dump. Parallel GC can speed traversal but I/O often dominates (Eclipse MAT FAQ).
JMH Microbenchmark Pitfalls
- Trap: Dead code elimination, constant folding, insufficient warmup
- Impact: Benchmarks show 10x faster than production
- Detection: Results too good to be true;
-XX:+PrintCompilationshows unexpected inlining - Fix: Use
Blackhole.consume(),@Stateobjects, sufficient warmup
// Wrong: JIT may eliminate this
@Benchmark
public void bad() {
compute(); // No side effects, may be removed
}
// Correct: Blackhole prevents DCE
@Benchmark
public void good(Blackhole bh) {
bh.consume(compute());
}
Tool Selection by Scenario
| Scenario | Primary Tool | Alternative |
|---|---|---|
| Production CPU profile | async-profiler | JFR |
| Allocation hotspots | JFR | async-profiler --alloc |
| Memory leak | Heap dump + MAT | HeapHero |
| Deadlock | jstack -l | JMC thread analysis |
| GC issues | GC logs + GCViewer | JFR |
| Container sizing | NMT + metrics | VisualVM (dev) |
| Microbenchmarks | JMH | (no alternative) |
Quick Reference
Profiling Commands
# async-profiler CPU flame graph
./profiler.sh -d 30 -f flamegraph.html <pid>
# JFR recording (no overhead until dump)
jcmd <pid> JFR.start duration=60s filename=recording.jfr
# Heap dump
jcmd <pid> GC.heap_dump /path/to/dump.hprof
# Thread dump
jcmd <pid> Thread.print > threads.txt
# Native memory tracking
java -XX:NativeMemoryTracking=summary ...
jcmd <pid> VM.native_memory summary
GC Flags
# G1 (default, balanced)
-XX:+UseG1GC -XX:MaxGCPauseMillis=200
# ZGC (ultra-low latency, JDK 21+ generational default)
-XX:+UseZGC
# Shenandoah (low-latency, older JDKs)
-XX:+UseShenandoahGC
# Diagnostics
-Xlog:gc*:file=gc.log:time,tags
Container Flags
# Production container setup
-XX:+UseContainerSupport \
-XX:MaxRAMPercentage=75.0 \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=/dumps/ \
-XX:+ExitOnOutOfMemoryError
Specialized References
Load reference based on context:
| Detected | Load |
|---|---|
| Profiling, flame graphs, sampling | profiling.md |
| GC tuning, pause times, heap sizing | gc.md |
| Memory leaks, thread dumps, OOM | debugging.md |
| Kubernetes, containers, cgroups | containers.md |
Obsolete Patterns
| Obsolete | Replacement | Why |
|---|---|---|
-XX:+PrintGCDetails |
-Xlog:gc* |
Unified logging (JDK 9+) |
| VisualVM for production | async-profiler / JFR | Safepoint bias, overhead |
Manual -Xmx in containers |
MaxRAMPercentage |
Container-aware |
jmap -heap |
jcmd GC.heap_info |
jcmd preferred |
| hprof | JFR | hprof removed JDK 9+ |
| CMS | G1 or ZGC | CMS removed JDK 14 |
Anti-Patterns
| Don't | Do | Why |
|---|---|---|
| Profile with default VisualVM in prod | Use async-profiler or JFR | Safepoint bias, overhead |
Set -Xmx = container limit |
Leave 25-30% for non-heap | OOMKilled by cgroup |
| Trust microbenchmarks naively | Use JMH properly | JIT optimizations mislead |
| Tune GC without measuring | Profile first, tune second | Premature optimization |
Use -XX:+PrintGCDetails (JDK 9+) |
Use unified logging -Xlog:gc* |
Old flags deprecated |
| Ignore safepoint bias | Check -XX:+DebugNonSafepoints |
Hidden hot spots |