jvm-runtime-diagnostics

star 0

Triage JVM runtime incidents with stack traces, thread dumps, jcmd, JFR, and memory-pressure evidence. Use when diagnosing deadlocks, analyzing thread dumps, capturing JFR recordings, interpreting `jcmd` output, or classifying runtime symptoms as blocking, contention, memory pressure, or startup failure.

ririnto By ririnto schedule Updated 6/11/2026

name: jvm-runtime-diagnostics description: >- Triage JVM runtime incidents with stack traces, thread dumps, jcmd, JFR, and memory-pressure evidence. Use when diagnosing deadlocks, analyzing thread dumps, capturing JFR recordings, interpreting jcmd output, or classifying runtime symptoms as blocking, contention, memory pressure, or startup failure.

JVM Runtime Diagnostics

Goal

Triage JVM runtime problems with standard JDK diagnostic tools and an evidence-first workflow. Collect the smallest next capture that reduces uncertainty before naming a root cause. Prefer jcmd first for live JVMs, keep jstack and jmap as legacy or narrower-purpose tools, and reserve jhsdb for Serviceability Agent cases such as core dumps or deeper postmortem inspection.

Treat JDK 8, 11, 17, 21, and 25 as the supported LTS reference line for this skill, and confirm runtime-specific command availability on the target JVM before assuming a newer flag or event exists.

Treat JFR as the standard low-overhead path on JDK 11 and later. On JDK 8, do not assume JFR is ordinarily available: verify the exact Oracle JDK 8 commercial-feature and licensing posture before recommending JFR.start or -XX:StartFlightRecording, and prefer thread dumps plus other low-risk captures when that requirement is not clearly satisfied.

Common-Case Workflow

  1. Read the evidence already on hand first: stack trace, logs, thread dump, JFR, or command output.
  2. Identify the dominant symptom: blocking, contention, startup failure, memory pressure, crash, or slow path.
  3. Start with jcmd to confirm the target JVM, the available commands, and the least invasive next capture.
  4. Use jstack or jmap only when you are on an older/legacy workflow or need a specific legacy shape, and use jhsdb when you need SA-style inspection, core-dump analysis, or attach alternatives beyond routine jcmd workflows.

Minimal Setup

Run diagnostics on the same machine as the target JVM and, for attach-based tools, with the same effective user/group as the target process.

Tool-selection baseline:

  • jcmd for normal live-process diagnostics
  • jstack and jmap only when their narrower legacy output shape is specifically useful
  • jhsdb for core dumps, SA debugger flows, or when live attach is not the normal jcmd path

Identify the JVM first:

jcmd -l

[!NOTE]

jcmd -l lists processes visible to the current user context. In container environments, process visibility may be restricted by namespace boundaries. Ensure you are querying from the correct user or namespace context.

First Runnable Commands or Code Shape

Start with the lowest-risk command sequence:

jcmd -l
jcmd <pid> help
jcmd <pid> help Thread.print
jcmd <pid> VM.version
jcmd <pid> VM.command_line
jcmd <pid> VM.flags
jcmd <pid> Thread.print -l

Use when: the symptom is real but you do not yet know which deeper tool is justified.

Ready-to-Adapt Templates

Lightweight first triage:

jcmd -l
jcmd <pid> VM.version
jcmd <pid> VM.flags
jcmd <pid> Thread.print -l

Use when: you have only a vague "the JVM is slow" report or one incomplete stack trace.

Single thread dump with lock detail:

jcmd <pid> Thread.print -l > thread-dump.txt

Use when: the issue looks like blocking, deadlock, starvation, or lock contention and you need the first snapshot.

[!WARNING]

Thread dumps can contain thread names, stack traces, class names, and other runtime details that may expose request paths or internal system structure. Write thread dumps to a restricted diagnostics path rather than a shared working directory, and clean them up after analysis like other sensitive captures.

Low-overhead JFR start for a running JVM:

jcmd <pid> JFR.start name=baseline settings=default disk=true maxage=6h
jcmd <pid> JFR.check

Use when: the issue depends on time-based evidence such as CPU, allocation, I/O, or lock behavior.

Startup-attached JFR:

java -XX:StartFlightRecording=name=startup,settings=profile,filename=/path/to/private-diagnostics/startup.jfr,dumponexit=true -jar app.jar

Use when: the problem happens during startup, very early request handling, or any phase that might be missed if JFR starts later via jcmd.

[!IMPORTANT]

JFR recordings can contain stack traces, class names, request metadata, and other sensitive runtime details. Prefer a private diagnostics directory with restrictive permissions instead of a shared location such as /tmp, and clean up captures promptly after analysis.

For JDK 8, treat JFR as a special-case workflow, not the default path. Verify that the target runtime and operational policy actually permit Flight Recorder before recommending these commands.

Heap-oriented escalation:

jcmd <pid> GC.heap_info
jcmd <pid> GC.class_histogram

[!NOTE]

GC.class_stats was removed in JDK 15 and should not be treated as a current default diagnostic command. Use GC.class_histogram for heap class analysis on modern JVMs.

Use when: the symptom is memory growth, allocation pressure, or suspected heap retention.

Native-memory escalation:

jcmd <pid> VM.native_memory summary

Use when: RSS or container memory keeps growing but heap evidence alone does not explain the pressure.

Important setup note:

  • Native Memory Tracking must be enabled at JVM startup with -XX:NativeMemoryTracking=summary or -XX:NativeMemoryTracking=detail

Legacy jstack thread dump:

jstack -l <pid>

Use when: you need the traditional jstack output shape or are working in an older operational workflow that still documents jstack.

Legacy jmap histogram and dump:

jmap -histo <pid>
jmap -dump:live,format=b,file=heap.hprof <pid>

Use when: you specifically need the standalone jmap form instead of the newer jcmd command family.

[!WARNING]

Heap dumps are highly sensitive artifacts and can contain credentials, tokens, session state, and PII. Write them only to restricted paths, transfer them over approved secure channels, and delete them as soon as the investigation allows.

Postmortem jhsdb core analysis:

jhsdb jstack --exe "$JAVA_HOME/bin/java" --core /path/to/core
jhsdb jmap --exe "$JAVA_HOME/bin/java" --core /path/to/core --heap

Use when: the JVM has already crashed or you need core-file inspection rather than routine live attach.

Validate the Result

Validate the common path with these checks:

jcmd <pid> help Thread.print
jcmd <pid> JFR.check
  • Thread.print -l completes and produces thread state plus lock detail.
  • JFR.check confirms the recording name and (running) status before claiming JFR is active.
  • GC.class_histogram reflects the expected process, not the wrong PID.
  • VM.native_memory confirmed Native Memory Tracking was enabled at startup.
    • Comparing like-for-like captures.

Tool-choice checks (decision rationale before deeper escalation):

  • jstack or jmap chosen with a specific reason not to use the jcmd equivalent.
  • jhsdb chosen for postmortem, core-based, or explicitly SA-oriented case, not normal live-process triage.

Format-Critical Output Shapes

jcmd -l Output (JVM Discovery)

12345 com.example.App /opt/app/app.jar

Read: PID = 12345, main class = com.example.App, launch path = /opt/app/app.jar. Use this PID for all subsequent jcmd <pid> commands.

If no JVMs appear, either no JVM is running, or the current user/namespace cannot see the target process.

Thread.print -l Thread Dump Shape

Each thread block follows this structure:

"http-nio-8080-exec-42" #284 daemon prio=5 os_prio=0 cpu=120.00ms elapsed=320.50s tid=0x00007f9a2c01e800 nid=0x4e03 waiting on condition [0x00007f99c4bfe000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006f1a9d040> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:257)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:453)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1065)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:840)
   Locked ownable synchronizers:
    - None

Key fields per thread:

Field Location How to read it
Thread name "http-nio-8080-exec-42" Application thread naming convention; pool threads show pattern
Thread state java.lang.Thread.State: TIMED_WAITING (parking) See state table below
CPU time cpu=120.00ms Total CPU consumed by this thread since start
Elapsed time elapsed=320.50s Wall-clock time since thread started
Native ID nid=0x4e03 OS-level thread ID for top -H or strace -p correlation
Stack trace indented lines below state Most recent call first; native frames show (Native Method)
Lock info Locked ownable synchronizers: - None means no ReentrantLock held

Thread states and what they mean:

State Normal when... Concern when...
RUNNABLE Thread is actively executing CPU work Many threads RUNNABLE + high CPU = saturation
BLOCKED Waiting to enter a synchronized block Same lock contested by many threads = contention
TIMED_WAITING Sleeping, parked with timeout (Thread.sleep, parkNanos) Stuck in TIMED_WAITING indefinitely = hung operation
WAITING Waiting indefinitely (Object.wait(), LockSupport.park) Never recovers = deadlock or missed signal
NEW / TERMINATED Thread not yet started or already finished Large counts of NEW threads = thread leak

GC.heap_info Output

Min heap alignment: 4096 KiB
G1 Heap:
   num_regions: 512
   Heap size: 1048576K
   Free regions: 200
   Young regions: 180
   Eden regions: 160
   Survivor regions: 20
   Old regions: 132

Read: Total heap size vs free/young/old region distribution. If Free regions drops low during load, heap pressure exists.

GC.class_histogram Output (Top Lines)

num     #instances         #bytes  class name
----------------------------------------------
    1:          48000     15360000  [B
    2:          32000      8320000  [Ljava.lang.Object;
    3:          25000      6000000  com.example.model.Data
    4:           5000      2400000  java.util.concurrent.ConcurrentHashMap$Node

Read: rank → instance count → total bytes → class name. [B = byte arrays, [L...; = object arrays. Focus on top 5–10 classes by bytes. If a single application class dominates, investigate retention in that class.

VM.native_memory summary Output

Native Memory Tracking:

Total: reserved=1024MB, committed=512MB

- Java Heap (reserved=512MB, committed=256MB)
- Class (committed=64MB)
- Thread (committed=48MB # of 24 threads)
- Code (committed=32MB)
- GC (committed=96MB)
- Internal (committed=16MB)
- Symbol (committed=12MB)
- Native Memory Tracking (committed=8MB)
- Compiler (committed=24MB)
- Internationalization (committed=4MB)

Read: If total committed approaches container limit but Java Heap is small, non-heap categories (Code, GC, Thread, Compiler) may be the real pressure source. # of N threads shows live thread count.

JFR.check Output

Recording 1: name=baseline maxage=6 h (running)

Read: Match the recording name, confirm (running) before claiming the capture is active, and verify maxage matches the intended retention window. If the runtime also reports a destination or path, confirm it points to the restricted diagnostics location you intended.

References

If the blocker is... Read...
deciding which jcmd command family to use on a live JVM ./references/jcmd-commands.md
deciding whether JDK 8-era jstack or jmap guidance is still justified ./references/jdk8-legacy-tools.md
capturing repeated thread dumps, starting JFR, or deciding between snapshot and time-based evidence ./references/thread-dumps-jfr.md
using Serviceability Agent tools such as jhsdb for core files or deeper attach workflows ./references/jhsdb.md

Invariants

  • MUST start from currently available evidence such as stack traces, logs, or previous captures.
  • MUST prefer jcmd before legacy jstack or jmap for common runtime diagnostics.
  • MUST reserve jhsdb for cases that truly need Serviceability Agent behavior.
  • MUST choose the smallest next tool that reduces uncertainty.
  • MUST distinguish evidence from guesswork.
  • SHOULD prefer low-risk evidence collection first.
  • MUST explain what each command reveals before suggesting it.
  • MUST NOT recommend heap dumps or deep profiling unless the symptom justifies the operational cost.
  • SHOULD check native-memory evidence before blaming all memory growth on the Java heap.

Common Pitfalls

Anti-pattern Why it fails Correct move
jumping straight to a heap dump high cost and often unnecessary for the first pass start with Thread.print, GC.heap_info, or JFR
reading one thread dump as a full story one snapshot can hide transient blocking or scheduler effects capture repeated dumps and compare them
using jstack or jmap by default current JDK guidance favors jcmd for common attach operations prefer jcmd <pid> help and then the specific subcommand
using jhsdb for ordinary live triage Serviceability Agent attach is heavier and official docs warn about hangs/crashes on detach keep jhsdb for core dumps or SA-specific workflows
assuming RSS growth must be heap growth non-heap memory such as metaspace, code cache, arenas, or thread stacks can dominate use VM.native_memory when the heap story does not fit
collecting evidence without naming the symptom tool choice becomes random classify the symptom first, then choose the smallest matching capture

Scope Boundaries

  • Activate this skill for: stack trace and thread-dump interpretation choosing the next JVM runtime diagnostic command low-risk runtime evidence collection with jcmd and JFR
  • Do not use this skill as the primary source for:
    • GC collector selection or GC logging strategy
    • Java language design or test-structure decisions
    • general JDK packaging and module workflows
Install via CLI
npx skills add https://github.com/ririnto/sinon --skill jvm-runtime-diagnostics
Repository Details
star Stars 0
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator