name: docker-jfr-benchmark-loop description: Run a repeatable RDF4J performance loop against one JMH benchmark in Docker with Linux Java 26 and JFR CPU-time profiling. Use when working in this repo on benchmark-guided performance changes, hotspot triage, JFR reading, CPU bottleneck analysis, or repeated baseline, fix, and rerun loops. Trigger on requests mentioning benchmark, profiling, JFR, hotspot, perf loop, CPU bottleneck, or Docker benchmark runs in RDF4J.
Docker JFR Benchmark Loop
Use this skill for one-benchmark perf work in this repo. Default runner: scripts/run-docker-jfr-loop.sh, not ad hoc Maven or raw JMH commands.
Quick start
Dry-run a known selector:
.codex/skills/docker-jfr-benchmark-loop/scripts/run-docker-jfr-loop.sh \
org.eclipse.rdf4j.model.benchmark.ValueCreationBenchmark.createBNode \
--dry-run
Explicit selector plus params:
.codex/skills/docker-jfr-benchmark-loop/scripts/run-docker-jfr-loop.sh \
--module core/model \
--class org.eclipse.rdf4j.model.benchmark.ValueCreationBenchmark \
--method createBNode \
--param samples=1000
Repo-grounded defaults
- The wrapper calls repo helper
scripts/run-single-benchmark-docker.sh. - The Docker helper already forces Linux Java 26 plus
--enable-jfr --enable-jfr-cpu-times. - The inner helper already enforces:
settings=profiledumponexit=trueduration=120swarmup=0measurement=10iterations of10sforks=1jdk.CPUTimeSample#enabled=truereport-on-exit=cpu-time-hot-methods
- This skill wrapper adds the missing fidelity flags:
-XX:FlightRecorderOptions=stackdepth=1024,samplethreads=true-XX:+UnlockDiagnosticVMOptions-XX:+DebugNonSafepoints
Core loop
- Pick one benchmark selector. Keep selector, params, Docker image, and profiling flags constant for the whole comparison.
- Capture a baseline run with
scripts/run-docker-jfr-loop.sh. - Read the
.jfrusing references/jfr-reading.md. - Choose one candidate fix with material CPU share. Prefer the fix most likely to move total runtime, not just local self time.
- Re-run the exact same selector and params.
- Compare benchmark delta plus hotspot shift.
- Repeat until:
- the hotspot shifts,
- CPU share falls below a meaningful threshold,
- or GC / locks / memory / I/O / JIT behavior dominates instead.
Operating rules
- Route benchmark variation through
--param. - Route JVM tuning through
--jvm-arg. - Do not use raw
--jmh-argduring JFR runs; the helper rejects extra JMH args when JFR is enabled. - Do not switch to a second
StartFlightRecording; stay aligned with repo helper behavior. - Do not treat a small benchmark gain as proof the hotspot fix failed. First check whether another bottleneck surfaced and now caps the total speedup.
- If the run is not CPU-bound, say so and pivot to lock, GC, memory, I/O, or JIT evidence instead of forcing a CPU-only story.
Output contract
Answers using this skill should usually include:
- exact run command
.jfrpath- top CPU hotspots
- confidence notes
- next-fix hypothesis
- whether a second bottleneck likely masks part of the gain
Escalate cleanly
- Use high-performance-java for algorithm, data-structure, and code-shape changes.
- Use hotspot-jit-forensics when inlining, tiering, C2, or codegen behavior is in question.
- Use jmh-benchmark-compare for multi-run diffs, history, and exported reports.
Reference map
- Loop workflow: references/perf-loop.md
- JFR reading guide: references/jfr-reading.md