lading-optimize-review

star 98

Reviews optimization patches using a 5-persona peer review system. Requires unanimous approval backed by benchmarks.

DataDog By DataDog schedule Updated 3/3/2026

name: lading-optimize-review description: Reviews optimization patches using a 5-persona peer review system. Requires unanimous approval backed by benchmarks. argument-hint: "[bench] [fingerprint] [file] [target] [technique]" allowed-tools: Bash(cat:) Bash(sample:) Bash(samply:) Bash(cargo:) Bash(ci/:) Bash(hyperfine:) Bash(/payloadtool:) Bash(tee:) Read Glob Grep context: fork

Optimization Patch Review

A rigorous 5-persona peer review system for optimization patches in lading. Requires unanimous approval backed by concrete benchmark data. Duplicate Hunter persona prevents redundant work.

Role: Judge

Review is the decision-maker. It does NOT record results.

Review judges using benchmarks and 5-persona review, then returns a structured report.

Outcomes

Outcome Votes Action
APPROVED 5/5 APPROVE Return APPROVED report
REJECTED Any REJECT Return REJECTED report

Arguments

This skill requires 5 positional arguments passed by the caller:

Arg Field Example Used for
$ARGUMENTS[0] bench trace_agent cargo criterion --bench flag
$ARGUMENTS[1] fingerprint ci/fingerprints/trace_agent_v04/lading.yaml payloadtool config path
$ARGUMENTS[2] file lading_payload/src/trace_agent/v04.rs report + duplicate check
$ARGUMENTS[3] target V04::to_bytes report
$ARGUMENTS[4] technique buffer-reuse report + duplicate check

If any argument is missing -> REJECT. All 5 are required.

Generate Report ID

Derive the id from the file and technique arguments:

  1. Take the filename stem from $ARGUMENTS[2] (e.g. lading_payload/src/trace_agent/v04.rstrace-agent-v04)
  2. Append the technique $ARGUMENTS[4] (e.g. buffer-reuse)
  3. Join with -trace-agent-v04-buffer-reuse

Use this id in the report.


Phase 1: Benchmark Execution

Step 1: Read Baseline Data

Read the baseline benchmark files captured:

  • /tmp/criterion-baseline.log — micro-benchmark baseline
  • /tmp/baseline.json — macro-benchmark timing baseline
  • /tmp/baseline-mem.txt — macro-benchmark memory baseline

If baseline data is missing -> REJECT. Baselines must be captured before any code change and before this gets invoked.

Step 2: Run Post-Change Micro-benchmarks

cargo criterion --bench $ARGUMENTS[0] 2>&1 | tee /tmp/criterion-optimized.log

Note: Criterion automatically compares against the last run and reports percentage changes.

Compare results — look for "change:" lines showing improvement/regression.

Example output: time: [1.2345 ms 1.2456 ms 1.2567 ms] change: [-5.1234% -4.5678% -4.0123%]

Step 3: Run Post-Change Macro-benchmarks

cargo build --release --bin payloadtool
hyperfine --warmup 3 --runs 30 --export-json /tmp/optimized.json \
  "./target/release/payloadtool $ARGUMENTS[1]"
./target/release/payloadtool "$ARGUMENTS[1]" --memory-stats 2>&1 | tee /tmp/optimized-mem.txt

Statistical Requirements

  • Minimum 30 runs for hyperfine (--runs 30)
  • Criterion handles statistical significance internally
  • Time improvement >= 5% for significance
  • Memory improvement >= 10% for significance
  • Allocation reduction >= 20% for significance

NO EXCEPTIONS

  • "Test dependencies don't work" -> REJECT. Fix dependencies first.
  • "Theoretically better" -> REJECT. Prove it with numbers.
  • "Obviously an improvement" -> REJECT. Obvious is not measured.
  • "Will benchmark later" -> REJECT. Benchmark now.

Phase 2: Five-Persona Review

1. Duplicate Hunter (Checks for Redundant Work)

  • Check .claude/skills/lading-optimize-hunt/assets/db.yaml for $ARGUMENTS[2] + $ARGUMENTS[4] combo
  • File + technique combo not already approved
  • No substantially similar optimization exists
  • If duplicate found -> REJECT with "DUPLICATE: see "

2. Skeptic (Demands Proof)

  • Baseline data verified present
  • Post-change benchmarks executed with identical methodology
  • Hot path verified via profiling (not just guessed)
  • Micro threshold met (>=5% time)
  • Macro threshold met (>=5% time OR >=10% mem OR >=20% allocs)
  • Statistical significance confirmed (p<0.05 or criterion "faster")
  • Improvement is real, not measurement noise
  • Benchmark methodology sound (same config, same machine)

3. Conservative (Guards Correctness)

  • Run ci/validate and validate that it passes completely
  • No semantic changes to output
  • Determinism preserved (same seed -> same output)
  • No .unwrap() or .expect() added (lading MUST NOT panic)
  • No bugs introduced (if bug found -> REJECT with bug details)
  • Property tests exist for changed code

4. Rust Expert (Lading-Specific Patterns)

  • No mod.rs files (per CLAUDE.md)
  • All use statements at file top (not inside functions)
  • Format strings use named variables ("{index}" not "{}")
  • Pre-computation in initialization, not hot paths
  • Worst-case behavior considered, not just average-case
  • No unnecessary cloning or allocation in hot paths

5. Greybeard (Simplicity Judge)

  • Code still readable without extensive comments
  • Complexity justified by measured improvement
  • "Obviously fast" pattern, not clever trick
  • Follows 3-repeat abstraction rule (no premature abstraction)
  • Change is minimal - no scope creep

Phase 3: Kani/Property Test Check

If the optimization touches critical code:

For lading_throttle:

ci/kani lading_throttle

For lading_payload:

ci/kani lading_payload

Kani constraints:

  • Kani proofs are more complete but labor-intensive
  • Kani may not compile complex code
  • Kani runs are EXTREMELY slow for complex code

If Kani fails to run:

  1. Document why (compilation error? timeout?)
  2. Verify comprehensive property tests exist instead
  3. This is acceptable - Kani feasibility varies

Phase 4: Decision

Outcome Votes Action
APPROVED 5/5 APPROVE Return APPROVED report
REJECTED Any REJECT Return REJECTED report

Duplicates, bugs, correctness issues, and missing benchmarks are all rejections. Describe the specific reason in the report's reason field.


Phase 5: Return Report

Review does NOT record results and does NOT create files. Return a structured YAML report to the caller.

Fill in the appropriate template and return the completed YAML:

Verdict Template
approved .claude/skills/lading-optimize-review/assets/approved.template.yaml
rejected .claude/skills/lading-optimize-review/assets/rejected.template.yaml
  1. Read the appropriate template from the .claude/skills/lading-optimize-review/assets/ directory
  2. Fill in placeholders using argument values:
    • id → generated ID (see "Generate Report ID" above)
    • target$ARGUMENTS[2]:$ARGUMENTS[3] (e.g. lading_payload/src/trace_agent/v04.rs:V04::to_bytes)
    • technique$ARGUMENTS[4]
  3. Fill remaining fields with actual benchmark data from the review
  4. Return the filled-in report

Install via CLI
npx skills add https://github.com/DataDog/lading --skill lading-optimize-review
Repository Details
star Stars 98
call_split Forks 16
navigation Branch main
article Path SKILL.md
More from Creator