name: adversarial-review-change
description: Change-mode review procedure for the adversarial-reviewer agent. Fires when the target is a code diff, GitHub PR URL, or working-tree diff. Runs pipeline S1 → S2 → S4 → S7 → S8 (S3 steelman is opt-in).
Adversarial Review — Change Mode
You were dispatched here by the agent's turn-1 mode dispatch because the target is a code diff, PR URL, or working-tree diff. This skill defines the full pipeline. Execute it without deviation.
Vocabulary, constants, evidence labels, verdict scale, and pushback protocol are defined in ~/.pi/agent/agents/adversarial-reviewer.md. Do not redefine them here — use them exactly as specified there.
Sections 4 (Citation Shepardizing) and 7 (ACH — Competing Hypotheses) are research-mode-only — change-mode artifacts omit them entirely.
Input Acquisition (Execute Before S1)
Determine what diff to work from. Exactly one of these applies:
| Target type | How to acquire |
|---|---|
| Working-tree diff vs. base ref | `git diff <base>...HEAD`; if no base is supplied, default to `main` (`git diff main...HEAD`) |
| GitHub PR URL | `gh pr diff <PR-URL>` (fetches the full unified diff) |
| Explicit `.diff` / `.patch` file path | read the file directly |
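The table above can be sketched as a small dispatch function. This is a minimal illustration, not the agent's actual implementation: the function name `acquisition_command` and the URL heuristic are assumptions, while the two commands are the ones listed in the table.

```python
from typing import List, Optional


def acquisition_command(target: str, base: Optional[str] = None) -> Optional[List[str]]:
    """Return the argv that fetches the diff, or None when the target is a file to read directly.

    Hypothetical helper: the name and the target-detection heuristics are illustrative.
    """
    # GitHub PR URL: fetch the full unified diff with the gh CLI.
    if target.startswith("https://github.com/") and "/pull/" in target:
        return ["gh", "pr", "diff", target]
    # Explicit .diff / .patch path: no command to run, read the file directly.
    if target.endswith((".diff", ".patch")):
        return None
    # Anything else is treated as a working-tree review against a base ref (default: main).
    return ["git", "diff", f"{base or 'main'}...HEAD"]
```

Returning an argv list rather than a shell string keeps the PR URL from being interpreted by a shell.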
After acquiring the diff, also read --context <path>... files if the flag was supplied. These are non-diff files (e.g., related modules, interface definitions, docs) provided for grounding.
Context starvation rule (critical): Diffs strip surrounding functions. Objecting against stripped context is the #1 false-positive source in change-mode. If --context files were NOT supplied, read the full surrounding functions for every changed hunk before proceeding to S2 and S4. Use read with offset/limit targeting the relevant file and line range visible in the diff headers.
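One way to find the line ranges to read is to parse the unified-diff file and hunk headers. The sketch below is an assumption-laden illustration: `changed_ranges`, `read_window`, the `padding` default, and the 1-based offset are all hypothetical, and a fixed padding only approximates "the full surrounding function", so widen the window when the function is longer.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches hunk headers such as "@@ -10,4 +10,6 @@"; the count defaults to 1 when omitted.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_ranges(diff_text: str) -> Dict[str, List[Tuple[int, int]]]:
    """Map each post-change file path to a list of (start_line, line_count), one per hunk."""
    ranges: Dict[str, List[Tuple[int, int]]] = {}
    current: Optional[str] = None
    previous = ""
    for line in diff_text.splitlines():
        # A "+++ " line is a file header only when it directly follows a "--- " line.
        if line.startswith("+++ ") and previous.startswith("--- "):
            path = line[4:].split("\t")[0].strip()
            if path == "/dev/null":  # deleted file: nothing to read on the new side
                current = None
            else:
                current = path[2:] if path.startswith("b/") else path
        else:
            match = HUNK_RE.match(line)
            if match and current is not None:
                start = int(match.group(1))
                count = int(match.group(2)) if match.group(2) is not None else 1
                ranges.setdefault(current, []).append((start, count))
        previous = line
    return ranges


def read_window(start: int, count: int, padding: int = 40) -> Tuple[int, int]:
    """Widen a hunk range into an (offset, limit) pair; offset is assumed to be 1-based."""
    offset = max(1, start - padding)
    limit = (start + count + padding) - offset
    return offset, limit
```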
Pipeline: S1 → [S3 optional] → S2 → S4 → S7 → S8
S1 — Restate the Change
What it produces: A plain-language summary of the change in the reviewer's own words — not a paraphrase of the commit message.
How to execute:
- State: what files are touched, what the change does functionally, what the apparent intent is
- Note what is NOT touched that you expected to be (missing migrations, missing tests, absent callers, etc.)
- Do not evaluate yet — this step is purely descriptive
Output format:
## Change Summary (S1)
<2–4 sentences: what changed, which files, apparent intent, notable omissions>
S1 informs the Bottom Line (Section 2 of the artifact) — it is not a standalone artifact section.
S3 — Steelman (opt-in, --steelman flag only)
Insert between S1 and S2 only when --steelman is passed.
What it produces: The strongest defensible version of this change — the argument for why it is correct, safe, and necessary.
How to execute:
- Articulate the best case for the change: what problem it solves, why this approach is reasonable, what constraints it operates under
- Do not hedge the steelman — state it confidently. This is the strongest form of the argument you're about to attack.
Output format:
## Steelman (S3)
<3–6 sentences: the best version of the argument for this change>
Steelman is not an artifact section in the final output — it is a working step that sharpens subsequent objections. Include it in the working notes but not in the final composed artifact sections.
S2 — Key Assumptions Check
What it produces: Surfaced implicit assumptions the change makes about surrounding code, callers, data shapes, user behavior, concurrency, and environment.
How to execute:
- If no `--context` files were supplied: read the full surrounding functions for every changed hunk now (if you haven't already). Use `read` with the file path and line range from the diff headers.
- Identify assumptions the change makes that are not validated in the diff:
  - What does the changed code assume about its callers? (argument types, call frequency, concurrency)
  - What does it assume about data shapes? (null safety, array bounds, field presence)
  - What does it assume about environment? (env vars, file system state, external services)
  - What does it assume about user behavior? (ordering, retry logic, idempotency expectations)
- Assign each assumption a confidence that it actually holds given what you can read in the codebase. This fills the `Confidence it holds` column; what breaks if the assumption fails goes in the `Impact if false` column.
Output format (→ Section 5, Key Assumptions Surfaced):
| Assumption | Confidence it holds | Impact if false |
|------------|---------------------|-----------------|
| <what the change assumes> | <% range> | <what breaks> |
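If the assumptions are collected as structured records before rendering, the table can be produced mechanically. A minimal sketch follows; the `Assumption` dataclass and `render_assumptions` are hypothetical names, and only the column headers come from the format above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assumption:
    """One row of the Key Assumptions table (hypothetical record shape)."""
    claim: str             # what the change assumes
    confidence_low: int    # lower bound of the confidence range, in percent
    confidence_high: int   # upper bound of the confidence range, in percent
    impact_if_false: str   # what breaks if the assumption does not hold


def render_assumptions(rows: List[Assumption]) -> str:
    """Render the rows as the three-column markdown table used in Section 5."""
    lines = [
        "| Assumption | Confidence it holds | Impact if false |",
        "|------------|---------------------|-----------------|",
    ]
    for row in rows:
        lines.append(
            f"| {row.claim} | {row.confidence_low}%–{row.confidence_high}% | {row.impact_if_false} |"
        )
    return "\n".join(lines)
```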
S4 — Pre-mortem: Production Incident Opener
What it produces: An unfiltered list of speculative failure modes. This is the highest-value step in change mode.
Epistemics: Pre-mortem items are hypotheses, not claims-of-fact. They do NOT require [FETCHED]/[CODEBASE] evidence labels. The inclusion criterion is looser than for objections: "plausible enough to be worth defending against", not the TIER_2_CONFIDENCE_FLOOR threshold that governs objections.
Framing prompt for this step:
Assume this change shipped to production and caused an incident. Why? Generate the unfiltered list of plausible mechanisms.
Failure mode categories to sweep (do not skip):
- Null/empty/zero inputs hitting the changed code path
- Off-by-one errors in loops, indexes, ranges, pagination
- Race conditions, missing locks, double-writes, TOCTOU
- Error handling gaps — exceptions swallowed, partial failures silently succeeding
- Security holes: injection vectors (SQL, command, template), auth bypass, secret leakage, SSRF
- Performance cliffs: N+1 queries, unbounded loops, missing indexes on new queries, large payload assumptions
- Breaking API changes: callers that weren't updated, wire format changes, serialization assumptions
- Missing tests: what invariant is now unverifiable in CI?
- Dependency-introduced behavior: what does the library actually do at edge cases the author may not have checked?
Format each item as:
**PM<N>: <hypothetical incident title>**
Mechanism: <1–2 sentences — what code path fails, under what condition>
Prevention: <what code or check would have caught this>
Hard rule: Do NOT promote pre-mortem items to Tier-1 objections unless they independently meet the objection evidence + confidence bar (TIER_1_CONFIDENCE_FLOOR). Pre-mortem and Objections have different epistemic weight — gaming the system by cross-promoting defeats the purpose of keeping them separate.
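The hard rule can be read as a guard that a pre-mortem item must pass before it is restated as an objection. The sketch below is illustrative only: the numeric floor is a placeholder (the real value lives in the agent file), `PremortemItem` and `may_promote` are hypothetical names, and comparing the lower bound of the confidence range against the floor is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder value for illustration; the real constant is defined in the agent file.
TIER_1_CONFIDENCE_FLOOR = 70

# Evidence labels from the agent file's taxonomy.
EVIDENCE_LABELS = {"CODEBASE", "FETCHED", "REASONING", "ABSENT", "TRAINING"}


@dataclass
class PremortemItem:
    """A speculative failure mode; evidence and confidence are optional by design."""
    title: str
    evidence_label: Optional[str] = None
    confidence_low: Optional[int] = None  # lower bound of the confidence range, in percent


def may_promote(item: PremortemItem) -> bool:
    """True only when the item independently meets the objection evidence and confidence bar."""
    if item.evidence_label not in EVIDENCE_LABELS:
        return False  # no evidence label: stays a hypothesis
    if item.confidence_low is None:
        return False  # no stated confidence: stays a hypothesis
    return item.confidence_low >= TIER_1_CONFIDENCE_FLOOR
```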
Output format (→ Section 6, Pre-mortem Failure Modes):
## Pre-mortem Failure Modes (S4)
> Assume this change caused a production incident. Speculative — distinct from evidenced objections.
**PM1: <title>**
Mechanism: ...
Prevention: ...
**PM2: <title>**
...
S7 — Structured Objection List
What it produces: Tiered, evidenced objections per D8 intensity calibration.
How to execute:
- For each potential objection, assign an evidence label from the taxonomy in the agent file (`[CODEBASE]`, `[FETCHED]`, `[REASONING]`, `[ABSENT]`, `[TRAINING]`). In change mode, `[CODEBASE]` dominates: include `file.ts:42` plus a verbatim code excerpt (within `FETCHED_QUOTE_MAX_CHARS`).
- Assign a confidence range.
- Tier the objection:
  - Confidence ≥ `TIER_1_CONFIDENCE_FLOOR` → Tier 1 (load-bearing; drives verdict)
  - Confidence `TIER_2_CONFIDENCE_FLOOR`–`TIER_1_CONFIDENCE_FLOOR` → Tier 2 (secondary)
  - Below `TIER_2_CONFIDENCE_FLOOR` → dropped; count only in footer
- Apply the Cluster Rule: if 3+ Tier-2 objections point at the same assumption, escalate to a synthetic Tier-1.
- Apply `TIER_1_HARD_CAP`: collapse by theme if you would exceed the cap.
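The tiering and Cluster Rule steps can be sketched as follows. The numeric floors are placeholders (the real values are defined in the agent file), `Objection`, `tier`, and `triage` are hypothetical names, and a single representative confidence stands in for the range. Whether clustered Tier-2 items remain listed after escalation is not specified here; this sketch keeps them. `TIER_1_HARD_CAP` collapsing is omitted because it requires a judgment about themes.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

# Placeholder values for illustration; the real constants are defined in the agent file.
TIER_1_CONFIDENCE_FLOOR = 70
TIER_2_CONFIDENCE_FLOOR = 40


@dataclass
class Objection:
    title: str
    confidence: int   # representative confidence, in percent
    assumption: str   # the assumption this objection targets


def tier(obj: Objection) -> int:
    """Return 1 or 2 for the objection's tier, or 0 when it falls below the Tier-2 floor."""
    if obj.confidence >= TIER_1_CONFIDENCE_FLOOR:
        return 1
    if obj.confidence >= TIER_2_CONFIDENCE_FLOOR:
        return 2
    return 0


def triage(objections: List[Objection]) -> Tuple[List[Objection], List[Objection], int]:
    """Split objections into (tier1, tier2, dropped_count) and apply the Cluster Rule."""
    tier1 = [o for o in objections if tier(o) == 1]
    tier2 = [o for o in objections if tier(o) == 2]
    dropped = sum(1 for o in objections if tier(o) == 0)
    # Cluster Rule: 3+ Tier-2 objections on the same assumption yield a synthetic Tier-1.
    for assumption, count in Counter(o.assumption for o in tier2).items():
        if count >= 3:
            tier1.append(
                Objection(f"Cluster: {assumption}", TIER_1_CONFIDENCE_FLOOR, assumption)
            )
    return tier1, tier2, dropped
```

The dropped count feeds the footer line; the dropped objections themselves are not reported.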
[FETCHED] usage in change mode: Use when challenging library or framework behavior the change relies on (e.g., "this assumes Promise.all short-circuits on rejection — per MDN [FETCHED quote] it does not"). Always include verbatim quote ≤ FETCHED_QUOTE_MAX_CHARS. No quote → downgrade to [TRAINING] with confidence cap.
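The downgrade rule can be expressed as a small check. Both numeric values are placeholders, and `TRAINING_CONFIDENCE_CAP` and `resolve_label` are hypothetical names; the actual cap and quote limit are defined in the agent file. Treating an over-long quote as an error to be trimmed, rather than a downgrade, is an assumption.

```python
from typing import Optional, Tuple

# Placeholder values for illustration; the real constants are defined in the agent file.
FETCHED_QUOTE_MAX_CHARS = 300
TRAINING_CONFIDENCE_CAP = 50  # hypothetical name for the [TRAINING] confidence cap


def resolve_label(quote: Optional[str], confidence: int) -> Tuple[str, int]:
    """Return (evidence_label, confidence) after applying the no-quote downgrade."""
    if not quote:
        # No verbatim quote: downgrade to [TRAINING] and cap the confidence.
        return "TRAINING", min(confidence, TRAINING_CONFIDENCE_CAP)
    if len(quote) > FETCHED_QUOTE_MAX_CHARS:
        # Assumption: an over-long quote should be trimmed by the reviewer, not downgraded.
        raise ValueError("quote exceeds FETCHED_QUOTE_MAX_CHARS; trim it to the relevant passage")
    return "FETCHED", confidence
```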
Output format (→ Section 3 Tier-1 + Tier-2, Section 8 Unfalsifiable):
## Tier-1 Objections (Load-Bearing)
**O1: <title>**
[CODEBASE] `src/foo.ts:42` — `<verbatim code excerpt>`
Confidence: <X%–Y%>
Consequence if true: <what breaks in production>
Detail: <reasoning>
**O2: <title>**
...
## Tier-2 Objections (Secondary Concerns)
**O<N>: <title>**
[REASONING] <explicit logic>
Confidence: <X%–Y%>
Consequence if true: ...
Detail: ...
---
*<N> sub-threshold concerns considered and dropped.*
Unfalsifiable items → Section 8 (Unfalsifiable / Out of Scope):
**Declined: <item>** — unfalsifiable value claim; out of scope
S8 — Forced Verdict
What it produces: The verdict, Bottom Line, and mandatory What Would Change the Verdict section.
How to execute:
- Count Tier-1 objections. Apply the four-tier verdict scale from the agent file. Pick exactly one. No hedging.
- Write the Bottom Line: 2–3 sentences referencing only Tier-1 objections. Do not mention Tier-2 here.
- Write What Would Change the Verdict: explicit — what evidence, fix, or test would move this verdict one tier up. This section is mandatory.
Output format:
# Verdict: <SURVIVES | SURVIVES WITH CAVEATS | WOUNDED | DOES NOT SURVIVE>
**Target:** <diff source / PR URL>
**Mode:** change
**Scope:** <scope string or "entire diff">
**Timestamp:** <ISO 8601>
## Bottom Line (S8)
<2–3 sentences. Reference O<N> IDs only from Tier-1. State why the verdict is what it is.>
...
## What Would Change the Verdict (S8)
- Adding <X> would move this from <current verdict> to <next-tier verdict>: <reasoning>
- Demonstrating <Y> would withdraw O<N>: <what evidence is needed>
Output Composition
The final artifact assembles sections in this order:
| Section | Content | Produced by |
|---|---|---|
| 1 — Header | Verdict (first line), target, mode, scope, timestamp, reviewer model | S8 |
| 2 — Bottom Line | 2–3 sentence rationale, Tier-1 refs only | S8 |
| 3 — Objections | Tier-1 + Tier-2 numbered blocks | S7 |
| 4 — Citation Shepardizing | OMITTED in change mode | — |
| 5 — Key Assumptions Surfaced | Assumption table | S2 |
| 6 — Pre-mortem Failure Modes | PM<N> blocks | S4 |
| 7 — ACH Matrix | OMITTED in change mode | — |
| 8 — Unfalsifiable / Out of Scope | Declined items | S7 |
| 9 — What Would Change the Verdict | Explicit upgrade conditions | S8 |
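Assembly in this order, with Sections 4 and 7 omitted, can be sketched as below. The `compose` function and the dict-of-rendered-sections input are hypothetical; only the section numbers, names, and omissions come from the table.

```python
from typing import Dict

# Section order from the composition table.
SECTION_ORDER = [
    (1, "Header"),
    (2, "Bottom Line"),
    (3, "Objections"),
    (4, "Citation Shepardizing"),
    (5, "Key Assumptions Surfaced"),
    (6, "Pre-mortem Failure Modes"),
    (7, "ACH Matrix"),
    (8, "Unfalsifiable / Out of Scope"),
    (9, "What Would Change the Verdict"),
]
OMITTED_IN_CHANGE_MODE = {4, 7}  # research-mode-only sections


def compose(sections: Dict[int, str]) -> str:
    """Join rendered sections in artifact order, skipping the change-mode omissions."""
    parts = []
    for number, name in SECTION_ORDER:
        if number in OMITTED_IN_CHANGE_MODE:
            continue  # dropped even if a caller supplied text for it
        if number not in sections:
            raise KeyError(f"missing section {number} ({name})")
        parts.append(sections[number].strip())
    return "\n\n".join(parts) + "\n"
```

Raising on a missing section makes an incomplete pipeline run fail loudly instead of emitting a partial artifact.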
Write the artifact to: ~/.pi/agent/reviews/<slug>-<timestamp>.md
Print the artifact path at the end of the session.
If --emit-json flag is set OR the agent was spawned by another agent, also write a sidecar JSON at the same path with .json extension.
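The artifact and sidecar paths can be derived as in the sketch below. The function name `artifact_paths`, its parameters, and the compact UTC timestamp format are assumptions; the directory, the `<slug>-<timestamp>.md` pattern, and the sidecar condition come from the text above.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional, Tuple

REVIEWS_DIR = Path("~/.pi/agent/reviews")


def artifact_paths(
    slug: str,
    now: datetime,
    emit_json: bool = False,
    spawned_by_agent: bool = False,
) -> Tuple[Path, Optional[Path]]:
    """Return (markdown_path, sidecar_json_path or None) for a review artifact."""
    # Assumption: a compact UTC stamp without colons, which is safe in filenames.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    markdown_path = REVIEWS_DIR.expanduser() / f"{slug}-{stamp}.md"
    # The sidecar shares the path and differs only in its extension.
    wants_sidecar = emit_json or spawned_by_agent
    sidecar_path = markdown_path.with_suffix(".json") if wants_sidecar else None
    return markdown_path, sidecar_path
```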