name: arc-auditing-spec
description: Use when the user explicitly invokes /arcforge:arc-auditing-spec <spec-id> for a read-only advisory audit of an SDD spec family (design, spec, dag, decision anchors). Only triggered by direct user invocation; never auto-invoked from any pipeline skill.
argument-hint: " [--save]"
Skill: arc-auditing-spec
Iron Law
READ-ONLY ADVISORY. NEVER MUTATE.
Skill body and all three sub-agents MUST NOT Edit, Write, rename, delete, or run any mutating git / filesystem operation. Phase 5 prints a Decisions table and exits. Main session (or a subsequent skill) owns any actual edits — not this skill, never.
No --apply flag, no "while I'm here let me fix this typo" shortcut, no Phase 6 that starts applying decisions. Diagnostic role only.
When to Use
- User types
/arcforge:arc-auditing-spec <spec-id>wanting cross-artifact alignment, internal consistency, and state-transition integrity checked across the spec family - The spec has reached a state where at least
design.mdexists (spec.xml and dag.yaml optional; graceful degradation per fr-aa-004)
When NOT to Use
- Source code review, PR review → use
/revieworpr-review-toolkit - You want this skill to apply resolutions — it won't; it's diagnostic only
- You want an automatic check at the end of
arc-refiningorarc-planning— those skills do not and must not auto-invoke this one (fr-sc-001-ac3); invocation is always user-initiated
Invocation Contract
/arcforge:arc-auditing-spec <spec-id> [--save]
<spec-id>— directory name underspecs/. Exact match only; no fuzzy resolution.--save— optional. When present, writes the full Phase 2 report + Phase 5 Decisions table to~/.arcforge/reviews/<project-hash>/<spec-id>/<YYYY-MM-DD-HHMM>.md(hash from${ARCFORGE_ROOT}/scripts/lib/worktree-paths.js). When absent, no file is written anywhere.
Phase 0 — Precondition Check (MANDATORY)
Before Phase 1 fan-out, verify specs/<spec-id>/ exists as a directory.
If it exists: proceed to Phase 1. No file is written at this point.
If it does NOT exist: STOP. Print:
Error: specs/<spec-id>/ does not exist.
Available spec-ids:
- <id-1>
- <id-2>
...
Then exit non-zero. Write nothing. Spawn no sub-agent. This is the only valid response — see Red Flags for the specific failure modes this rule forbids.
Phase Structure
| Phase | What | Contract |
|---|---|---|
| 0 | Precondition check above | fr-sc-001-ac1, fr-sc-001-ac2 |
| 1 | Parallel fan-out to three read-only sub-agents via Task tool | agents/arc-auditing-spec-*.md; fr-aa-001 |
| 2 | Print Summary + Findings Overview + per-finding Detail markdown | specs/arc-auditing-spec/details/output-and-interaction.xml fr-oi-001 |
| 3 | Triage UX — AskUserQuestion multi-select over HIGH findings | fr-oi-002 |
| 4 | Resolution UX — batched per-finding AskUserQuestion with diff previews | fr-oi-003 |
| 5 | Print Decisions markdown table; skill exits | fr-oi-004 |
Phase 1 — Parallel Fan-Out to Three Audit Axes
You MUST dispatch all three audit agents in a SINGLE message using three parallel Task tool uses. Do NOT dispatch them one at a time. Sequential dispatch is the baseline failure mode this rule exists to prevent — a stock agent defaults to serial execution; this skill forbids it.
Dispatch these three agents concurrently, in a single message:
arc-auditing-spec-cross-artifact-alignmentarc-auditing-spec-internal-consistencyarc-auditing-spec-state-transition-integrity
Phase 1 Prompt Template
Assemble the sub-agent prompt per references/phase1-prompt.md — that file
is the authoritative layout. The prompt carries the spec-id plus resolved
absolute paths to design.md (newest docs/plans/<spec-id>/*/design.md),
spec.xml (specs/<spec-id>/spec.xml), details/*.xml
(specs/<spec-id>/details/), dag.yaml (specs/<spec-id>/dag.yaml),
decisions.yml (specs/<spec-id>/decisions.yml), and
product/vision.md.
When an artifact is missing, substitute the literal absence marker
(absent — file does not exist) verbatim — do not omit the line, do not
invent placeholder paths. The axis agents depend on the marker for their
graceful-degradation branches (fr-aa-004). Absent decisions.yml or
product/vision.md causes the cross-artifact-alignment agent to skip its
spec↔decision↔anchor graph checks (patterns 7–9) without error.
REQUIRED BACKGROUND: skills/arc-auditing-spec/references/phase1-prompt.md
Partial Failure Contract (fr-aa-004-ac3)
When an axis agent returns an error_flag in its findings, that axis has
failed mid-audit. The main session MUST:
- Surface the
error_flagin the Phase 2 Summary table (as a warning row for that axis). - Continue rendering Phases 2–5 using findings from the axes that succeeded.
- NOT halt the audit because one axis encountered an error.
One axis's error_flag does NOT stop the other two axes' findings from being
shown and triaged.
See skills/arc-auditing-spec/references/report-templates.md for full worked
examples with exact column headers. The summary below is the decision logic;
the reference file is the layout authority.
REQUIRED BACKGROUND: skills/arc-auditing-spec/references/report-templates.md
Phase 2 — Markdown Report
Print three sections in order. No omissions regardless of severity.
Section A — Summary table (axis | HIGH | MED | LOW | INFO | Total).
One row per axis plus a Totals row. If an axis returned error_flag, replace
its counts with ERR and note the error below the table.
Section B — Findings Overview table (ID | Sev | Axis | Title | Primary file).
Every finding from all three axes MUST appear — MED, LOW, and INFO findings
appear in this table exactly as HIGH findings do. No omissions.
When exactly one HIGH-severity finding exists across the full finding set
(N_HIGH == 1), the Title cell for that single-HIGH row MUST start with ⚠️
and MUST render the title text in markdown bold: ⚠️ **<title>**. This visual
emphasis ensures the lone HIGH is conspicuous even when Phase 3 triage does not
fire. When N_HIGH is 0 or >= 2, render all Overview rows without the ⚠️
prefix (baseline rendering). The ⚠️ prefix MUST NOT appear in the
per-finding Detail block header — the emphasis is Overview-row-only.
Section C — Per-finding Detail blocks. One block per finding, same order as the Overview. Each block contains:
- Observed evidence as a markdown table (
location | evidence). - "Why it matters" — the ONLY section that may remain free prose.
- Suggested resolutions as a markdown table (
Resolution | Description | Side-effect / Cost). When a resolution has apreviewdiff from the agent, append the diff block under the table row.
MED, LOW, and INFO findings MUST have full Detail blocks rendered here — even though they will not enter triage in Phase 3. All findings require visibility; the Phase 3 multi-select is not the only visibility mechanism.
Phase 3 — Triage UX
Goal: let the reviewer select which HIGH findings to address in this session. MED, LOW, INFO MUST NOT appear as options — only reachable via the Other free-text channel (F-01, pinned). Phase 3 fires only when N_HIGH >= 2; below that threshold the skill takes a degraded path.
Step 0 — Threshold check (MANDATORY before any AskUserQuestion call). Count the HIGH-severity findings across all three axes (N_HIGH).
N_HIGH == 0: Phase 3 does NOT fire. Do NOT issue any AskUserQuestion call. Do NOT enter Phase 4. Do NOT render a Phase 5 Decisions table. Instead, print the concluding recommendation line (template:
references/report-templates.md§Concluding Recommendation Line) and exit cleanly. The Phase 2 Detail blocks are the complete deliverable; no alternative injection channel is provided.N_HIGH == 1: Phase 3 multi-select does NOT fire (a single-option multi-select would violate
options.minItems: 2). Skip Phase 3 entirely. The Phase 2 Findings Overview row for the lone HIGH already carries the visual emphasis from fr-oi-001-ac5. Proceed directly into Phase 4 with that single HIGH as the sole Stage-2 queue entry. Phase 4 then proceeds per its existing rules (fr-oi-003). No Other injection channel exists on this path; the Other pull-in channel only exists when Phase 3 actually fires (N_HIGH >= 2).N_HIGH >= 2: Phase 3 fires. Continue to the steps below.
Step 1 — Determine HIGH finding list. Collect all HIGH-severity findings across all three axes. Sort: A1 before A2 before A3; within each axis sort by NNN ascending.
Step 2 — Batched multi-select loop. Present up to 4 HIGH findings per AskUserQuestion call. When more than 4 HIGH findings exist, make sequential calls batching up to 4 each time until every HIGH finding has been presented exactly once.
Use AskUserQuestion with header: "Triage" and multiSelect: true. Template: see references/report-templates.md §Phase 3.
Step 3 — Parse Other free-text.
The AskUserQuestion tool appends an auto-generated Other field. After each
call, scan the Other string for the regex pattern A[1-3]-\d{3}. Add each
matched finding ID to the Stage 2 resolution queue alongside any HIGH IDs
the user checked. This is the ONLY channel through which MED, LOW, and INFO
findings enter the queue. This channel only exists when Phase 3 actually
fires (N_HIGH >= 2); it is not available on the N_HIGH == 0 or N_HIGH == 1
degraded paths.
Phase 4 — Resolution UX
Goal: for each finding in the resolution queue, ask the reviewer which
resolution to apply. One resolution per finding (multiSelect: false).
Only findings with at least 2 suggested resolutions receive an
AskUserQuestion question; findings with fewer than 2 suggested resolutions
are auto-skipped (see Per-Finding Skip Rule below).
Batched loop: iterate the Stage-2 queue. For each entry, check its resolution count before issuing a question. Batch qualifying findings (those with >= 2 resolutions) at most 4 per AskUserQuestion call. Sequential calls until all qualifying findings are asked exactly once.
Use AskUserQuestion with header = finding ID and multiSelect: false. Template: see references/report-templates.md §Phase 4.
Rules:
header= finding ID (formatA<n>-<NNN>is 6 chars — no truncation needed).- When the agent marked a Recommended resolution (per finding-schema.md
Recommended prefix rule), the first option's
labelMUST start with"(Recommended)". - Options whose resolution corresponds to an editable-artifact change MUST
include a
previewdiff. Engine-fix-type options (e.g., "file a bug against coordinator.js") MAY omitpreview. - Other free-text is a valid decision — accept it, do not throw or drop it. Record it verbatim in the Phase 5 Decisions table User Note column.
Per-Finding Skip Rule (fr-oi-003-ac6): When Phase 4 iterates to a
finding whose suggested-resolutions count is less than 2 (i.e., 0 or 1
resolution), the skill MUST NOT issue any AskUserQuestion question for it.
AskUserQuestion's options.minItems: 2 constraint forbids a single-option
question, and asking among a single option is pure ceremony. Such findings
rely on their Phase 2 Detail block's Suggested Resolutions table as the
deliverable — the user reads it and decides what to do in the main session.
Note: skipped findings still appear in the Phase 2 Detail block with their
full Resolutions table. The skip only suppresses the interactive question,
not the data surface.
When the Decisions table is rendered (per fr-oi-004), a skipped finding's
row MUST have its Chosen Resolution column set to the sentinel string
(no ceremony — see Detail) and its User Note column left empty.
This skip is NOT an error. Do not treat it as a failure, do not log a warning, and do not halt the Phase 4 loop. Simply proceed to the next finding in the queue.
Phase 5 — Decisions Table (TERMINAL)
Conditional firing gate (fr-oi-004-ac1): Phase 5 renders the Decisions
table ONLY when Phase 3 or Phase 4 actually fired during this invocation.
Use the in-memory ceremony_fired flag that is set to true the moment
either phase issues its first AskUserQuestion call. This single rule
covers every path uniformly: the standard N_HIGH >= 2 path flips the flag
on Phase 3's first call; the N_HIGH == 1 + ≥2-resolutions direct-to-Phase-4
path (fr-oi-002-ac6) flips the flag on Phase 4's first question; the
N_HIGH == 1 + <2-resolutions path enters Phase 4 but issues no question,
so the flag correctly stays false and no Decisions table renders. Do
NOT re-derive this condition at Phase 5 entry — carry the flag forward
from wherever ceremony began.
N_HIGH == 0 path (fr-oi-004-ac4): When both Phase 3 and Phase 4 were
skipped per fr-oi-002-ac5 (the zero-HIGH exit), ceremony_fired remains
false. In that case Phase 5 MUST NOT print a Decisions table, MUST NOT
print a stub "No decisions" line, and MUST NOT produce any Phase 5 output.
The concluding recommendation line already printed at the end of Phase 3's
threshold check is the skill's terminal output on this path.
When ceremony_fired is true, print the Decisions table, then exit.
This is the final deliverable. Template: see references/report-templates.md §Phase 5.
Finding ID: theA<n>-<NNN>id.Chosen Resolution: label of the option the user selected, or "(Other)" when answered via free-text. For findings auto-skipped at Phase 4 (fewer than 2 resolutions), use the sentinel(no ceremony — see Detail).User Note: when the user answered via Other, store that free-text verbatim — no paraphrasing, no summarizing. Empty when no Other text was provided. Empty for auto-skipped findings.
The Decisions table MUST include ALL findings that went through Phase 3
triage or Phase 4 resolution, including any findings auto-skipped at Phase
4 whose rows carry the (no ceremony — see Detail) sentinel. The record
must be complete — no finding that entered the Stage-2 queue MUST be left out.
Phase 5 is TERMINAL. After printing the Decisions table, the skill exits.
The main session MUST NOT apply any resolution via Edit, Write, or any
mutating tool based on user decisions — that action is explicitly out of
scope for this skill. If you find yourself about to call Edit or Write, or
about to invoke /arc-refining to apply changes, STOP — see Red Flags.
--save Flag
When --save is present, the main session writes the full Phase 2 report
- Phase 5 Decisions table to
~/.arcforge/reviews/<project-hash>/<spec-id>/<YYYY-MM-DD-HHMM>.mdafter Phase 5 prints (24-hour time). Without--save: zero files are written anywhere. Derive<project-hash>via anodesubprocess callinghashRepoPathfrom${ARCFORGE_ROOT}/scripts/lib/worktree-paths.js— never reimplement the hash inline (drift risk). The reference file carries the exact subprocess one-liner, concrete filename example, andmkdir -pparent-directory command.
REQUIRED BACKGROUND: skills/arc-auditing-spec/references/save-flag.md
Hard Boundaries
- Skill body and all sub-agents MUST NOT invoke Edit, Write, or any state-mutating Bash command. This is enforced structurally via the
tools:allowlist in each agent's frontmatter (agents/arc-auditing-spec-*.md) — not via prose in system prompts. See fr-sc-002-ac3. - No git commit, branch creation, worktree creation, or file deletion — under any phase, at any point.
- Phase 5 is terminal. The skill does NOT loop back to "apply" after the user picks resolutions. The Decisions table is the deliverable; downstream action is main session's responsibility and explicitly out of scope (fr-oi-004-ac3).
--saveis the ONLY permitted write, and only to~/.arcforge/reviews/under the arcforge home directory — never intospecs/,docs/, or any project-tracked path.
Red Flags — STOP
If you find yourself doing any of these, STOP immediately:
| Rationalization | Why it's wrong | Do instead |
|---|---|---|
| "The user's spec-id doesn't exist, but this other one is clearly what they meant — I'll audit that and note the substitution at the top" | fr-sc-001-ac2 forbids substitution. This is a baseline failure mode observed in RED testing — the rationalization "don't ask clarifying questions" does NOT justify picking a different spec. |
Print available ids, exit. Let the user re-invoke with the right id. |
"The user said <id> and specs/<id>/ is missing, but docs/plans/<id>/ has a design.md — I'll audit in pure-design mode since the design clearly exists" |
Phase 0 requires specs/<id>/ specifically. docs/plans/<id>/ is not a fallback path — not for pure-design audits, not for partial, not for anything. A design doc without a spec directory is a pre-refining state; the audit skill does not operate on it. |
Print available ids, exit. If the user wanted a design-only review, that's /arc-brainstorming or manual review territory, not this skill. |
| "I spotted an obvious typo while reading — fixing it saves a round trip" | Read-only is absolute. Even typos are reported as findings, never patched. | Add a finding (LOW severity) to the axis agent's output; let main session decide. |
| Any of: • "User picked resolution (a) for A1-003 — I should Edit the spec now" • "Phase 5 ended — I'll invoke /arc-refining to apply the decision" • "User selected (Recommended) — applying saves a round-trip" |
Phase 5 is terminal (fr-oi-004-ac3). No mutation, no auto-chain, no Edit — even for an obvious Recommended choice. This skill's scope ends at the Decisions table. | Print Decisions table, exit. Main session owns all subsequent action. |
| "I'll have arc-refining/arc-planning call this skill at the end of their flow for free quality gating" | fr-sc-001-ac3 forbids pipeline auto-invocation. The skill must remain user-triggered. |
Do not add any invocation from any pipeline SKILL.md body. |
"Let me add a --apply flag so this is one-step for users" |
Makes the skill a mutator, defeating its diagnostic-only contract. | Don't. The contract is the contract. |
"The agent marked option A as (Recommended), so I can lean on it — emphasize it in the question wording, sort it visually first, fold the others under 'more options'" |
(Recommended) is the axis agent's heuristic ranking, not a verdict. Eval evidence (resolution-ranking accuracy ~50% on real specs) shows the prefix is informational, not authoritative. Emphasizing it in presentation, collapsing alternatives, or paraphrasing non-recommended options pre-empts the user's decision and converts an advisory into a soft mutator. |
Render every option with equal visual weight beyond the (Recommended) prefix itself. Do not paraphrase, sort, hide, or deemphasize non-recommended options. The user is the decider. |
| "This MED finding is clearly important so I'll add it to the triage options anyway" | F-01 is pinned: MED/LOW/INFO MUST NOT appear in Phase 3 triage options. The Overview table and Detail block already ensure visibility. |
MED/LOW/INFO are only reachable via the Other free-text channel. Do not add them to options. |
| "The user might miss the MED finding if it's not in the triage options" | Phase 2 overview table and detail block ensure every finding is visible regardless of severity. The triage options array is for HIGH findings only — that is the structural boundary. |
Trust the overview table. Other free-text is the channel for user-initiated MED/LOW/INFO selection. |
"The --save path is pinned to ~/.arcforge/reviews/ but a symlink into docs/reviews/ or specs/ would be more discoverable" |
--save is the ONE carve-out, and only to ~/.arcforge/reviews/ specifically. Any other path violates the Iron Law. |
Write only to ~/.arcforge/reviews/<project-hash>/<spec-id>/<YYYY-MM-DD-HHMM>.md. |
| "I'll reimplement the project hash inline — it's just sha256 of cwd" | fr-oi-005-ac3: the hash MUST come from ${ARCFORGE_ROOT}/scripts/lib/worktree-paths.js so a single project has one hash across worktree paths and review paths. Reimplementing risks drift. |
Use the subprocess one-liner shown in the --save section. |
"Without --save, I'll save the report anyway — it's harmless and the user will appreciate it" |
fr-oi-005-ac1: without --save, ZERO files are written. Default is read-only. |
Do not write any file unless --save is explicitly present. |
Implementation Delegation
Per fr-sc-003, evals live under skills/arc-auditing-spec/evals/ and exercise each of the three audit axes (fr-sc-003-ac2); the suite MUST pass before shipping.
Cross-References
- Agents:
agents/arc-auditing-spec-cross-artifact-alignment.md,agents/arc-auditing-spec-internal-consistency.md,agents/arc-auditing-spec-state-transition-integrity.md - Spec:
specs/arc-auditing-spec/spec.xml+specs/arc-auditing-spec/details/{skill-contract,audit-agents,output-and-interaction}.xml - Design:
docs/plans/arc-auditing-spec/2026-04-22/design.md·docs/plans/arc-auditing-spec/2026-04-24-iterate2/design.md - Eval scenarios:
skills/arc-auditing-spec/evals/