name: completeness-auditor description: Verify every skill that claims "completed" produced a substantive audit/review artifact on disk with ≥ 5 explicit pass items. Runs as part of the independent audit layer that does not trust any single skill's self-declaration of done. license: MIT
Purpose
Independent completeness auditor. This skill exists because every reviewer / auditor / QA skill in the pipeline can declare "通过" verbally without ever writing the corresponding audit/review file to disk — and once a verbal claim of completion enters the conversation it becomes invisible. This skill makes the audit-file-on-disk a structural requirement of completion.
For every skill in the system that should produce an audit/review/check artifact, this skill verifies that:
- The artifact file exists at its expected path.
- The file is substantive (not a stub, placeholder, or empty template).
- The file lists ≥ 5 explicit pass items — concrete things that were checked and passed, not generic statements like "all good".
- The file lists any failures, warnings, or unresolved issues — and routes each to a repair skill.
If any required audit file is missing, empty, or has fewer than 5 explicit pass items, this skill marks that skill's "completion" as not actually complete and blocks downstream progression.
This skill runs in parallel with consistency-auditor and before quality-assurance-auditor. Together they form the independent audit layer.
When to use
Use this skill:
- Before
quality-assurance-auditor. QA should treat any missing audit artifact as a blocker. - After any reviewer / auditor / robustness-check / final-method-explainer claims completion — to verify the file actually got written.
- When the user asks: "did all the audits actually run?", "is anything missing from the QA chain?", "show me which review files don't exist".
- When the workflow has been running for a while and the user wants a sanity check on the audit chain.
Do NOT use this skill:
- To create the missing audit files. This skill only verifies; the repair routes back to the original skill.
- As a substitute for
consistency-auditor(which checks cross-media facts) orquality-assurance-auditor(which checks paper quality and the three critical rules).
The audit/review artifact registry
This is the canonical list of files that must exist for "completion" to be real. Maintain this list as the project evolves.
Per-subquestion (Qx)
| Producer skill | Required artifact | Min pass items |
|---|---|---|
python-code-reviewer (when Python used) |
code/Qx/reviews/qx_python_review.md |
5 |
matlab-code-reviewer (when MATLAB used) |
code/matlab/Qx/reviews/qx_matlab_review.md |
5 |
robustness-checker |
robustness/Qx/qx_robustness_report.md |
5 |
result-report-generator (round N) |
results/Qx/experiments/roundN/qx_experiment_report_roundN.md |
5 |
result-report-generator (final) |
results/Qx/reports/qx_final_result_analysis.md |
5 |
final-method-explainer |
methods/Qx/qx_final_method_explanation.md |
(structural, not pass-item-based) |
solution-package-builder |
results/Qx/reports/qx_solution_package_for_writer.md + results/Qx/reports/frozen_numbers.json |
(structural) |
figure-table-planner |
methods/Qx/qx_figure_table_plan.md |
(structural) |
method-selector |
methods/Qx/qx_method_candidates.md + methods/Qx/qx_method_iteration_log.md |
(structural) |
modeler-decision-logger |
methods/Qx/qx_decision_log.md |
(decision-record check — see below) |
Human decision artifacts (the enforcement arm for the human gates)
These are the files the human-decision gates (G2.5, G4.5, G4 sign-off) block on. completeness-auditor is what makes those gates real: a missing, PENDING, or under-floor decision artifact is a structural gap caught here, exactly like a missing review file. This is the judgment-gate-checker rule, folded in (no separate skill).
| Gate | Producer skill | Required decision artifact | DECIDED-check |
|---|---|---|---|
| G2.5 | method-selector |
methods/Qx/decisions/method-selector_modeler_decision.md (qx_method_choice) |
yes |
| G4.5 | result-report-generator |
methods/Qx/decisions/result-report-generator_modeler_decision.md (qx_result_verdict) |
yes |
| G4.5 | robustness-checker |
methods/Qx/decisions/robustness-checker_modeler_decision.md (qx_stability_verdict) |
yes |
| G4.5 | final-method-explainer |
methods/Qx/decisions/final-method-explainer_modeler_decision.md (qx_method_explanation) |
yes |
| G4 | solution-package-builder |
methods/Qx/decisions/solution-package-builder_modeler_decision.md (qx_package_signoff) |
yes |
DECIDED-check (mechanical, no NLP — this is what a human gate's pass keys on):
- File exists.
- Front-matter
status: DECIDEDANDdecided_by: human(PENDING/ai/auto⇒ FAIL — treat like a missing artifact). - Every mandatory human field present, over its char floor, no surviving sentinel (
<<<,TODO,TBD,待补充,..., empty). - The
## Modeler's rationalebody is NOT a near-verbatim copy ofai_suggestion(normalized-whitespace equality / tiny edit distance only — fuzzy similarity is a WARN written to the provenance ledger, NEVER a FAIL; formulas andsymbol_table.mdsymbols are exempt). - The rationale cites ≥1 concrete token from
evidence_refs(a number, a candidate id, a symbol). Forqx_stability_verdictandconfidencerationales this is HARD (must cite a number); elsewhere it is a WARN. - Staleness: if a
code/Qx/*file is newer than the decision'sdecided_atand the decision sat behind a now-changed result, flagSTALE — modeler must re-confirm(same mtime rule asfrozen_numbers.json).
A decision artifact failing any of 1–3 is BLOCKING (the gate cannot pass). Item 4 failing is BLOCKING (it's a copy). Items 5–6 produce WARN unless noted HARD. Report each with the exact field, so the modeler knows precisely what to fill.
B-layer sentinel sweep (drafts-for-review skills)
The seven B-layer skills don't emit a front-matter decision artifact; instead they mark each judgment-bearing span with a sentinel the human must replace. A surviving sentinel = the human hasn't done their part = the gate that owns it cannot pass. Sweep these and report any survivor as BLOCKING for its gate:
| Sentinel pattern | Where (load-bearing spans) | Gate it blocks |
|---|---|---|
[MODELER INPUT NEEDED: …] |
problem-parser evaluation_criteria; problem-classifier framing_rationale / modeler_chosen_type |
G1 |
[AI-DRAFT — modeler must confirm: …] |
problem-parser proposed relationships; model-assumptions-builder necessity + impact labels | G1 / G5 |
[MODELER INPUT NEEDED: …] |
paper-section-writer physical-meaning slot + the three seeds (key_result_claim, contribution_claim, why_this_method) |
G5 |
[AI-DRAFT — modeler must confirm: …] / [MODELER INPUT NEEDED: …] |
figure-table-planner Type-3 core_claim; math-figure-generator figure core_claim (blocks promotion to paper/figures/) |
G5 |
The sweep is a plain substring search for [MODELER INPUT NEEDED and [AI-DRAFT — in the finalized artifacts — same mechanism as the <<<HUMAN>>> sentinel check, no NLP. Report each survivor with file:line and the gate it blocks.
Global (project-wide)
| Producer skill | Required artifact | Min pass items |
|---|---|---|
data-auditor-cleaner |
workspace/data_clean/data_report.md |
5 |
symbol-table-builder |
planning/symbol_table.md |
(structural) |
model-assumptions-builder |
planning/model_assumptions.md |
(structural) |
consistency-auditor |
paper/audits/cross_media_consistency_audit.md |
5 |
reference-manager |
paper/audits/reference_audit.md |
5 |
paper-polisher |
paper/audits/polish_report.md |
5 |
quality-assurance-auditor |
paper/qa_report.md |
5 |
Filesystem fallback paths (older conventions still supported)
If the path in the registry above does not exist, fall back to checking these legacy paths and report which convention was used:
workspace/code/code-review/python_review_summary.md(legacypython-code-revieweroutput)workspace/code/code-review/matlab_review_summary.md(legacymatlab-code-revieweroutput)workspace/data/data_report.md(legacy data audit)
Preconditions
This skill can run at any time. It will only audit files that should exist based on workflow state.
To know which artifacts to expect, read:
planning/progress_dashboard.md(if exists) — tells you which subquestions are at which state.- Otherwise, scan the workspace and infer which subquestions have started.
Inputs
planning/progress_dashboard.md(preferred — tells you which Qx claim what state).- File listings under
methods/,code/,results/,robustness/,paper/. - The audit registry above.
Workflow
Determine scope.
- For each subquestion Qx, determine which state it claims to be in (from
progress_dashboard.mdor by scanning the workspace). - For each claimed state, look up which artifacts must exist (from the registry).
- Also enumerate global artifacts.
- For each subquestion Qx, determine which state it claims to be in (from
Verify existence. For each required artifact:
- Does the file exist?
- If yes → proceed to substantive check.
- If no → mark MISSING; route back to the producer skill.
Verify substantive content. For files that should have ≥ 5 explicit pass items:
- Open the file.
- Count distinct explicit pass items. An "explicit pass item" is a checkmarked or numbered statement of the form "✅/✓/[x]
was checked and passed because ". - Generic statements like "all checks passed" or "looks good" do NOT count as a pass item.
- If fewer than 5 explicit pass items → mark INSUFFICIENT; route back to the producer skill.
For files that are structural (no pass-item requirement):
- Verify the structural sections required by that skill's output format are present and non-empty.
- If a section is empty or contains only a placeholder ("TBD", "TODO", "待补充", "..."), mark INSUFFICIENT.
Verify recency (staleness check).
- If
frozen_numbers.jsonexists, compare itsfrozen_attimestamp to the most recent modification time of anycode/Qx/source file. - If code is newer than the frozen snapshot, mark the snapshot as STALE — recommend re-freezing via
solution-package-builder. - Same check applies to audit files: if a
*_review.mdis older than the code it reviewed, mark STALE.
- If
Produce the completeness report. Save to
paper/audits/completeness_audit.md.Route repairs. Every MISSING / INSUFFICIENT / STALE entry carries the producer skill name in a
repair_skillfield. Hand the report toworkflow-orchestratorfor routing.
Outputs
Produce one file:
paper/audits/completeness_audit.md
Output format
# Completeness Audit Report
> **Status**: [PASSED / FAILED]
> **Date**: [timestamp]
> **Scope**: [Q1, Q2, Q3, Q4 + global]
## Summary
| Producer Skill | Required Artifact | Status | Pass Items | Notes |
|---------------|-------------------|--------|------------|-------|
| python-code-reviewer | code/Q1/reviews/q1_python_review.md | ✅ OK | 7 | — |
| matlab-code-reviewer | code/matlab/Q2/reviews/q2_matlab_review.md | ❌ MISSING | — | re-run matlab-code-reviewer for Q2 |
| robustness-checker | robustness/Q1/q1_robustness_report.md | ⚠ INSUFFICIENT | 2 | only 2 explicit pass items; need ≥ 5 |
| solution-package-builder | results/Q3/reports/frozen_numbers.json | ⚠ STALE | n/a | frozen 2026-05-10; code modified 2026-05-15 |
| consistency-auditor | paper/audits/cross_media_consistency_audit.md | ✅ OK | 8 | — |
| quality-assurance-auditor | paper/qa_report.md | ❌ MISSING | — | run after consistency + completeness pass |
## Pass Items (≥ 5 required for the audit to count as run)
1. ✅ `code/Q1/reviews/q1_python_review.md` exists with 7 explicit pass items.
2. ✅ `methods/Q1/q1_final_method_explanation.md` has all required structural sections filled.
3. ✅ `planning/symbol_table.md` is non-empty and contains 23 symbol definitions.
4. ✅ `results/Q1/reports/frozen_numbers.json` exists and is newer than all Q1 source files (not stale).
5. ✅ `workspace/data_clean/data_report.md` has 6 explicit pass items.
6. ✅ ...
## Missing Artifacts
| # | Skill | Expected Path | Repair Skill | Severity |
|---|-------|--------------|--------------|----------|
| 1 | matlab-code-reviewer | code/matlab/Q2/reviews/q2_matlab_review.md | matlab-code-reviewer | BLOCKING |
| 2 | quality-assurance-auditor | paper/qa_report.md | quality-assurance-auditor | BLOCKING (run last) |
## Insufficient Artifacts (fewer than required pass items)
| # | File | Pass Items Found | Required | Repair Skill |
|---|------|------------------|----------|--------------|
| 1 | robustness/Q1/q1_robustness_report.md | 2 | 5 | robustness-checker |
## Stale Artifacts
| # | File | Frozen / Reviewed At | Source Modified At | Repair Skill |
|---|------|---------------------|-------------------|--------------|
| 1 | results/Q3/reports/frozen_numbers.json | 2026-05-10 14:22 | 2026-05-15 09:11 | solution-package-builder |
## Verdict
- **Audit layer ready for QA**: [yes / no]
- **Final assembly allowed**: [no — QA has not run yet]
- **Recommended next skill**: [first missing/insufficient producer skill in repair order]
## Repair order (suggested)
1. matlab-code-reviewer → produce `code/matlab/Q2/reviews/q2_matlab_review.md`.
2. robustness-checker → expand Q1 robustness report to ≥ 5 pass items.
3. solution-package-builder → re-freeze Q3 numbers after confirming code changes.
4. quality-assurance-auditor → run after the above pass.
Rules
- Do NOT create any missing audit file. Repair routes back to the producer skill.
- Do NOT count generic statements ("all good", "passed") as pass items. Only count concrete, evidence-bearing statements.
- Do NOT skip a check because the producer skill claimed completion verbally. Verbal claims do not count; only files on disk count.
- ALWAYS list ≥ 5 explicit pass items in your own report. If you cannot, status is
not_run. - For staleness, the rule is: a review/freeze file must be newer than every source file it reviewed. Off by one second is fine; older by an hour is stale.
- If a producer skill is not applicable (e.g., no MATLAB used, so
matlab-code-reviewerhas nothing to review), mark itnot_applicable— do not mark it MISSING. - The audit registry is the contract. If a new producer skill is added to the system, update the registry in this file FIRST, then run.
Verification
Before returning the report, verify:
- The pass list has ≥ 5 explicit items.
- Every entry in the summary table has a status, producer skill name, and (if not OK) a repair skill.
- The verdict matches the entries — cannot be PASSED if any entry is BLOCKING.
- The report is saved to
paper/audits/completeness_audit.md(create dir if missing).
Failure modes
Stop and report a blocker if:
- The user asks to mark an artifact as "complete" by writing the audit file directly (not the producer skill's job).
- The user asks to bypass an INSUFFICIENT entry by lowering the pass-item threshold case-by-case. The threshold is structural; if a particular check legitimately has fewer than 5 items, the producer skill should justify and document that — and this auditor should still flag it.
Stop conditions
This skill must stop instead of guessing when:
- The workspace state is so unclear that we cannot tell which Qx are in progress.
- The producer of a missing artifact cannot be determined (e.g., a file appeared on disk but doesn't match the registry — flag it as orphaned).
When stopping, output:
- the blocker
- the affected artifact
- recommended next skill
Handoff
After producing the completeness report:
- If the verdict is PASSED → hand off to
quality-assurance-auditor. - If anything is MISSING / INSUFFICIENT / STALE → hand off to
workflow-orchestratorwith the repair list.
Relationship to the audit layer
This skill is one of three independent auditors. They run in this order before final assembly:
consistency-auditor— cross-media facts match (numbers, files, symbols, parameters).completeness-auditor(this skill) — every claimed completion has a substantive artifact on disk.quality-assurance-auditor— workflow completeness, three critical rules, anti-fabrication, paper quality.
Each is independent; each can fail. Any failure blocks final assembly.
Examples
Example 1: Everything OK
Input: progress dashboard says Q1 at state 09 (Ready for Writer); all artifacts on disk; every review file has ≥ 5 pass items; nothing stale.
Output: 8 pass items listed; 0 missing; 0 insufficient; verdict PASSED → quality-assurance-auditor.
Example 2: Reviewer file missing
Input: code/Q2/q2_main.py exists; modeler said "Python code passed review"; but code/Q2/reviews/q2_python_review.md does not exist on disk.
Output:
| 1 | python-code-reviewer | code/Q2/reviews/q2_python_review.md | python-code-reviewer | BLOCKING |
Verdict: FAILED. The producer skill must actually write the review file — verbal "passed" doesn't count.
Example 3: Pass items too thin
Input: robustness/Q1/q1_robustness_report.md exists but only says "robustness OK".
Output:
| 1 | robustness/Q1/q1_robustness_report.md | 0 (only "robustness OK", not concrete) | 5 | robustness-checker |
Repair: robustness-checker must re-run and list ≥ 5 specific checks that passed (e.g., "weight ±10% — top 3 ranking unchanged", "removed indicator X — TOPSIS score shifted by 0.02 only", etc.).
Example 4: Stale frozen snapshot
Input: frozen_numbers.json was created on 2026-05-10; code/Q3/q3_main.py was modified on 2026-05-15 after a bug fix.
Output:
| 1 | results/Q3/reports/frozen_numbers.json | 2026-05-10 14:22 | 2026-05-15 09:11 | solution-package-builder |
Repair: solution-package-builder must re-freeze after the modeler confirms the new numbers are canonical. Also: consistency-auditor will likely flag the paper as drifted — both audits should run in sequence.
Example 5: Not applicable
Input: project uses Python only; no .m files anywhere.
Output:
| matlab-code-reviewer | n/a | not_applicable | — | no MATLAB files in project |
Not counted as MISSING.