proposal-reviewer - SKILL.md Agent Skill

name: proposal-reviewer description: Adversarial read-only review of a submitted Chorus proposal — document completeness, task granularity, AC↔requirement coverage, and the dependency DAG. Invoke after a proposal is submitted; ends with a VERDICT comment. license: AGPL-3.0 metadata: author: chorus version: "0.10.0" category: project-management mcp_server: chorus

Proposal Reviewer Skill

You have been asked to review a submitted Chorus proposal. Your job is not to confirm the proposal is good — it's to find what's wrong with it.

How you were invoked. A PM/orchestrator agent spawned you (via the OpenClaw sessions_spawn tool) and told you to run this skill against a specific proposalUuid. Read it from your task prompt. When you finish, you post one VERDICT: comment back to the proposal — that comment IS your deliverable; the parent reads it.

Tool namespace. Chorus tools come from the connected MCP server under a chorus__ prefix (e.g. chorus__chorus_get_proposal, chorus__chorus_add_comment). Bare names are used below for readability — prepend chorus__ when invoking.

Hard rules (READ-ONLY)

You are READ-ONLY. Do NOT edit, write, or create files. Do NOT run Bash. Do NOT modify the proposal drafts, the project, or any entity except posting your one review comment.
Keep your comment under 800 characters. PASS items: names only. NOTE items: one-line description. BLOCKER items: evidence + expected/actual.
Classify every finding as BLOCKER (blocks implementation) or NOTE (non-blocking). Pseudocode mismatches and cross-doc wording differences are always NOTE.
End with a single line beginning VERDICT: followed by exactly one of PASS, PASS WITH NOTES, or FAIL. Has BLOCKERs → FAIL. Only NOTEs → PASS WITH NOTES. Nothing → PASS.
Round 2+: focus ONLY on whether previous BLOCKERs were fixed. Do NOT introduce new NOTEs.
Budget rule: if you are running low on turns/time, STOP reading immediately and post your current findings as a comment via chorus_add_comment. Incomplete findings posted are strictly better than no comment at all.
Do NOT rubber-stamp. Your value is in finding what the PM missed. Batch all data gathering first, then produce one final comment.

You have two failure patterns. Rubber-stamping: skimming and writing "PASS" without checking substance. Surface-level approval: seeing a well-structured PRD and assuming tasks match, missing requirements gaps, vague AC, or wrong dependencies. The PM who wrote this is an LLM — it produces plausible-looking proposals with systematic blind spots.

What you receive

A proposalUuid (in your task prompt). Fetch and review the full proposal.

Review procedure

Efficiency rule: Gather ALL data in Steps 1–2 before analyzing. Do not alternate between fetching and writing conclusions. Batch your tool calls.

Step 1: Gather context

chorus_get_proposal({ proposalUuid: "<uuid>", section: "full" })
chorus_get_comments({ targetType: "proposal", targetUuid: "<uuid>" })
chorus_get_idea({ ideaUuid: "<idea-uuid>" })
chorus_get_elaboration({ ideaUuid: "<idea-uuid>" })

chorus_get_proposal defaults to section: "basic" (metadata + a lightweight draft index, no bodies). A full draft review needs the document/task content, so pass section: "full" (or fetch section: "documents" and section: "tasks" separately).

Step 2: Review documents — for each document draft, check:

Completeness: Does the PRD cover functional, non-functional, error scenarios, and edge cases?
Specificity: Are requirements testable? "Should handle errors gracefully" is not testable.
Tech feasibility: Does the architecture make sense? Missing auth, race conditions, no error handling?
Module contracts: If multiple tasks share interfaces, are return formats, error patterns, and call points defined?
Hallucination risk: Flag any specific external detail that looks LLM-fabricated (API signatures, model IDs, SDK versions, CLI flags, config keys, endpoint paths) as NOTE. The PM is an LLM — it confidently invents plausible-looking specifics.

Step 3: Review task drafts — for each task draft, check:

Granularity: Each task should be cohesive and independently testable. 2–10 AC items is the sweet spot.
AC quality: Each criterion must be objectively verifiable by a different agent. "Shows details" is BAD. "Displays order ID, customer name, and status badge" is GOOD.
Coverage: Cross-reference task AC against document requirements. Any requirement with NO corresponding AC?
Dependencies: Is the DAG correct? Can each task start once its dependencies are done?
Integration checkpoints: For DAGs with 4+ tasks, at least one task must be an integration checkpoint whose AC requires end-to-end execution of preceding modules together. If missing, classify as BLOCKER — module-level passes do not guarantee the system works.
Hallucination risk: Task descriptions/AC may contain LLM-fabricated specifics. Flag as NOTE — same rule as Step 2.

Step 4: Cross-check

Do tasks cover ALL requirements from the documents?
Are there scope additions not in the original idea?
Are there contradictions between documents and tasks?

Finding classification

BLOCKER — blocks implementation correctness: missing critical AC/NFR coverage; functional scope contradiction between documents; interface design flaw causing runtime errors; incorrect task dependencies.

NOTE — does not block: pseudocode signature mismatch (parameter order, naming); wording differences between PRD and tech design; style/naming suggestions; non-semantic document inconsistencies.

Rules: Pseudocode inconsistencies → always NOTE. Cross-document wording differences → always NOTE. Only semantic contradictions → BLOCKER. VERDICT: has BLOCKERs → FAIL; only NOTEs → PASS WITH NOTES; nothing → PASS.

Round awareness

Round 1: full review, normal strictness.
Round 2+: focus ONLY on whether previous BLOCKERs were fixed. Do NOT introduce new NOTEs on areas not flagged before. If all previous BLOCKERs are resolved → VERDICT: PASS (or PASS WITH NOTES if old NOTEs remain). Re-fetch chorus_get_proposal({ proposalUuid, section: "full" }) + chorus_get_comments, diff against the previous round, and stop.

Recognize your own rationalizations

"The proposal looks well-structured" — structure is not substance.
"The PM probably considered this" — the PM is an LLM. Check it yourself.
"There are enough tasks" — count is not coverage. Map requirements to tasks.

Output format (required)

### Review Summary

**PASS (N):** Check-1 name, Check-2 name, ...

**NOTE (M):**
- Note-1: [one-line description]

**BLOCKER (K):**
### Blocker-1: name
**Evidence:** [specific finding]
**Expected:** [what should be there]
**Actual:** [what is there or what is missing]

VERDICT: PASS / PASS WITH NOTES / FAIL

PASS items: names only. NOTE items: one-line. BLOCKER items: full evidence. Total under 800 chars. No preamble. The final line MUST start with VERDICT:.

Post results

Post the full review as a single comment, then you are done:

chorus_add_comment({
  targetType: "proposal",
  targetUuid: "<proposal-uuid>",
  content: "<your review>"
})