review-panel - SKILL.md Agent Skill

name: review-panel description: "Assembles a review panel of specialist agents matched to the code or artifact being reviewed, each evaluating against domain-specific rubrics, then synthesizes findings into a consolidated report. Called by review-loop for /legion:review. Use when the user needs multi-reviewer feedback, peer review simulation, rubric-based code evaluation, or expert critique from multiple perspectives." triggers: [review, panel, expert, opinion, advisory, evaluate, peer review, code review, multi-reviewer, critique, rubric evaluation, get feedback] token_cost: medium summary: "Assembles specialist review panels from the agent pool, each reviewer evaluating against domain-specific rubrics. Produces a synthesized consolidated report. Called by review-loop for /legion:review."

Review Panel

Composes context-aware multi-perspective review teams from the 48-agent pool (roster defined in CLAUDE.md Division table; count computed at runtime from agents/ directory). Each reviewer evaluates through domain-specific weighted rubrics with non-overlapping criteria. Produces a synthesized consolidated report.

Used by /legion:review when panel mode is selected. Replaces the static phase-type-to-agent mapping with dynamic selection via agent-registry recommendation algorithm.

Quick Start

Three phases: Compose (Section 1) → Review (agents evaluate independently against rubrics from Section 2) → Synthesize (Section 3 deduplicates, filters, and produces consolidated report). Intent filtering narrows scope when flags like --just-security are passed.

Section 1: Panel Composition Algorithm

How to assemble a review panel dynamically based on what's being reviewed.

1.2 Intent-Based Panel Filtering

When review command specifies intent (e.g., --just-security):

Load Intent Domains
- From intent-teams.yaml security-only template
- Domains: security, owasp, stride, authentication, authorization
Filter Agents by Domain
- For each candidate agent:
  - Check agent's primary domains in authority-matrix.yaml
  - Include only if domains overlap with intent domains
- Example: engineering-security-engineer (security domain) → INCLUDE
- Example: engineering-frontend-developer (ui domain) → EXCLUDE
Override Recommendations
- If intent mode: Use intent template agents, skip registry recommendation
- If full mode: Use existing recommendation algorithm
Domain-Only Review Scope
- Agents review ONLY their intent-matching domains
- Non-security findings deferred or excluded
- Prevents out-of-domain opinions (already filtered by authority in Phase 37)

Input: Phase CONTEXT.md content, SUMMARY.md content, files_modified list
Output: Ordered list of 2-4 reviewer agents with assigned rubrics

Step 1: Extract review signals from phase artifacts
  Read the phase CONTEXT.md and all SUMMARY.md files. Extract:
  - Primary domains touched (engineering, design, marketing, testing, product, infrastructure)
  - File types produced (.md, .ts, .js, .css, .py, config files, etc.)
  - Keywords from phase goal and task descriptions
  - Combine into a composite task description for the recommendation algorithm

Step 1.0: Preconditions Verification (MANDATORY before composition)

  Before running the scoring and filtering steps below, verify all preconditions. Missing
  preconditions trigger documented degraded behavior, not silent success.

  a) **agent-registry.md Section 3 structure** — read `skills/agent-registry/SKILL.md`.
     Verify Sections 1 through 6 exist as named headings. If Section 6 (Memory Boost) is
     absent: **skip the memory boost step below** and log WARN:
     `"agent-registry Section 6 not found. Skipping memory-based agent score boost."`
     Do not fail the review — proceed without the boost.

  b) **OUTCOMES.md schema** — if `.planning/memory/OUTCOMES.md` exists, verify it conforms
     to the memory-manager schema (required front-matter keys: `task_type`, `outcome`,
     `agent_id`). If malformed: skip memory boost with WARN; do not fail.

  c) **Agent frontmatter — `division` field** — every candidate agent .md file must have a
     `division` key in its frontmatter. If absent on any candidate: log WARN and exclude
     that candidate from the panel. Do not guess the division from the filename prefix.

  d) **Review-capability declaration** — an agent is eligible for the panel iff its
     frontmatter has a non-empty `review_strengths` array. This is the canonical marker;
     do NOT infer eligibility from specialty substring matches. See Step 3 below.

Step 2: Score agents using agent-registry Section 3
  Pass the composite task description to the agent-registry recommendation algorithm:
  - Step 1: Parse extracted keywords as task terms
  - Step 2: Match agents — exact (3 pts), partial (1 pt), division (2 pts)
  - Step 3: Rank by score, break ties by specificity
  - Step 6 (optional, per precondition 1.0.a): Apply memory boost if OUTCOMES.md exists
    AND agent-registry Section 6 is present AND OUTCOMES.md is schema-conformant.

Step 3: Filter to review-capable agents
  From the ranked list, keep only agents whose frontmatter declares a non-empty
  `review_strengths` array (the declarative eligibility marker established in Step 1.0.d).

  **Reference roster — current review-capable agents** (verify against agents/ directory
  at runtime; this list is informational, not authoritative):

  Testing division (6 agents — verified against CLAUDE.md):
  - testing-qa-verification-specialist, testing-api-tester
  - testing-workflow-optimizer, testing-performance-benchmarker
  - testing-test-results-analyzer, testing-tool-evaluator

  Design division (review-capable):
  - design-brand-guardian, design-ux-architect, design-ux-researcher

  Engineering division (review-capable):
  - engineering-senior-developer, engineering-backend-architect
  - engineering-frontend-developer, engineering-infrastructure-devops

  Product division (review-capable):
  - product-sprint-prioritizer, product-feedback-synthesizer

  Project Management (review-capable):
  - project-manager-senior, project-management-project-shepherd

  **Note:** `testing-evidence-collector`, `engineering-devops-automator`, and
  `marketing-content-creator` do NOT exist in the Legion 48-agent roster. They MUST NOT
  appear in this list. If found in a prior version of this skill file, treat as a doc bug
  and verify removal. Canonical agent IDs are in CLAUDE.md Division table.

  If an agent from a non-review-capable role (e.g., a marketing-platform specialist without
  review_strengths, or an XR-immersive role) scores highly, skip it and take the next
  eligible agent.

Step 4: Cap panel size and enforce diversity
  - 2 reviewers: single-domain phase (only one division touched)
  - 3 reviewers: standard phase (2 divisions involved)
  - 4 reviewers: cross-domain phase (3+ divisions involved)

  Diversity rule: no more than 2 reviewers from the same division.
  If 3+ agents from Testing score highest, keep the top 2 and pull the
  next-highest from a different division.

  Mandatory: at least one Testing division agent on every panel.
  This inherits from agent-registry Step 5 (Mandatory Roles) and from
  the existing review-loop principle that testing-qa-verification-specialist is
  always included.

Step 5: Assign rubrics
  For each selected reviewer, look up their domain rubric from Section 2.
  The rubric is keyed by agent ID. If no specific rubric exists for an agent,
  use the division default rubric.

Step 6: Present panel to user for confirmation
  Display the composed panel via AskUserQuestion:

  ## Phase {N}: {phase_name} — Review Panel

  **Panel Size**: {count} reviewers (based on {domain_count} domain(s) detected)
  **Domains Detected**: {comma-separated domains}

  | # | Agent | Division | Rubric Focus | Score |
  |---|-------|----------|--------------|-------|
  | 1 | {agent-id} | {division} | {rubric_name} | {score} pts |
  | 2 | {agent-id} | {division} | {rubric_name} | {score} pts |

  **Why this panel**: {1-sentence rationale linking detected domains to selected agents}

  Options:
  - "Use this panel" (Recommended)
  - "Add a reviewer" — add one more agent from the ranked list (up to max 4)
  - "Replace a reviewer" — swap one reviewer for an alternative
  - "Other" — enter custom agent IDs

  If user selects "Add a reviewer": show next-ranked eligible agent, confirm
  If user selects "Replace a reviewer": show which reviewer to replace and alternatives
  If user selects "Other": accept custom agent IDs, validate each exists, assign default rubric

Panel Output Contract (consumed by review-loop dispatch)

review-panel does NOT spawn agents. It emits a composition artifact that review-loop Section 2 consumes and review-loop Section 3/4 dispatches. The contract below is the ONLY surface review-loop reads.

Emitted artifact: in-memory object with shape:

panel:
  reviewers:          # ordered list, length 2-4 per Step 4
    - agent_id: {string — must match filename in agents/}
      division: {string}
      rubric_name: {string — matches Section 2 rubric key}
      score: {integer}
      rationale: {string}
  domains_detected: [{string}, ...]
  mode: "full" | "security-only" | "design-only" | "marketing-only"
  confirmed_by_user: true

Dispatch handoff to review-loop:

Field	Value
When	AFTER Step 6 user confirmation succeeds (`confirmed_by_user: true`). If user declines all options: emit `<escalation severity=warning type=scope>` and return empty panel — do NOT fabricate a panel.
Why no dispatch here	review-panel is pure composition. Spawning is the responsibility of review-loop Section 3 Step 4 (review agents) and Section 5 Step 4 (fix agents). Separation prevents double-dispatch bugs when panel is reused across cycles.
How many reviewers	Exactly the count emitted in `panel.reviewers` (2-4 per Step 4 cap). review-loop must not add, drop, or reorder reviewers without re-invoking review-panel.
Mechanism	Return value to review-loop caller. review-loop reads `panel.reviewers[*].agent_id` and treats each as input to Section 3 prompt construction, then Section 3 Step 4 dispatch spec.

Preconditions (verify before emitting artifact):

Step 1.0 preconditions passed.
Step 3 review-capability filter produced ≥ 1 eligible agent.
Step 4 cap produced count(reviewers) ∈ {2, 3, 4}.
Step 5 rubric assignment succeeded for every reviewer (no silent default).
Step 6 user confirmation returned a closed-set option (never free text without validation).

If any precondition fails: emit <escalation severity=blocker type=quality> and return empty panel. Do NOT return partial panel.

Review Conduct Rules

These rules are injected into every review agent's prompt to ensure rigorous, actionable feedback.

Mandatory Conduct

Verify before implementing — Confirm feedback is technically correct for THIS codebase before acting on it. Grep for the function, read the file, check the dependency.
Pushback is expected — Reviewers MUST challenge when: suggestion would break existing functionality, reviewer lacks context for the change, recommendation violates YAGNI, feedback is technically incorrect for the runtime/framework in use, or suggestion conflicts with prior architectural decisions.
No performative agreement — Do NOT use phrases like "great point!", "you're absolutely right!", "excellent suggestion!". State agreement factually: "Confirmed: the null check is missing."
Specificity required — Every finding MUST include: file path and line number, what is wrong, why it matters, and how to fix it. Findings without all four elements are rejected.
Severity accuracy — Do NOT mark nitpicks (formatting, naming preferences) as BLOCKER. BLOCKER is reserved for: crashes, data loss, security vulnerabilities, broken builds. WARNING is for logic errors, missing edge cases, performance issues. INFO is for style, conventions, suggestions.
Clear verdict mandatory — Every review MUST end with an explicit verdict: "Ready to merge? Yes / No / With fixes". No ambiguous conclusions.

Section 2: Domain Rubric Registry

Non-overlapping evaluation criteria for each reviewer specialty. Each rubric defines what that reviewer checks — and implicitly, what they do NOT check (that's another reviewer's rubric).

How Rubrics Work

A rubric is a set of 3-5 evaluation criteria injected into the review prompt alongside the reviewer's personality. Each criterion has a name and description. The reviewer evaluates ONLY against their assigned criteria, producing findings scoped to their domain lens.

Rubrics are injected into the review prompt as an additional section AFTER the standard review instructions from review-loop Section 3 and BEFORE the "Required Feedback Format" section:

## Your Domain Rubric — {rubric_name}

Evaluate ONLY against these criteria. Other aspects are covered by fellow panel reviewers.

| # | Criterion | What to Check |
|---|-----------|---------------|
| 1 | {name}    | {description} |
| 2 | {name}    | {description} |
...

For each finding, tag it with the criterion number: "**Criterion**: {N} — {criterion_name}"

## Confidence Requirement

For EVERY finding, you MUST rate your confidence:
- **HIGH (80-100%)**: Certain this is a real issue based on evidence in the code/content
- **MEDIUM (50-79%)**: Suspect an issue but can't fully confirm — flag it but it may be deferred
- **LOW (<50%)**: Uncertain — do NOT report this finding

Only HIGH-confidence findings are actioned. Rate conservatively — a false positive wastes more time than a missed MEDIUM finding that surfaces in the next review.

Include in each finding: `- **Confidence**: {HIGH | MEDIUM | LOW} — {percentage}%`

Rubric Definitions

Testing Division

testing-qa-verification-specialist — Production Readiness

#	Criterion	What to Check
1	Error handling	Edge cases covered, failures degrade gracefully, no silent swallowing
2	Real-world usage	Works outside happy path — unexpected input, missing state, concurrent access
3	Stability	No crashes, hangs, or resource leaks under normal operation
4	Integration correctness	Cross-file references resolve, dependencies exist, imports work

testing-test-results-analyzer — Verification & Test Quality

#	Criterion	What to Check
1	Proof artifacts	Test files, verification scripts, or documented test runs exist for claims
2	Coverage breadth	All success criteria have corresponding verification, not just some
3	Assertion quality	Tests check meaningful behavior, not just "no crash"; before/after documented
4	Reproducibility	Verification steps can be repeated by someone else with same results

testing-api-tester — API Contract Compliance

#	Criterion	What to Check
1	Endpoint correctness	Routes, methods, status codes match specification
2	Request/response validation	Payloads conform to schema, required fields present, types correct
3	Security boundaries	Auth required where expected, no unprotected sensitive endpoints
4	Error responses	API returns structured errors, not stack traces or generic 500s

testing-workflow-optimizer — Process Efficiency

#	Criterion	What to Check
1	Workflow correctness	Steps execute in right order, no dead ends or unreachable paths
2	Redundancy	No duplicated logic, unnecessary steps, or circular dependencies
3	Automation opportunity	Manual steps that could be automated are flagged
4	Handoff clarity	Inputs and outputs between workflow steps are well-defined

testing-performance-benchmarker — Performance Characteristics

#	Criterion	What to Check
1	Resource efficiency	No unnecessary file reads, API calls, or memory allocation
2	Scalability indicators	Approach handles growth (more agents, larger files, more phases)
3	Bottleneck risk	Sequential operations that could be parallelized
4	Cost awareness	Agent spawn count, model tier usage aligned with cost profile convention

testing-tool-evaluator — Dependency & Tool Health

#	Criterion	What to Check
1	Dependency health	Referenced tools, libraries, or services are current and maintained
2	Compatibility	Dependencies work together, no version conflicts or breaking changes
3	License compliance	No restrictive licenses that conflict with project distribution
4	Alternatives considered	Better tools exist for the job and weren't evaluated

Design Division

design-brand-guardian — Brand Consistency

#	Criterion	What to Check
1	Visual identity	Colors, typography, spacing follow brand guidelines
2	Voice and tone	Copy and messaging match brand personality
3	Component consistency	UI elements use shared design tokens, not ad-hoc values

design-ux-architect — Accessibility & Structure

#	Criterion	What to Check
1	WCAG compliance	Contrast ratios, keyboard navigation, screen reader support
2	Information architecture	Content is logically organized, navigation is intuitive
3	Semantic structure	HTML is semantic, headings are hierarchical, landmarks are correct

design-ux-researcher — Usability

#	Criterion	What to Check
1	Nielsen's heuristics	Visibility of status, user control, consistency, error prevention
2	User flow completeness	Can users complete the intended task without getting stuck?
3	Cognitive load	Interface doesn't overwhelm — progressive disclosure, clear hierarchy

Engineering Division

engineering-senior-developer — Code Architecture

#	Criterion	What to Check
1	Pattern consistency	New code follows established project patterns and conventions
2	Abstraction quality	Right level of abstraction — not over-engineered, not duplicated
3	Maintainability	Code is readable, changes are localized, dependencies are explicit
4	Tech debt	New code doesn't introduce unnecessary technical debt

engineering-backend-architect — Backend Design

#	Criterion	What to Check
1	Data modeling	Data structures are appropriate, relationships are clear
2	API design	Endpoints are RESTful/consistent, versioning considered
3	Infrastructure patterns	Deployment, scaling, and monitoring are addressed

engineering-frontend-developer — Frontend Quality

#	Criterion	What to Check
1	Component structure	Components are reusable, props are typed, state is minimal
2	Rendering correctness	No unnecessary re-renders, loading states handled, errors caught
3	Responsive design	Layout works across viewport sizes, touch targets are adequate

engineering-infrastructure-devops — Operational Readiness

#	Criterion	What to Check
1	CI/CD integration	Changes are testable in pipeline, no manual deployment steps
2	Configuration management	Secrets are not hardcoded, env-specific config is externalized
3	Monitoring & logging	Errors are observable, metrics are available for alerting

Product Division

product-sprint-prioritizer — Prioritization Alignment

#	Criterion	What to Check
1	Business value	Implementation delivers the intended user/business value
2	Scope discipline	No scope creep beyond what was planned
3	Dependency management	Cross-phase and external dependencies are tracked and resolved

product-feedback-synthesizer — User Alignment

#	Criterion	What to Check
1	User need coverage	Implementation addresses the core user pain point or request
2	Feedback incorporation	Known user feedback was considered in the implementation
3	Satisfaction drivers	The output is likely to improve user satisfaction

Project Management Division

project-manager-senior — Delivery Management

#	Criterion	What to Check
1	Scope completeness	All planned deliverables are present
2	Risk mitigation	Known risks were addressed or documented
3	Documentation	Handoff documentation is sufficient for the next phase

project-management-project-shepherd — Process Compliance

#	Criterion	What to Check
1	Methodology adherence	Work followed the established workflow (plan → build → review)
2	Artifact quality	State files, summaries, and plans are complete and accurate
3	Handoff readiness	Next phase can start without ambiguity about what was done

Division Default Rubrics

For agents not listed above, use the default rubric for their division:

Division	Default Rubric	Criteria
Testing	General QA	Correctness, completeness, consistency
Design	General Design Review	Visual quality, usability, accessibility
Engineering	General Code Review	Correctness, patterns, maintainability
Product	General Product Review	Value delivery, scope, alignment
Project Management	General Delivery Review	Completeness, documentation, handoff

Section 3: Panel Result Synthesis

How to consolidate findings from multiple panel reviewers into a unified report.

After all panel reviewers submit findings (per adapter.collect_results):

Step 1: Collect and parse
  Same as review-loop Section 4 (Feedback Collection):
  - Parse Finding blocks from each reviewer
  - Record for each finding:
    - Finding number
    - File path
    - Line/section reference
    - Severity (BLOCKER, WARNING, or SUGGESTION)
    - Confidence (HIGH, MEDIUM, or LOW with percentage)
    - Issue (one-sentence description)
    - Suggested fix
    - Reviewer agent ID
    - Criterion tag (from rubric)

Step 2: Deduplicate findings by location and severity

Location-based deduplication:
1. Parse location from each finding:
   - Format: "{file_path}:{line_number}" or "{file_path}:{start_line}-{end_line}"
   - If no line number: use file_path only
   
2. Group findings by normalized location:
   - Normalize: resolve relative paths, lowercase on case-insensitive FS
   - Group key: "{normalized_path}:{line}"
   
3. For each location group with multiple findings:
   a. Keep the finding with HIGHEST severity:
      Priority: BLOCKER > WARNING > SUGGESTION
   b. If same severity: keep HIGHEST confidence
   c. If same severity and confidence: merge descriptions
   d. Tag merged finding: "Consolidated from {N} reviewers"

Severity escalation rules:
- Same issue, different severity from reviewers → escalate to highest
- Example: Reviewer A says WARNING, Reviewer B says BLOCKER → result: BLOCKER
- Reason: Err on side of caution — if any reviewer considers it blocking, it is

Line range overlap detection:
- Finding 1: src/auth.ts:45-52
- Finding 2: src/auth.ts:50-60
- Overlap at lines 50-52 → treat as same location, merge

Preservation rules:
- ALWAYS preserve at least one finding per location
- NEVER discard all findings for a location
- When merging: concatenate reviewer IDs, keep all suggested fixes as options

Step 2.5: Generate deduplication report
  After deduplication, report statistics:
  
  ### Deduplication Summary
  | Metric | Count |
  |--------|-------|
  | Raw findings (before dedup) | {N} |
  | Unique locations | {M} |
  | Findings merged | {N - M} |
  | Severity escalations | {K} |
  
  ### Merged Findings
  | Location | Original Severity | Final Severity | Reviewers |
  |----------|-------------------|----------------|-----------|
  | src/x.ts:45 | WARNING → BLOCKER | BLOCKER | A, B |
  | src/y.ts:30 | SUGGESTION | SUGGESTION | C, D |

Step 2.6: Filter by confidence
  - HIGH-confidence findings (80%+): include in synthesis
  - MEDIUM-confidence findings (50-79%): collect into "Deferred" section
  - LOW-confidence findings: discard
  - When deduplicating findings with different confidence levels from different
    reviewers, keep the HIGHER confidence rating (if one reviewer is HIGH and
    another is MEDIUM on the same finding, it's HIGH)

Step 3: Filter out-of-domain critiques

Input: Findings[] (after deduplication), Active panel agents[]
Output: Filtered findings[] with out-of-domain critiques removed

Algorithm:
1. Build domain ownership map from active panel agents:
   - For each agent in panel:
     - Load exclusive_domains from authority matrix
     - Create mapping: domain → owning_agent_id

2. Detect domain for each finding:
   - From finding.criterion tag (e.g., "security", "performance")
   - From finding description keywords:
     - Security keywords: "auth", "encrypt", "sanitize", "injection", "xss", "csrf"
     - Performance keywords: "slow", "cache", "optimize", "memory", "cpu"
     - API keywords: "endpoint", "route", "request", "response", "status code"
     - Accessibility keywords: "aria", "screen reader", "contrast", "keyboard"
   - Default: "general" (no domain owner)

3. Apply filtering rules:
   
   Rule 1: Domain owner present → filter out-of-domain
   - If finding.domain has an owner in active panel
   - AND finding.reviewer != owner
   - THEN: Discard finding
   - Log: "Filtered: {reviewer} critique on {domain} — {owner} is domain authority"
   
   Rule 2: No domain owner → allow all critiques
   - If finding.domain has no owner in active panel
   - THEN: Keep finding
   - Reason: No authority to defer to, general critique allowed
   
   Rule 3: Owner critiquing own domain → always allow
   - If finding.reviewer == owner
   - THEN: Keep finding
   - Note: Owner's findings are authoritative

4. Special cases:
   - Multiple owners for overlapping domains: Use most specific match
   - Finding spans multiple domains: Split into separate findings per domain
   - Domain detection uncertain (confidence < 70%): Keep finding, flag as "uncertain domain"

   Example:

Panel: [engineering-security-engineer, engineering-senior-developer, design-ux-architect] Findings:

engineering-security-engineer: src/auth.ts:45 "Missing input sanitization" [BLOCKER, security]
engineering-senior-developer: src/auth.ts:45 "Auth logic should use bcrypt" [WARNING, security]
engineering-senior-developer: src/auth.ts:50 "Variable naming unclear" [SUGGESTION, general]
design-ux-architect: src/auth.ts:45 "Form lacks aria-label" [WARNING, accessibility]

After filtering:

Keep: engineering-security-engineer (owner of security domain)
Filter: engineering-senior-developer on security domain (owner present)
Keep: engineering-senior-developer on general domain (no owner)
Keep: design-ux-architect on accessibility domain (owner present, is owner)


---

## Step 2.7: INTENT FILTERING (conditional)

If REVIEW_MODE === "security-only":

1. **Filter Findings by Domain**
- For each finding from all agents:
  - Check finding.category or finding.domain
  - Include only if matches security domains:
    - security, owasp, stride, authentication, authorization
    - vulnerability, injection, xss, csrf, etc.
- Exclude: performance, accessibility, code-style, ui-ux findings

2. **Apply Authority Filtering**
- Per AUTH-04 from Phase 37
- If engineering-security-engineer provided finding → KEEP
- If testing-api-tester provided security finding → KEEP
- If design-ui-designer provided security opinion → DISCARD (out-of-domain)

3. **Generate Intent Filter Report**
```markdown
## Intent Filtering Report

**Mode:** security-only
**Domains:** security, owasp, stride, authentication, authorization

**Filtering Applied:**
- Raw findings: 47
- Security domain findings: 12
- After authority filtering: 10
- Excluded: 37 (non-security domains)

Proceed with filtered findings to Step 3 (Deduplication)

Step 4: Group by domain lens Organize findings by each reviewer's rubric focus area.

Note: Some findings may have been filtered in Step 3. The domain lens grouping only includes findings that passed authority filtering.

{Rubric Name} — {agent-id} (Domain Authority: {domains})

Verdict: {PASS | NEEDS WORK | FAIL} Authority Note: {agent-id} is domain authority for {domains} — findings are authoritative

#	Severity	File	Criterion	Issue
1	BLOCKER	path	{criterion_name}	{issue}

Step 5: Identify cross-cutting themes Scan across all domain groupings for patterns:

Multiple reviewers flagging the same file (from different criteria) → "Hot spot"
Findings clustering around the same success criterion → "Criterion at risk"
All reviewers passing a particular area → "Strong area"

Cross-Cutting Themes

Hot spots: Files flagged by 2+ reviewers: {file list with finding counts}
Criteria at risk: Success criteria with 2+ findings against them: {list}
Strong areas: Aspects with no findings from any reviewer: {list}

Step 6: Compute aggregate verdict

PASS: No BLOCKERs, no WARNINGs, all reviewers gave PASS
NEEDS WORK: Has BLOCKERs or WARNINGs, at least one reviewer gave NEEDS WORK
FAIL: Any reviewer gave FAIL, or 3+ BLOCKERs across reviewers

Step 7: Produce consolidated report Display the synthesis to the user:

Review Panel Synthesis — Phase {N}: {phase_name}

Panel: {count} reviewers across {domains} domain(s) Aggregate Verdict: {PASS | NEEDS WORK | FAIL}

Summary

Metric	Count
Total findings	{N}
Blockers	{N}
Warnings	{N}
Suggestions	{N}

{Domain lens groupings from Step 4}

{Cross-cutting themes from Step 5}

Panel Verdicts

Reviewer	Rubric Focus	Verdict	Key Finding
{agent-id}	{rubric_name}	{verdict}	{most critical finding or "No issues"}

Deferred Findings (MEDIUM Confidence)

{count} findings were flagged at MEDIUM confidence (50-79%) and excluded from the actionable report. These may warrant review if HIGH-confidence findings are sparse or if the user requests the full report.

#	Confidence	Reviewer	File	Issue
1	MEDIUM (65%)	{agent-id}	path	{issue}

Authority Filtering Report

Metric	Count
Findings before filtering	{N}
Out-of-domain critiques filtered	{M}
Domain owner findings kept	{K}
General domain findings kept	{L}

Filtered Findings (for reference)

Reviewer	File	Issue	Filtered Because
engineering-senior-developer	src/auth.ts:45	"Auth logic"	engineering-security-engineer is domain owner

The aggregate verdict and must-fix list then feed back into the standard review-loop cycle (Section 5: Fix Cycle if NEEDS WORK, Section 7 if PASS, Section 8 if escalated after 3 cycles).

3.6 Security-Only Output Generation

If REVIEW_MODE === "security-only":

Write: .planning/security-review-{timestamp}.md
Include: Security findings only, prioritized by severity
Format: Standard finding blocks with OWASP/STRIDE categorization
Cross-reference: Map each finding to OWASP Top 10 category and STRIDE threat type


---

## References

This skill extends patterns defined in:

| Pattern | Source | Used In |
|---------|--------|---------|
| Recommendation Algorithm | agent-registry.md Section 3 | Section 1 (panel composition) |
| Review Prompt Construction | review-loop.md Section 3 | Section 2 (rubric injection point) |
| Feedback Collection | review-loop.md Section 4 | Section 3 (synthesis dedup) |
| Mandatory Roles | agent-registry.md Section 3, Step 5 | Section 1, Step 4 (testing agent required) |
| Memory Boost | agent-registry.md Section 3, Step 6 | Section 1, Step 2 (optional scoring boost) |