name: p1-spec-research-policy
description: "Quality criteria, review protocols, naming conventions, artifact format specifications, and checklists for the Phase 1 research pipeline. Pure reference — no orchestration."
user-invocable: false
Phase 1 Research Policy
Core Principles
AskUserQuestion-First
Every ambiguity, design choice, or scope decision MUST be resolved via AskUserQuestion BEFORE proceeding. Do not assume — ask. The cost of asking is low; the cost of a wrong assumption cascades to all later phases.
Structured Interview Protocol (before spec parsing)
Before analyzing spec documents, conduct a structured user interview to understand intent, priorities, and constraints. Ask one question per message — do not batch.
Interview sequence (adapt to context, skip if already clear from spec):
- Goal: "What is the primary purpose of this design? What problem does it solve?"
- Scope: "Which features from the spec are in-scope for this implementation? Any intentional omissions?"
- Constraints: "Target frequency? Area budget? Power envelope? Technology node?"
- Priority: "If trade-offs arise (area vs performance vs power), which takes precedence?"
- Verification: "What is the verification strategy? cocotb/UVM? Formal? Target coverage?"
- Dependencies: "Any existing IP/modules to integrate? Reference models to match?"
Record answers in docs/phase-1-research/design-intent.md. These answers become
the interpretive context for all spec parsing — ambiguous spec language is resolved
using the user's stated intent, not agent assumptions.
Approach Comparison for Open Items
When the spec allows multiple implementation paths (algorithm choices, architecture options, protocol selections), present structured comparisons to the user:
## OPEN-1-NNN: {topic}
| Approach | Pros | Cons | Area Est. | Latency Est. | Recommendation |
|----------|------|------|-----------|-------------|----------------|
| A: {name} | ... | ... | ... | ... | |
| B: {name} | ... | ... | ... | ... | ★ Recommended |
| C: {name} | ... | ... | ... | ... | |
Trade-off summary: {1-2 sentences}
Ask the user to select via AskUserQuestion. Record the choice and its rationale in the resolution_rationale field of open-requirements.json.
Incremental Requirement Approval
Do NOT present all requirements at once. Group by functional area and seek approval in stages:
- Present interface/IO requirements first (ports, protocols, clocks) → user approves
- Present functional requirements by block → user approves per block
- Present performance requirements (timing, throughput, area) → user approves
- Present open items with approach comparisons → user selects
At each stage, the user can correct misinterpretations before they propagate. Only after all stages are approved, finalize iron-requirements.json.
Domain-Consult-First
Actively invoke domain-consult to acquire domain expert knowledge on algorithms, standards, coding tools, filter characteristics, and HW implementation trade-offs. Do not research in isolation. Domain experts provide knowledge; spec-analyst captures results as structured artifacts.
Propose, Do Not Decide
Present algorithm/tool candidates with trade-offs. Let the user make final selections. Architecture-level decisions (pipeline, block partitioning, memory hierarchy) are Phase 2's responsibility. Phase 1 surveys and recommends; Phase 2 designs.
Exhaustive Tree Exploration
Spawn as many agents in parallel as the environment allows (the Final Checklist expects 8-20 leaf plus cross-cutting agents) to explore all solution paths. Every feasible approach must be investigated and compared before committing. Skip ONLY if the user specifies the exact algorithm and architecture; even then, explore at least 2 variants for validation.
Spec Refinement Criteria
AskUserQuestion MUST cover these areas (skip items already provided by user):
- Target codec, profile, level (e.g., H.264 High Profile Level 4.1)
- Target resolution and framerate (e.g., 1080p@60fps, 4K@30fps)
- Encoder, decoder, or both
- Interface protocol (AXI4, AXI4-Lite, APB, custom)
- Clock frequency target and process node (ASIC vs FPGA)
- Feature scope restrictions (e.g., "TQ only", "intra-only")
- Priority trade-off preference (throughput vs area vs power vs quality)
3-Round Chief Review Protocol
Mandatory 3 rounds, coordinated by rtl-architect (domain-agnostic default). If a domain chief exists (e.g., vcodec-chief-standard-expert for video-codec domain), invoke both rtl-architect AND domain chief for domain-specific validation:
- Round 1: Cross-block data flow completeness, dependencies, performance constraints, fixed-point constraints, cross-block issues, [AMBIGUITY]/[CONFLICT] status
  Save: reviews/phase-1-research/research-review-r1.md
- Round 2: Convergence assessment. Rebuttal: spec-analyst accepts/rejects each Round 1 finding with rationale. Even if converged, proceed to Round 3
  Save: reviews/phase-1-research/research-review-r2.md
- Round 3: Mandatory final quality pass. Remaining gaps → escalate via AskUserQuestion
  Save: reviews/phase-1-research/research-review-r3.md
Review criteria per round:
- Data flow: inputs/outputs defined at every block boundary
- Dependencies: which block produces/consumes what data
- Performance: throughput, latency, bandwidth as specific numbers
- Fixed-point: bit widths, rounding modes per block
- Cross-block issues: RDOQ↔Entropy dependency, ME↔MC pipeline, etc.
- Ambiguities: all resolved or promoted to [ARCHITECTURE_DECISION]
User may override round count: "set iterations to N" → N rounds (minimum 1).
Iron/Open Requirement Taxonomy
Phase 1 produces TWO requirement files instead of a single requirements.json:
iron-requirements.json — Settled Requirements (Authority = 1)
Located at docs/phase-1-research/iron-requirements.json. Contains functional and
performance requirements that are binding constraints for ALL downstream phases.
Each iron requirement MUST have:
- "id": "REQ-F-NNN" (functional) or "REQ-P-NNN" (performance), unique and sequential
- "type": "functional" or "performance"
- "description": what the requirement is
- "priority": "must" | "should" | "may"
- "source": {"document": "...", "section": "...", "line": N} for traceability
- "acceptance_criteria": array of measurable criteria (reject vague terms like "should support", "adequate", "sufficient")
- "violation_policy": "user_escalation" (all P1 iron requirements use this)
open-requirements.json — Research Homework for Phase 2
Located at docs/phase-1-research/open-requirements.json. Contains research topics
that Phase 2 must investigate and resolve into architecture decisions.
Each open item MUST have:
- "id": "OPEN-1-NNN", sequential
- "topic": what needs to be investigated
- "context": why this is an open question
- "candidates": array of ≥ 2 candidates (single candidate = not a research topic)
- "evaluation_criteria": metrics Phase 2 should use for comparison
- "related_iron": array of REQ-F/REQ-P IDs that constrain this research
- "resolution_expected": how this should be resolved in Phase 2
Classification Rules
- Functional/performance requirements with clear, measurable acceptance_criteria → iron
- Architecture/implementation choices needing further investigation → open
- Items with ambiguity score > 0.5 → CANNOT become iron until clarified
- A requirement cannot become iron until its ambiguity score passes (reproducibility check)
Iron/Open Classification Verification
After iron/open files are produced, verify:
FAIL conditions (must fix before exit):
- acceptance_criteria contains vague terms ("should support", "adequate", "sufficient")
- open item missing evaluation_criteria
- open item has candidates.length ≤ 1
- iron item missing violation_policy
WARN conditions (log and proceed):
- iron ratio < 30% (most items pushed to open — weakens Phase 1 value)
- open item related_iron is empty
- CONDITIONAL PASS ambiguity axis linked to an iron-classified REQ
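The FAIL and WARN conditions above can be sketched as a small checker. This is only an illustration under stated assumptions, not the pipeline's actual verifier: the name `verify_classification` is hypothetical, the iron ratio is assumed to mean iron / (iron + open), and the CONDITIONAL PASS axis check is left out because the policy does not say how an ambiguity axis is linked to a REQ.

```python
# Vague terms named by the policy; matching is case-insensitive (an assumption).
VAGUE_TERMS = ("should support", "adequate", "sufficient")


def verify_classification(iron, open_items):
    """Return (fails, warns): lists of messages for FAIL and WARN conditions.

    `iron` and `open_items` are lists of dicts shaped like the entries of
    iron-requirements.json and open-requirements.json.
    """
    fails, warns = [], []

    for req in iron:
        rid = req.get("id", "<no id>")
        for criterion in req.get("acceptance_criteria", []):
            if any(term in criterion.lower() for term in VAGUE_TERMS):
                fails.append(f"{rid}: vague acceptance criterion {criterion!r}")
        if "violation_policy" not in req:
            fails.append(f"{rid}: missing violation_policy")

    for item in open_items:
        oid = item.get("id", "<no id>")
        # An empty list is treated the same as a missing field here.
        if not item.get("evaluation_criteria"):
            fails.append(f"{oid}: missing evaluation_criteria")
        if len(item.get("candidates", [])) <= 1:
            fails.append(f"{oid}: fewer than 2 candidates")
        if not item.get("related_iron"):
            warns.append(f"{oid}: related_iron is empty")

    total = len(iron) + len(open_items)
    # Assumed definition: share of iron items among all classified items.
    if total and len(iron) / total < 0.30:
        warns.append("iron ratio below 30%")

    return fails, warns
```

An empty `fails` list would correspond to "no FAIL conditions"; warnings are logged and the phase proceeds.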
Port Naming Conventions (io_definition.json)
- Inputs: i_ prefix (e.g., i_data, i_valid), NOT suffix _i
- Outputs: o_ prefix (e.g., o_result, o_ready), NOT suffix _o
- Bidirectional: io_ prefix (e.g., io_sda)
- Clocks: clk (single domain) or {domain}_clk (e.g., sys_clk), NOT clk_i
- Resets: rst_n (single domain) or {domain}_rst_n (e.g., sys_rst_n), NOT rst_ni
- Single clock domain defaults to sys_clk / sys_rst_n
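The naming rules above lend themselves to a regex check. The sketch below is a hedged example: the kind labels ("input", "output", "inout", "clock", "reset") and the function name `port_name_ok` are hypothetical, since the io_definition.json schema is not shown here, and lowercase snake_case names are assumed.

```python
import re

# One pattern per port kind. Prefix rules come from the policy; the kind
# labels and the lowercase-only character set are assumptions.
_PATTERNS = {
    "input": re.compile(r"^i_[a-z0-9_]+$"),
    "output": re.compile(r"^o_[a-z0-9_]+$"),
    "inout": re.compile(r"^io_[a-z0-9_]+$"),
    # "clk" alone, or "{domain}_clk" with a single-word domain.
    "clock": re.compile(r"^(?:[a-z][a-z0-9]*_)?clk$"),
    # "rst_n" alone, or "{domain}_rst_n".
    "reset": re.compile(r"^(?:[a-z][a-z0-9]*_)?rst_n$"),
}


def port_name_ok(kind, name):
    """Return True if `name` follows the convention for the given port kind."""
    return bool(_PATTERNS[kind].match(name))
```

For example, `port_name_ok("clock", "sys_clk")` accepts the name, while `port_name_ok("clock", "clk_i")` rejects the suffix style.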
Self-Verification Format
Save to reviews/phase-1-research/research-review.md:
# Phase 1 Review: Research Completeness
- Date: YYYY-MM-DD
- Reviewer: spec-analyst
- Upper Spec: specs/
- Verdict: PASS | FAIL
## Feature Coverage Checklist
| Spec Section | Requirement ID | Status |
## Findings
### [severity] Finding-N: ...
## Verdict
PASS | FAIL: [reason]
Spec Feature Completeness Audit
Phase 1 spec analysis MUST enumerate ALL features defined in the specification and track their implementation status throughout the pipeline:
Feature enumeration: Extract every algorithm, mode, format, or capability from the spec
- Example: intra prediction modes, encoding modes, color formats, block sizes
- Assign each feature a REQ-F-* ID in iron-requirements.json
Reference model coverage check (if ref model exists at P1 or provided externally):
- Compare spec feature list against ref model implementation
- enum/define declarations vs actual function implementations
- "Enum declared but function not implemented" → COVERAGE_GAP warning
Gap escalation: When feature coverage < 100%, MUST ask user via AskUserQuestion:
- "Spec defines N features but model implements M. Omitting K features may reduce [quality metric]. Approve omission?"
- User-approved omissions → record in ADR with rationale and impact estimate
- Unapproved omissions → feature stays in iron-requirements as MUST_IMPLEMENT
Documentation: Save docs/phase-1-research/feature-coverage.md:
| Feature | Spec Count | Model Count | Coverage | Status |
|---------|-----------|-------------|----------|--------|
| Intra modes | 8 | 4 | 50% | USER_APPROVED / MUST_IMPLEMENT |
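A row of the feature-coverage table can be derived from the spec and model counts. The helper below is a minimal sketch: `coverage_row` is a hypothetical name, and the "COVERED" status for fully implemented features is an assumption, since the policy only names USER_APPROVED and MUST_IMPLEMENT.

```python
def coverage_row(feature, spec_count, model_count, approved_omission=False):
    """Format one markdown row for feature-coverage.md."""
    if spec_count <= 0:
        raise ValueError("spec_count must be positive")
    coverage = 100 * model_count / spec_count
    if model_count >= spec_count:
        status = "COVERED"  # assumed label, not defined by the policy
    elif approved_omission:
        status = "USER_APPROVED"  # user accepted the gap via AskUserQuestion
    else:
        status = "MUST_IMPLEMENT"  # gap not approved, feature stays in iron
    return f"| {feature} | {spec_count} | {model_count} | {coverage:.0f}% | {status} |"
```

Any row whose coverage is below 100% would trigger the gap escalation question before its status is set to USER_APPROVED.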
Escalation & Stop Conditions
- Spec document not found → report to user, halt
- Conflicting requirements between experts → flag conflict in domain-analysis.md, ask user
- Chief not converged after 3 rounds → escalate remaining gaps to user with specific questions
- Sub-domain expert returns [DOMAIN_UNCERTAINTY] → AskUserQuestion before proceeding
Ambiguity Score Protocol
Every Phase 1 completion MUST include an ambiguity assessment:
- spec-analyst produces Ambiguity_Assessment with per-axis scores
- Ambiguity Gate enforced by both orchestrators:
  - p1-research-orchestrator: Step 7.5
  - p1-research-team-orchestrator: Step 3.5
- Score is recorded in docs/phase-1-research/ambiguity-assessment.md
- Phase 2 entry reads this score; if > 0.3, Phase 2 reviewers prioritize clarifying those axes
The scoring axes are inspired by Ouroboros's AmbiguityScorer pattern:
- Goal Ambiguity (40%): Is the design objective ambiguous? (0.0=clear, 1.0=ambiguous)
- Constraint Ambiguity (30%): Are timing/area/power/protocol constraints missing? (0.0=explicit, 1.0=missing)
- AC Ambiguity (30%): Are acceptance criteria untestable? (0.0=testable, 1.0=untestable)
Scoring: ambiguity_score = 0.4 × goal + 0.3 × constraint + 0.3 × ac (higher = worse)
- ≤ 0.3: PASS — proceed to Phase 2
- > 0.3 and ≤ 0.5: CONDITIONAL PASS — log warnings, Phase 2 reviewers focus on flagged axes
- > 0.5: BLOCK — resolve top ambiguities via AskUserQuestion before proceeding
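The weighted score and its thresholds can be sketched as follows. The weights and cut-offs come from the list above; the function names `ambiguity_score` and `ambiguity_verdict` are illustrative, and a score of exactly 0.3 is assumed to count as PASS and exactly 0.5 as CONDITIONAL PASS.

```python
# Axis weights as stated in the policy: goal 40%, constraint 30%, AC 30%.
WEIGHTS = {"goal": 0.4, "constraint": 0.3, "ac": 0.3}


def ambiguity_score(goal, constraint, ac):
    """Weighted average of the three axis scores, each in [0.0, 1.0]."""
    axes = {"goal": goal, "constraint": constraint, "ac": ac}
    for name, value in axes.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} score must be in [0.0, 1.0], got {value}")
    return sum(WEIGHTS[name] * value for name, value in axes.items())


def ambiguity_verdict(score):
    """Map an overall score to the gate verdict."""
    if score <= 0.3:
        return "PASS"
    if score <= 0.5:
        return "CONDITIONAL PASS"
    return "BLOCK"
```

Floating-point rounding can matter for scores that land right on a threshold, so a real gate may want to round the score to a fixed number of decimals before comparing.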
Adversarial Interpretation Gate (Steps 7.6-7.9)
After ambiguity gate (Step 7.5a) and iron/open verification (Step 7.5b) pass, run adversarial reinterpretation to surface ambiguities the initial analysis missed.
Protocol
- Step 7.6: Spawn adversarial spec-analyst (separate Task, clean context) to challenge iron-requirements.json. References items by source.section, not REQ ID. Output: challenge-report.json in .rat/scratch/stability/phase-1/. Schema: skills/p1-spec-research/templates/challenge-report-schema.json. Budget: max 30 challenges per pass.
- Step 7.7: Present HIGH challenges to user (AskUserQuestion). MEDIUM batched if >10. LOW auto-documented. User may mark challenges as NOT_GENUINE (forced disagreements).
- Step 7.8: Re-run spec-analyst with original spec + clarifications → all 4 canonical artifacts (iron-requirements.json, open-requirements.json, io_definition.json, timing_constraints.json) + self-validation.
- Step 7.9: Gate check + stability report.
Gate Metric
genuine = (HIGH + MEDIUM challenges) - NOT_GENUINE
resolved = RESOLVED + DOCUMENTED
resolution_ratio = resolved / genuine (if genuine == 0: pass)
gate_pass = (all HIGH resolved) AND (resolution_ratio ≥ 0.8)
Gate failure: list unresolved HIGH challenges, loop back to Step 7.7 (max 1 re-loop). After 2nd failure: escalate to user with full divergence report.
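The gate metric can be sketched in a few lines. This is an illustration under assumptions: the real field names live in challenge-report-schema.json, which is not reproduced here, so the `severity` and `status` keys are hypothetical. It also assumes that both RESOLVED and DOCUMENTED count as resolved for the "all HIGH resolved" test, and that NOT_GENUINE challenges are excluded from it.

```python
RESOLVED_STATUSES = {"RESOLVED", "DOCUMENTED"}


def adversarial_gate(challenges, threshold=0.8):
    """Compute genuine/resolved counts, resolution_ratio, and gate_pass.

    Each challenge is a dict with hypothetical keys 'severity'
    (HIGH/MEDIUM/LOW) and 'status'.
    """
    # genuine = (HIGH + MEDIUM challenges) - NOT_GENUINE
    genuine = [
        c for c in challenges
        if c["severity"] in ("HIGH", "MEDIUM") and c["status"] != "NOT_GENUINE"
    ]
    # resolved = RESOLVED + DOCUMENTED, counted within the genuine set
    resolved = [c for c in genuine if c["status"] in RESOLVED_STATUSES]
    # If there are no genuine challenges, the ratio check passes by definition.
    ratio = len(resolved) / len(genuine) if genuine else 1.0
    all_high_resolved = all(
        c["status"] in RESOLVED_STATUSES
        for c in genuine
        if c["severity"] == "HIGH"
    )
    return {
        "genuine": len(genuine),
        "resolved": len(resolved),
        "resolution_ratio": ratio,
        "gate_pass": all_high_resolved and ratio >= threshold,
    }
```

With two resolved HIGH challenges, one documented MEDIUM, and one unresolved MEDIUM, the ratio is 3/4 = 0.75 and the gate fails even though every HIGH challenge is resolved.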
Dual Gate Arbitration (Ambiguity Score + Adversarial Gate)
| Ambiguity Score | Adversarial Gate | Decision |
|---|---|---|
| PASS (≤0.3) | PASS | Proceed |
| PASS (≤0.3) | FAIL | BLOCK |
| CONDITIONAL (0.3-0.5) | PASS | Proceed with WARNING |
| CONDITIONAL (0.3-0.5) | FAIL | BLOCK |
| BLOCK (>0.5) | PASS | BLOCK |
| BLOCK (>0.5) | FAIL | BLOCK |
Rule: Either gate can block; neither can unblock the other.
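The arbitration table reduces to a short function. The sketch below assumes the ambiguity verdict is passed as one of the strings "PASS", "CONDITIONAL", or "BLOCK" and the adversarial gate as a boolean; the function name `arbitrate` is hypothetical.

```python
def arbitrate(ambiguity_verdict, adversarial_pass):
    """Combine the two gates: either can block, neither can unblock the other."""
    if ambiguity_verdict not in ("PASS", "CONDITIONAL", "BLOCK"):
        raise ValueError(f"unknown ambiguity verdict: {ambiguity_verdict!r}")
    if ambiguity_verdict == "BLOCK" or not adversarial_pass:
        return "BLOCK"
    if ambiguity_verdict == "CONDITIONAL":
        return "Proceed with WARNING"
    return "Proceed"
```

Checking the blocking cases first mirrors the rule: only when neither gate blocks does the ambiguity verdict decide between a clean and a warned proceed.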
Severity Classification
| Severity | Criterion | Example |
|---|---|---|
| HIGH | Different RTL behavior | Signed vs unsigned arithmetic |
| HIGH | Different interface | 32-bit vs 64-bit datapath |
| MEDIUM | Different parameterization | Fixed depth vs configurable |
| MEDIUM | Different timing | 3-stage vs 4-stage pipeline |
| LOW | Cosmetic only | Block naming differences |
Boundary rule: alternative interpretation would cause different RTL module → HIGH. Same module but different parameters → MEDIUM. Same module, same parameters → LOW.
Pathological Patterns
- Zero challenges on >15 requirements → re-run with stronger adversarial framing
- >50% items at HIGH severity → spec fundamentally under-specified, escalate
- Challenge budget exceeded (>30) → rank by severity, return top 30
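These patterns can be detected mechanically once a challenge report exists. The sketch below rests on several assumptions: the action labels and the `severity` key are hypothetical, the HIGH-severity pattern is read as "more than 50% of the challenges are HIGH", and truncation keeps the original order within each severity level.

```python
SEVERITY_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
MAX_CHALLENGES = 30  # per-pass budget stated in the protocol


def check_patterns(challenges, requirement_count):
    """Return (actions, kept): follow-up action labels and the kept challenges."""
    actions = []

    # Zero challenges on more than 15 requirements is suspicious.
    if not challenges and requirement_count > 15:
        actions.append("RERUN_STRONGER_FRAMING")

    # Majority of challenges at HIGH suggests an under-specified spec.
    high = sum(1 for c in challenges if c["severity"] == "HIGH")
    if challenges and high / len(challenges) > 0.5:
        actions.append("ESCALATE_UNDER_SPECIFIED")

    # Over budget: rank by severity (stable sort) and keep the top 30.
    kept = challenges
    if len(challenges) > MAX_CHALLENGES:
        ranked = sorted(challenges, key=lambda c: SEVERITY_RANK[c["severity"]])
        kept = ranked[:MAX_CHALLENGES]
        actions.append("TRUNCATED_TO_BUDGET")

    return actions, kept
```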
Final Checklist
- docs/phase-1-research/iron-requirements.json exists and is valid JSON
- docs/phase-1-research/open-requirements.json exists and is valid JSON
- Every requirement has unique "id": "REQ-NNN" field
- docs/phase-1-research/io_definition.json exists and is valid JSON
- io_definition.json port names use i_/o_/io_ prefix (NOT suffix)
- io_definition.json clocks use {domain}_clk, resets use {domain}_rst_n
- docs/phase-1-research/timing_constraints.json exists with per-block timing targets (rough estimates)
- docs/phase-1-research/domain-analysis.md exists with cross-block dependency matrix and per-block timing targets
- No unresolved requirement conflicts
- Review coordinator (rtl-architect, or domain chief if available) declared Architecture-Ready (or gaps escalated)
- Self-verification verdict produced (PASS or REVIEW_NEEDED)
- Spec feature count vs iron-requirements.json + open-requirements.json count documented
- reviews/phase-1-research/research-review.md saved (consolidated)
- Per-round review artifacts saved: research-review-r1.md, r2.md, r3.md
- docs/phase-1-research/solution-tree.json exists (structured JSON)
- docs/phase-1-research/candidate-comparison.md exists
- docs/phase-1-research/selected-approach.md exists
- docs/phase-1-research/literature-survey.md exists
- Tree exploration used maximum parallel agents (8-20 leaf + cross-cutting)
- domain-consult invoked at least once
- Algorithm/tool candidates presented with trade-offs (NOT pre-selected)
- AskUserQuestion used at every ambiguity point (no unresolved assumptions)
- docs/phase-1-research/ambiguity-assessment.md saved with per-axis scores and overall ambiguity_score
- Ambiguity Gate passed (score ≤ 0.3 for PASS, 0.3–0.5 for CONDITIONAL PASS)
- Adversarial reinterpretation completed (Step 7.6)
- All HIGH challenges resolved or escalated
- resolution_ratio ≥ 0.8 (adversarial gate PASS)
- reviews/phase-1-research/stability-report.md saved
- Every iron requirement has measurable acceptance_criteria (no vague terms)
- Every iron requirement has "violation_policy": "user_escalation"
- Every open item has ≥ 2 candidates and evaluation_criteria
- File-level target_phase specified in open-requirements.json
- Iron/open classification verification passed (no FAIL conditions)