name: execute description: > Full execution protocol for MODE: EXECUTE -- task execution, coder retry handling, QA gates, completion evidence, and per-task closure.
Execute Protocol
This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
MODE: EXECUTE
For each task (respecting dependencies):
RETRY PROTOCOL — when returning to coder after any gate failure:
- Provide structured rejection: "GATE FAILED: [gate name] | REASON: [details] | REQUIRED FIX: [specific action required]"
- Re-enter at step 5b (the active swarm's coder agent) with full failure context
- Resume execution at the failed step (do not restart from 5a) Exception: if coder modified files outside the original task scope, restart from step 5c
- Gates already PASSED may be skipped on retry if their input files are unchanged
- Print "Resuming at step [5X] after coder retry [N/configured QA retry limit]" before re-executing
GATE FAILURE RESPONSE RULES — when ANY gate returns a failure: You MUST return to the active swarm's coder agent. You MUST NOT fix the code yourself.
WRONG responses to gate failure: ✗ Editing the file yourself to fix the syntax error ✗ Running a tool to auto-fix and moving on without coder ✗ "Installing" or "configuring" tools to work around the failure ✗ Treating the failure as an environment issue and proceeding ✗ Deciding the failure is a false positive and skipping the gate
RIGHT response to gate failure:
✓ Print "GATE FAILED: [gate name] | REASON: [details]"
✓ BEFORE the retry delegation: call declare_scope with the file list the retry will touch. Re-declare even if the files are identical to the original task — retry scope persists per-call, not per-task. See Rule 1a.
✓ Delegate to the active swarm's coder agent with:
TASK: Fix [gate name] failure
FILE: [affected file(s)]
INPUT: [exact error output from the gate]
CONSTRAINT: Fix ONLY the reported issue, do not modify other code
✓ After coder returns, re-run the failed gate from the step that failed
✓ Print "Coder attempt [N/configured QA retry limit] on task [X.Y]"
The ONLY exception: lint tool in fix mode (step 5g) auto-corrects by design. All other gates: failure → return to coder. No self-fixes. No workarounds.
5a. UI DESIGN GATE (conditional — Rule 9): If task matches UI trigger → the active swarm's designer agent produces scaffold → pass scaffold to coder as INPUT. If no match → skip.
→ After step 5a (or immediately if no UI task applies): Call update_task_status with status in_progress for the current task. Then proceed to step 5b.
5a-bis. DARK MATTER CO-CHANGE DETECTION: After declaring scope but BEFORE finalizing the task file list, call knowledge_recall with query hidden-coupling primaryFile where primaryFile is the first file in the task's FILE list. Extract primaryFile from the task's FILE list (first file = primary). If results found, add those files to the task's AFFECTS scope with a BLAST RADIUS note. If no results or knowledge_recall unavailable, proceed gracefully without adding files. This is advisory — the architect may exclude files from scope if they are unrelated to the current task. Delegate to the active swarm's coder agent only after scope is declared.
5b-PRE (required): Call declare_scope({ taskId, files }) with the EXACT file list for this task — including any co-change files surfaced by 5a-bis. Skipping this call will cause every coder write to be BLOCKED by scope-guard. No declare_scope → no 5b delegation. See Rule 1a.
5b-BASE (required, once per task): Call sast_scan with { capture_baseline: true, phase: <N>, changed_files: <files from 5b-PRE> } where <N> is the current phase number (extract from current task ID: task "3.2" → phase 3, task "1.5" → phase 1). The tool maintains .swarm/evidence/{phase}/sast-baseline.json as a phase-scoped, incrementally merged baseline of pre-existing SAST findings. Calling twice for the same files is safe (idempotent merge). Do NOT re-capture mid-task.
→ REQUIRED: Print "sast-baseline: [WRITTEN — N fingerprints | MERGED — N fingerprints | SKIPPED — gate disabled | ERROR — details]"
→ Subsequent pre_check_batch calls with phase: <N> will automatically diff against this baseline — only NEW findings (not in baseline) drive the fail verdict.
5b. the active swarm's coder agent - Implement (if designer scaffold produced, include it as INPUT).
5c. Run diff tool. If hasContractChanges → the active swarm's explorer agent integration analysis. If COMPATIBILITY SIGNALS=INCOMPATIBLE or MIGRATION_SURFACE=yes → coder retry. If COMPATIBILITY SIGNALS=COMPATIBLE and MIGRATION_SURFACE=no → proceed.
→ REQUIRED: Print "diff: [PASS | CONTRACT CHANGE — details]"
5d. Run syntax_check tool. SYNTACTIC ERRORS → return to coder. NO ERRORS → proceed to placeholder_scan.
→ REQUIRED: Print "syntaxcheck: [PASS | FAIL — N errors]"
5e. Run placeholder_scan tool. PLACEHOLDER FINDINGS → return to coder. NO FINDINGS → proceed to imports.
→ REQUIRED: Print "placeholderscan: [PASS | FAIL — N findings]"
5f. Run imports tool for dependency audit. ISSUES → return to coder.
→ REQUIRED: Print "imports: [PASS | ISSUES — details]"
5g. Run lint tool with fix mode for auto-fixes. If issues remain → run lint tool with check mode. FAIL → return to coder.
→ REQUIRED: Print "lint: [PASS | FAIL — details]"
5h. Run build_check tool. BUILD FAILS → return to coder. SUCCESS → proceed to pre_check_batch.
→ REQUIRED: Print "buildcheck: [PASS | FAIL | SKIPPED — no toolchain]"
5i. Run pre_check_batch tool with phase: <N> (same phase number used in 5b-BASE) → runs four verification tools in parallel (max 4 concurrent):
- lint:check (code quality verification)
- secretscan (secret detection)
- sast_scan (static security analysis — diffs against phase baseline when phase provided)
- quality_budget (maintainability metrics)
→ Returns { gates_passed, lint, secretscan, sast_scan, quality_budget, total_duration_ms }
→ sast_scan result may include { new_findings, pre_existing_findings, baseline_used } when baseline diff is active.
→ If ALL FOUR tools have ran === false (lint.ran === false && secretscan.ran === false && sast_scan.ran === false && quality_budget.ran === false):
→ This is a SKIP - no tools actually ran. Print "pre_check_batch: SKIP — all tools ran===false (no files to check or tools not available)" and proceed to the active swarm's reviewer agent.
→ Else if gates_passed === false: read individual tool results, identify which tool(s) failed, return structured rejection to the active swarm's coder agent with specific tool failures. Do NOT call the active swarm's reviewer agent.
→ If gates_passed === true AND sast_preexisting_findings is present: proceed to the active swarm's reviewer agent. Include the pre-existing SAST findings in the reviewer delegation context with instruction: "SAST TRIAGE REQUIRED: The following SAST findings existed before this task began (from phase baseline or unchanged lines). Verify these are acceptable pre-existing conditions and do not interact with the new changes." Do NOT return to coder for pre-existing findings.
→ If gates_passed === true (no sast_preexisting_findings): proceed to the active swarm's reviewer agent.
→ REQUIRED: Print "pre_check_batch: [PASS — all gates passed | PASS — pre-existing SAST findings (N findings, reviewer triage) | FAIL — [gate]: [details]]"
⚠️ pre_check_batch SCOPE BOUNDARY: pre_check_batch runs FOUR automated tools: lint:check, secretscan, sast_scan, quality_budget. pre_check_batch does NOT run and does NOT replace:
the active swarm's reviewer agent (logic review, correctness, edge cases, maintainability)
the active swarm's reviewer agent security-only pass (OWASP evaluation, auth/crypto review)
the active swarm's test_engineer agent verification tests (functional correctness)
the active swarm's test_engineer agent adversarial tests (attack vectors, boundary violations)
diff tool (contract change detection)
placeholder_scan (TODO/stub detection)
imports (dependency audit) gates_passed: true means "automated static checks passed." It does NOT mean "code is reviewed." It does NOT mean "code is tested." After pre_check_batch passes, you MUST STILL delegate to the active swarm's reviewer agent. Treating pre_check_batch as a substitute for the active swarm's reviewer agent is a PROCESS VIOLATION.
5j-COUNCIL (when council_mode is ON — replaces steps 5j through 5l): When
council_modeis enabled in the QA gate profile, Stage B (steps 5j-5l: reviewer + test_engineer) is REPLACED by the full 5-member council per task.After Stage A (pre_check_batch) passes:
- Ensure
declare_council_criteriawas called for this task (prerequisite). - Dispatch all 5 council members (critic, reviewer, sme, test_engineer, explorer) in PARALLEL with task-scoped context.
- Collect all 5 verdict objects. Do NOT fabricate or substitute verdicts.
- Call
submit_council_verdictswith the collected verdicts. - Act on the verdict: APPROVE → task passes. CONCERNS with
success: false+reason: 'blocking_concerns_unresolved'→ HIGH/CRITICAL findings are blocking, no evidence written, return to coder with requiredFixes and re-council after fixes. CONCERNS withsuccess: true→ only MEDIUM/LOW advisory findings, task passes. REJECT → return to coder with requiredFixes.
When
council_modeis OFF, the standard Stage B flow (steps 5j-5l: reviewer + test_engineer) runs as normal.5j. the active swarm's reviewer agent - General review. REJECTED before the configured QA retry limit → coder retry. REJECTED at the configured QA retry limit → escalate. → REQUIRED: Print "reviewer: [APPROVED | REJECTED — reason]" 5k. Security gate: if change matches TIER 3 criteria OR content contains SECURITY_KEYWORDS OR secretscan has ANY findings OR sast_scan has ANY findings at or above threshold → MUST delegate the active swarm's reviewer agent security-only review. REJECTED before the configured QA retry limit → coder retry. REJECTED at the configured QA retry limit → escalate to user. → REQUIRED: Print "security-reviewer: [TRIGGERED | NOT TRIGGERED — reason]" → If TRIGGERED: Print "security-reviewer: [APPROVED | REJECTED — reason]" 5l. the active swarm's test_engineer agent - Verification tests. FAIL → coder retry from 5g. → REQUIRED: Print "testengineer-verification: [PASS N/N | FAIL — details]" 5l-bis. REGRESSION SWEEP (automatic after test_engineer-verification PASS): Run test_runner with { scope: "graph", files: [
] }. scope:"graph" traces imports to discover test files beyond the task's own tests that may be affected by this change. Outcomes (based on test_runner result.outcome field):
- outcome: "pass" → All tests passed. Print "regression-sweep: PASS [N additional tests, M files]"
- outcome: "regression" → Tests ran but some failed. Print "regression-sweep: FAIL — REGRESSION DETECTED in [files]. The failing tests are CORRECT — fix the source code, not the tests." Return to coder with retry from 5g.
- outcome: "skip" → No test files resolved (nothing to run). Print "regression-sweep: SKIPPED — no related tests beyond task scope"
- outcome: "scope_exceeded" → Too many files for graph scope. Print "regression-sweep: SKIPPED — broad scope, no related tests beyond task scope"
- outcome: "error" → Tool error (timeout, no framework, etc.). Print "regression-sweep: SKIPPED — test_runner error" and continue pipeline.
IMPORTANT: The regression sweep runs test_runner DIRECTLY (architect calls the tool). Do NOT delegate to test_engineer for this — the test_engineer's EXECUTION BOUNDARY restricts it to its own test files. The architect has unrestricted test_runner access. → REQUIRED: Print "regression-sweep: [PASS | FAIL — REGRESSION DETECTED | SKIPPED — no related tests | SKIPPED — broad scope | SKIPPED — test_runner error]"
5l-ter. TEST DRIFT CHECK (conditional): Run this step if the change involves any drift-prone area:
- Command/CLI behavior changed (shell command wrappers, CLI interfaces)
- Parsing or routing logic changed (argument parsing, route matching, file resolution)
- User-visible output changed (formatted output, error messages, JSON response structure)
- Public contracts or schemas changed (API types, tool argument schemas, return types)
- Assertion-heavy areas where output strings are tested (command/help output tests, error message tests)
- Helper behavior or lifecycle semantics changed (state machines, lifecycle hooks, initialization)
If NOT triggered: Print "test-drift: NOT TRIGGERED — no drift-prone change detected" If TRIGGERED:
- Use grep/search to find test files that cover the affected functionality
- Run those tests via test_runner with scope:"convention" on the related test files
- If any FAIL → print "test-drift: DRIFT DETECTED in [N] tests" and escalate to reviewer/test_engineer
- If all PASS → print "test-drift: [N] related tests verified"
- If no related tests found → print "test-drift: NO RELATED TESTS FOUND" (not a failure) → REQUIRED: Print "test-drift: [TRIGGERED | NOT TRIGGERED — reason]" and "[DRIFT DETECTED in N tests | N related tests verified | NO RELATED TESTS FOUND | NOT TRIGGERED]"
5n. TODO SCAN (advisory): Call todo_extract with paths=[list of files changed in this task]. If any results have priority HIGH → print "todo-scan: WARN — N high-priority TODOs in changed files: [list of TODO texts]". If no high-priority results → print "todo-scan: CLEAN". This is advisory only and does NOT block the pipeline. → REQUIRED: Print "todo-scan: [WARN — N high-priority TODOs | CLEAN]"
5m. ADVERSARIAL TEST STEP (config-specific): Use the rendered adversarial-test instruction from the MODE: EXECUTE architect stub. If the stub omits step 5m, skip this step. 5n. COVERAGE CHECK: If the active swarm's test_engineer agent reports coverage < 70% → delegate the active swarm's test_engineer agent for an additional test pass targeting uncovered paths. This is a soft guideline; use judgment for trivial tasks.
- Ensure
PRE-COMMIT RULE — Before ANY commit or push: You MUST answer YES to ALL of the following: [ ] Did the active swarm's reviewer agent run and return APPROVED? (not "I reviewed it" — the agent must have run) [ ] Did the active swarm's test_engineer agent run and return PASS? (not "the code looks correct" — the agent must have run) [ ] Did pre_check_batch run with gates_passed true? [ ] Did the diff step run? [ ] Did regression-sweep run (or SKIP with no related tests or test_runner error)? [ ] Did test-drift check run (or NOT TRIGGERED)?
If ANY box is unchecked: DO NOT COMMIT. Return to step 5b. There is no override. A commit without a completed QA gate is a workflow violation.
ROLE-BOUNDARY CHANGE VALIDATION (mandatory for prompt changes)
When a task modifies agent prompts (especially explorer, reviewer, critic, or any agent involved in the mapper/validator/challenge hierarchy), add an explicit test validation step:
- If new prompt contract tests exist (e.g., explorer-role-boundary.test.ts, explorer-consumer-contract.test.ts): Run them via test_runner
- If no specific tests exist for the changed prompt: Run test_runner with scope "convention" on the changed file
- Verify the new tests pass before completing the task
This step supplements (not replaces) the existing regression-sweep and test-drift checks. It exists to catch prompt contract regressions that automated gates might miss.
5o. ⛔ TASK COMPLETION GATE — You MUST print this checklist with filled values before marking ✓ in .swarm/plan.md: [TOOL] diff: PASS / SKIP — value: ___ [TOOL] syntax_check: PASS — value: ___ [TOOL] placeholder_scan: PASS — value: ___ [TOOL] imports: PASS — value: ___ [TOOL] lint: PASS — value: ___ [TOOL] build_check: PASS / SKIPPED — value: ___ [TOOL] pre_check_batch: PASS (lint:check ✓ secretscan ✓ sast_scan ✓ quality_budget ✓) — value: ___ [GATE] reviewer: APPROVED — value: ___ [GATE] reuse_re_verification: VERIFIED / SKIPPED / DUPLICATION_DETECTED — value: ___ [GATE] security-reviewer: APPROVED / SKIPPED — value: ___ [GATE] test_engineer-verification: PASS — value: ___ [GATE] regression-sweep: PASS / SKIPPED — value: ___ [GATE] test-drift: TRIGGERED / NOT TRIGGERED — value: ___ [GATE] test_engineer-adversarial: use the rendered checklist entry from the MODE: EXECUTE architect stub [GATE] coverage: ≥70% / soft-skip — value: ___
You MUST NOT mark a task complete without printing this checklist with filled values. You MUST NOT fill "PASS" or "APPROVED" for a gate you did not actually run — that is fabrication. Any blank "value: ___" field = gate was not run = task is NOT complete. Filling this checklist from memory ("I think I ran it") is INVALID. Each value must come from actual tool/agent output in this session.
5p. Call update_task_status with status "completed".
5q. OPTIONAL TASK-COMPLETION COMMIT POLICY: read `.swarm/context.md`.
- If `## Task Completion Commit Policy` contains `commit_after_each_completed_task: true`, immediately call:
`checkpoint save task-<task-id>-complete`
- If the section is absent or false, skip this step.
- This optional commit policy NEVER bypasses PRE-COMMIT RULE checks above.
- If checkpoint save fails with "duplicate label", the task was already checkpointed from a prior completion or retry. Silently skip — the existing checkpoint is valid.
5r. Proceed to next task.