Explore AI Agent Skills & Claude Prompts

star 3

Use when executing validation gates 3A, 4A, or 6A-E, verifying phase completion with expo-mcp visual testing, or encountering test failures - automates gate execution with expo-mcp autonomous verification and HARD STOP enforcement

Updated 7 months ago

consensus-engine

star 2

Multi-validator agreement gate. Runs N independent validators against the same feature and synthesizes a confidence-scored verdict.

Updated 1 month ago

goal-condition-architect

Transform any input into a single transcript-provable /goal completion condition. ALWAYS use when the user says "make a /goal", "set a goal", "goal condition", "transcript-provable end state", "airtight finish line", "design completion criteria", or "turn this into an autonomous run". Produces the four-part anatomy (end-state, checks, constraints, bound) and runs an adversarial harden pass before handing off.

Updated 23 days ago

plan-author

Linear hierarchical plan author producing plan.md + phase-NN.md files. Plans-as-prompts framing: PLAN.md IS the prompt that executes the phase. ALWAYS use when the user says "plan this", "create plan", "implementation plan", "write a plan", "draft plan", or invokes /shannon:plan, /shannon:plan-author, /shannon:plan-deep, or /shannon:prd. Each phase has measurable transcript-provable success criteria and an embedded validation gate. Scope-atomic (2-3 tasks max per file).

Updated 23 days ago

playwright-validation

Use for web feature validation via Playwright MCP — real browser interactions, cross-browser support (Chromium/Firefox/WebKit), screenshot + DOM snapshot evidence capture, form testing, responsive layouts, console/network error detection. Reach for it when you've picked Playwright as your browser tool (vs Chrome DevTools MCP), or on phrases like 'Playwright validation', 'validate with Playwright', 'browser feature test', 'cross-browser check', 'take DOM snapshot', 'browser automation for validation'. For performance-focused Chrome inspection use chrome-devtools; for the overall web-validation flow use web-validation.

ios-validation-runner

Use for deep iOS validation of multi-step user flows where screenshots alone miss the timing — animations, loading states, state transitions, anything where the journey matters more than the endpoints. Runs a five-phase protocol (SETUP → RECORD → ACT → COLLECT → VERIFY) that captures video + logs in the background while you interact with the app, then analyzes everything together for a PASS/FAIL verdict. Reach for it when you need richer evidence than ios-validation-gate provides, when debugging state transitions, or when someone asks 'what actually happens between tap and result'.

consensus-engine

Use when a single-validator PASS is not enough confidence — high-stakes features (payments, auth, data migrations, security surfaces), pre-ship release gates, regression review on large refactors, flake hunting, and audit trails for regulated work. Spawns N (≥2, default 3) independent validator agents against the same journey list, each with its own isolated evidence subdirectory, then synthesizes their per-journey verdicts into a single consensus verdict with a confidence score (UNANIMOUS → HIGH, MAJORITY → MEDIUM, SPLIT → LOW). Disagreements trigger root-cause investigation before the final verdict is emitted. Reach for it on phrases like 'consensus validation', 'multi-agent verdict', 'get a second opinion', 'validate with N agents', 'pre-ship gate', 'confidence-scored verdict', 'agreement-based review', or when you want to catch flaky behavior with parallel independent runs. Not for coverage fan-out (use parallel-validation or forge-team); not without a validation plan (run create-validation-plan first); n

consensus-engine

Multi-validator agreement gate with 5-state synthesis and multi-round debate iteration. ALWAYS use when the user says "consensus validation", "validate with N reviewers", "agreement gate", "high-confidence validation", "consensus gate", or needs confidence-scored verdicts. Spawns ≥2 (default 3) independent validators in isolated evidence dirs, applies the 5-state synthesis table (UNANIMOUS_PASS / UNANIMOUS_FAIL / MAJORITY_PASS / MAJORITY_FAIL / SPLIT), and escalates SPLIT or borderline MAJORITY to up to 3 rounds of filesystem-mediated debate.

Updated 25 days ago
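
The five-state synthesis named in the entry above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names, the binary "PASS"/"FAIL" verdict strings, and the rule that SPLIT means an exact tie are all assumptions, since the listing names the states but does not define their thresholds. "Borderline MAJORITY" is left out because the listing does not say what counts as borderline.

```python
from collections import Counter


def synthesize(verdicts):
    """Map per-validator "PASS"/"FAIL" verdicts to one of five states.

    Assumption: SPLIT is an exact tie; the listing does not define it.
    """
    if len(verdicts) < 2:
        raise ValueError("consensus needs at least 2 validators")
    counts = Counter(verdicts)
    passes, fails = counts["PASS"], counts["FAIL"]
    if passes + fails != len(verdicts):
        raise ValueError("each verdict must be 'PASS' or 'FAIL'")
    if fails == 0:
        return "UNANIMOUS_PASS"
    if passes == 0:
        return "UNANIMOUS_FAIL"
    if passes > fails:
        return "MAJORITY_PASS"
    if fails > passes:
        return "MAJORITY_FAIL"
    return "SPLIT"


def needs_debate(state):
    """Only SPLIT escalates here; 'borderline MAJORITY' is not modeled."""
    return state == "SPLIT"
```

Note that with the default of three validators and strictly binary verdicts a tie cannot occur, so a real implementation presumably admits further verdict values or a different SPLIT rule.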

flutter-validation

Use for validating Flutter apps on Android emulators, iOS simulators, and connected physical devices. Runs the protocol: flutter doctor check, pub get, analyze, build (APK/AAB for Android, .app/.ipa for iOS), install on device/emulator, launch, screenshot captures at key states, log streaming via flutter logs for os_log/adb logcat, crash detection via error markers in the log stream. Pairs with e2e-validate for orchestration or runs standalone for Flutter-only projects. Reach for it on phrases like 'validate my Flutter app', 'flutter run check', 'test on simulator', 'dart/flutter test failed', 'pub get validation', or before any Flutter release.
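
As a rough illustration of the protocol order described above, the sketch below builds the sequence of Flutter CLI invocations without running them. The function name and its parameters are hypothetical. The launch and screenshot steps are omitted because the listing does not say which commands the skill uses for them.

```python
# Build targets that correspond to the artifacts named in the listing.
BUILD_TARGETS = {"apk", "appbundle", "ios", "ipa"}


def flutter_validation_commands(build_target, device_id):
    """Return the CLI steps in protocol order as argument lists.

    Hypothetical helper: it only assembles commands, it does not execute them.
    """
    if build_target not in BUILD_TARGETS:
        raise ValueError(f"unsupported build target: {build_target}")
    return [
        ["flutter", "doctor"],                     # environment check
        ["flutter", "pub", "get"],                 # resolve dependencies
        ["flutter", "analyze"],                    # static analysis
        ["flutter", "build", build_target],        # produce the artifact
        ["flutter", "install", "-d", device_id],   # install on device/emulator
        ["flutter", "logs", "-d", device_id],      # stream logs for crash markers
    ]
```

Each list can be passed to `subprocess.run` by a caller that wants to stop at the first non-zero exit code.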

team-validation-dashboard

Use for organization-wide visibility across multiple projects' validation health — not for a single project (use forge-benchmark for that). Aggregates posture scores, coverage %, regression trends, and journey ownership across all registered projects in a team's portfolio. Flags critical projects (score < 60) for attention. Reach for it on phrases like 'team dashboard', 'show validation across all projects', 'which projects need attention', 'quarterly validation review', 'who owns journey X', or for CI/CD reporting in multi-project orgs.
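
The critical-project rule above (score < 60) can be illustrated with a small filter. The data shape, a mapping from project name to posture score, and the function name are assumptions made for the example; the listing does not describe how the dashboard stores its data.

```python
CRITICAL_THRESHOLD = 60  # from the listing: score < 60 is flagged as critical


def critical_projects(scores):
    """Return the names of projects scoring below the threshold, sorted."""
    return sorted(
        name for name, score in scores.items() if score < CRITICAL_THRESHOLD
    )
```

The comparison is strict, so a project scoring exactly 60 is not flagged.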

consensus-synthesis

Synthesize N per-validator verdicts into a single consensus verdict with confidence scoring based on agreement level.
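
Confidence scoring by agreement level might look like the sketch below, applied per journey. The UNANIMOUS → HIGH, MAJORITY → MEDIUM, SPLIT → LOW mapping comes from the consensus-engine listing. The function names, the input shape, and the strict-majority threshold are assumptions.

```python
from collections import Counter

# Mapping taken from the consensus-engine listing.
CONFIDENCE = {"UNANIMOUS": "HIGH", "MAJORITY": "MEDIUM", "SPLIT": "LOW"}


def agreement_level(verdicts):
    """Classify how strongly the validators agree on one journey."""
    if not verdicts:
        raise ValueError("at least one verdict is required")
    top_count = Counter(verdicts).most_common(1)[0][1]
    if top_count == len(verdicts):
        return "UNANIMOUS"
    if top_count * 2 > len(verdicts):  # assumed: strict majority
        return "MAJORITY"
    return "SPLIT"


def score_journeys(journey_verdicts):
    """Map each journey name to a confidence label."""
    return {
        journey: CONFIDENCE[agreement_level(verdicts)]
        for journey, verdicts in journey_verdicts.items()
    }
```

For example, a journey with verdicts `["PASS", "FAIL", "PASS"]` is classified as MAJORITY and scored MEDIUM.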