name: ultra-reasoning-operator
description: >
  Scope-calibrated ultra-rigor workflow for hard reasoning, high-risk code changes, architecture decisions, debugging with multiple plausible root causes, security-sensitive work, and user requests like "ultra reasoning", "think harder", "verify everything", "adversarial review", "war room", "deep check", "paranoid verifier", or "no hallucinations". Use to force evidence-first planning, assumption tracking, hypothesis falsification, adversarial self-review, verification gates, and clear uncertainty without overloading trivial tasks.
Ultra Reasoning Operator
Overview
Turn vague requests for "more reasoning" into a concrete, evidence-backed operating loop. Use this as a rigor overlay on top of domain skills; do not treat it as a replacement for task-specific knowledge.
Core Rule
Calibrate rigor to risk.
| Task class | Use this level |
|---|---|
| Trivial edit or known-answer question | Answer directly; verify only the touched artifact if applicable. |
| Normal implementation or bug fix | Plan, make the smallest safe change, run focused verification, report evidence. |
| Ambiguous, high-risk, architectural, security, data, infra, or repeated-failure task | Run the full operator loop plus verification and adversarial references. |
Do not apply maximum ceremony to simple work. Over-orchestration is a rigor failure.
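The table can be read as a small decision function. The following is a minimal sketch, not part of the skill itself: the `Rigor` names, the tag labels, and the rule that a high-risk tag overrides a "trivial" flag are all assumptions made for illustration.

```python
from enum import Enum


class Rigor(Enum):
    DIRECT = "answer directly; verify only the touched artifact"
    FOCUSED = "plan, smallest safe change, focused verification, report evidence"
    FULL = "full operator loop plus verification and adversarial references"


# Hypothetical labels for the triggers named in the third table row.
HIGH_RISK_TAGS = {
    "ambiguous", "high-risk", "architectural", "security",
    "data", "infra", "repeated-failure",
}


def calibrate(tags: set[str], trivial: bool = False) -> Rigor:
    """Pick a rigor level for a task.

    Assumption: any high-risk tag wins over the `trivial` flag, so a
    one-line edit to security-sensitive code still gets the full loop.
    """
    if tags & HIGH_RISK_TAGS:
        return Rigor.FULL
    if trivial:
        return Rigor.DIRECT
    return Rigor.FOCUSED
```

Checking the high-risk tags first keeps the function from under-calibrating a small but dangerous change; everything else defaults to the middle row rather than the top one.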
Operator Loop
- Restate the target outcome in one sentence.
- Separate facts, assumptions, inferences, and unknowns.
- Decide whether missing information is blocking. Ask only when a wrong assumption would change the product direction, cross a safety boundary, or make the result non-reversible.
- Gather evidence from the closest source first: repo files, tests, logs, official docs, upstream source, then broader web sources when current external behavior matters.
- Generate competing hypotheses or approaches when the answer is not obvious.
- Define falsifiers: what evidence would disprove each hypothesis or make an approach unacceptable?
- Plan with explicit acceptance criteria and the commands or inspections that will prove them.
- Act in small reversible steps. Preserve existing user work and avoid speculative refactors.
- Verify against the acceptance criteria. Replace confidence with evidence.
- Run adversarial review when risk is material, then fix supported findings before finalizing.
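
One way to keep the loop's bookkeeping honest is a small ledger that separates facts, assumptions, inferences, and unknowns, and pairs every hypothesis with its falsifier. This is an illustrative sketch only: `Ledger`, `Hypothesis`, and the checkout scenario are hypothetical and do not come from the skill.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    claim: str
    falsifier: str           # evidence that would disprove the claim
    falsified: bool = False  # set to True once that evidence is observed


@dataclass
class Ledger:
    target: str  # the target outcome, restated in one sentence
    facts: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    inferences: list[str] = field(default_factory=list)
    unknowns: list[str] = field(default_factory=list)
    hypotheses: list[Hypothesis] = field(default_factory=list)

    def surviving(self) -> list[Hypothesis]:
        """Return hypotheses whose falsifier has not been observed."""
        return [h for h in self.hypotheses if not h.falsified]


# Hypothetical scenario, for illustration only.
ledger = Ledger(target="Checkout requests stop timing out under load.")
ledger.unknowns.append("Whether the timeouts predate the last deploy.")
ledger.hypotheses += [
    Hypothesis(
        claim="The connection pool is exhausted",
        falsifier="Pool metrics show idle connections during timeouts",
    ),
    Hypothesis(
        claim="A slow query was introduced",
        falsifier="Query latency is unchanged across the deploy",
    ),
]
# Suppose inspection shows query latency did not change.
ledger.hypotheses[1].falsified = True
```

Requiring a `falsifier` at construction time makes it impossible to record a hypothesis without also stating what would disprove it.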
Reference Loading
MANDATORY - READ references/verification-gates.md when the task includes code changes beyond a one-line edit, bug fixing, refactoring, data migration, performance claims, security claims, or any final statement that something is fixed, working, safe, or verified.
MANDATORY - READ references/adversarial-review.md before committing to an architecture, choosing between non-obvious approaches, overriding a test or reviewer signal, finalizing a high-risk change, or when the reasoning feels too easy.
Do not load either reference for pure Q&A, copy edits, trivial formatting, or tasks where the user explicitly requested brevity.
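The loading rules above can be sketched as a lookup. The trigger labels below are hypothetical shorthand for the conditions listed in the text, and letting the skip conditions take precedence over the mandatory triggers is one reading of the rules, assumed here for illustration.

```python
VERIFICATION_GATES = "references/verification-gates.md"
ADVERSARIAL_REVIEW = "references/adversarial-review.md"

# Hypothetical labels; the skill describes these conditions in prose.
SKIP = {"pure-qa", "copy-edit", "trivial-formatting", "brevity-requested"}
GATE_TRIGGERS = {
    "multi-line-code-change", "bug-fix", "refactor", "data-migration",
    "performance-claim", "security-claim", "claims-fixed-or-verified",
}
REVIEW_TRIGGERS = {
    "architecture-commit", "non-obvious-choice", "override-test-or-reviewer",
    "high-risk-finalize", "reasoning-feels-too-easy",
}


def references_to_load(tags: set[str]) -> list[str]:
    """Return the reference files to read for a task described by `tags`."""
    if tags & SKIP:  # assumption: skip conditions take precedence
        return []
    refs = []
    if tags & GATE_TRIGGERS:
        refs.append(VERIFICATION_GATES)
    if tags & REVIEW_TRIGGERS:
        refs.append(ADVERSARIAL_REVIEW)
    return refs
```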
Composition Rules
- If the user explicitly activates APEX, autonomous mode, or another binding contract, follow that activation protocol. This skill can inform rigor but must not silently activate or override that contract.
- Use domain skills for domain mechanics first. Use this skill to enforce reasoning discipline, evidence mapping, and review depth.
- Use `devils-advocate` before a high-impact choice when available.
- Use `reasoning-verifier` after complex reasoning when available.
- Use a council or external model only when one model's blind spot could plausibly cost serious rework, safety, security, money, or user trust.
Anti-Patterns
- Do not say "verified everything"; name the exact checks.
- Do not trust the first plausible explanation; try to falsify it.
- Do not browse by default when repo evidence or tests answer the question.
- Do not skip browsing when the answer depends on current external facts.
- Do not claim commands, tests, benchmarks, docs, or web research were run unless they were actually run.
- Do not lower acceptance criteria because the task is taking longer than expected.
- Do not bury uncertainty. Mark it as unresolved and explain the next verification step.
- Do not expose private chain-of-thought. Report assumptions, evidence, decisions, and trade-offs.
- Do not expand scope to "improve" adjacent code unless it is required for the requested outcome.
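
Naming the exact checks is easier when each check is recorded at the moment it runs. The sketch below is one possible approach, using only the standard library's `subprocess.run`; `CheckResult` and `run_check` are hypothetical names, not part of the skill.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    command: list[str]
    returncode: int
    output_tail: str  # last few lines of combined stdout and stderr

    def summary(self) -> str:
        if self.returncode == 0:
            status = "passed"
        else:
            status = f"failed (exit {self.returncode})"
        return f"{' '.join(self.command)}: {status}"


def run_check(command: list[str]) -> CheckResult:
    """Run a command and record what actually happened, not what was expected."""
    proc = subprocess.run(command, capture_output=True, text=True)
    tail = (proc.stdout + proc.stderr).strip().splitlines()[-5:]
    return CheckResult(command, proc.returncode, "\n".join(tail))


# Example: a trivial check that runs the current Python interpreter.
result = run_check([sys.executable, "-c", "print('ok')"])
```

Because a `CheckResult` can only be produced by running the command, a report built from these records cannot claim a check that never ran.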
Output Contract
For substantial tasks, structure the response around:
- Assumptions and unknowns
- Plan and acceptance criteria
- Decision or implementation summary
- Verification evidence
- Remaining risks or unverified items
For code reviews, lead with findings ordered by severity, with file and line references when available. For tiny tasks, keep the response short and include only the verification that matters.
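For substantial tasks, the five-part structure can be rendered from a plain mapping. This is a minimal sketch under stated assumptions: `render_report`, the "None recorded." placeholder, and the sample evidence line are hypothetical and serve only to illustrate the shape.

```python
SECTIONS = [
    "Assumptions and unknowns",
    "Plan and acceptance criteria",
    "Decision or implementation summary",
    "Verification evidence",
    "Remaining risks or unverified items",
]


def render_report(content: dict[str, list[str]]) -> str:
    """Render the five sections in order.

    Assumption: an empty section is stated explicitly rather than omitted,
    so missing evidence stays visible to the reader.
    """
    lines = []
    for title in SECTIONS:
        lines.append(title)
        items = content.get(title) or ["None recorded."]
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)


# Hypothetical content, for illustration only.
report = render_report({
    "Verification evidence": ["Ran the focused unit tests for the touched module."],
})
```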