name: issue-tracer
description: Evidence-first issue and bug investigation for Codex. Use when asked to trace, investigate, root-cause, plan, fix, close, or prepare a PR for a GitHub issue, bug report, regression, failing test, or confusing runtime behavior. Drives intake, reproduction, localization, critic review, implementation, validation, invariant-aware PR closure, and no-gap evidence capture.
Issue Tracer
Overview
Use this skill to take a GitHub issue or bug report from intake to a verified fix or a reviewed plan. Preserve evidence over polish: reproduce before localizing, localize before fixing, and validate the runtime path before declaring closure.
This is the Codex-native version. Use Codex tools and workflow defaults:
- Use shell_command for repository commands, rg, git, gh, tests, builds, and local validation.
- Use apply_patch for manual file edits.
- Use update_plan for phase tracking on substantial work.
- Use the GitHub app or gh for issue and PR metadata when available.
- Use web only for current external framework/API behavior, advisories, or release notes; cite URLs for external claims.
- Use parallel reads/searches when independent files or subsystems can be inspected safely.
Mode Selection
Infer the mode from the user request and newest instructions.
- plan-only: trace, reproduce/localize where possible, run critic review, and stop with a reviewed plan.
- plan-then-approval: produce a reviewed plan and wait for explicit approval before production-code edits.
- approved implementation: if the user already asked to fix or implement, continue through reproduction, localization, minimal patch, validation, and PR-ready summary.
- high-risk: require approval before edits when the fix is destructive, broad, breaking, migration-heavy, or depends on unavailable secrets/data.
- review-followup: if the user pastes PR review feedback, treat each finding as a claim to verify against the current branch or live PR head before editing. Classify items as confirmed, disproved, pre-existing, or unverified, and patch only the confirmed gaps.
Do not force a blocking approval gate for ordinary Codex implementation work when the user already asked for the fix. Do force it for plan-only requests, high-risk work, destructive operations, or explicit user instructions.
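The gating rule above can be read as a small decision function. The following is a minimal sketch, not part of the skill or of any Codex API: the Mode enum, needs_approval_before_edits, and its parameters are hypothetical names chosen for illustration.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical labels mirroring the modes listed above."""
    PLAN_ONLY = "plan-only"
    PLAN_THEN_APPROVAL = "plan-then-approval"
    APPROVED_IMPLEMENTATION = "approved implementation"
    HIGH_RISK = "high-risk"
    REVIEW_FOLLOWUP = "review-followup"


def needs_approval_before_edits(
    mode: Mode,
    destructive: bool = False,
    user_requires_gate: bool = False,
) -> bool:
    """Return True when production-code edits must wait for explicit approval."""
    # Destructive operations and explicit user instructions always force the gate.
    if destructive or user_requires_gate:
        return True
    # Plan-oriented and high-risk modes stop for approval; ordinary
    # implementation and review follow-up work proceed without one.
    return mode in {Mode.PLAN_ONLY, Mode.PLAN_THEN_APPROVAL, Mode.HIGH_RISK}
```

Under these assumptions, an ordinary "fix this" request maps to Mode.APPROVED_IMPLEMENTATION and proceeds without a blocking gate unless the fix turns out to be destructive.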
Repo Contract
For opencode-swarm, read the repo contract before meaningful work:
- AGENTS.md
- docs/engineering-invariants.md when touching relevant invariants
- .opencode/skills/writing-tests/SKILL.md before writing or modifying tests
- .opencode/skills/engineering-conventions/SKILL.md before architecture, plugin init, subprocess, tool registration, plan durability, .swarm storage, runtime portability, session/global state, guardrails/retry, chat/system hook, or release/cache work
- .agents/skills/commit-pr/SKILL.md before commit, push, or PR creation; use .claude/skills/commit-pr/SKILL.md as the underlying repo protocol it points to
If .Codex/session/swarm-mode.md exists, read it before complex work and follow its quality gates.
Every PR for this repo must include an invariant audit for touched areas. Evidence must be concrete: commands, test output, source inspection, or grep results.
Trace Artifacts
For deep issue tracing, create a resumable trace directory:
.Codex/issue-traces/<issue-id-or-slug>/
|-- 01-issue-summary.md
|-- 02-reproduction.md
|-- 03-localization-log.md
|-- 04-root-cause.md
|-- 05-fix-plan.md
|-- 06-critic-review.md
|-- 07-approved-plan.md
|-- 08-test-results.md
|-- 08b-implementation-review.md
|-- 09-final-critic.md
|-- 10-pr-body.md
`-- state.md
For small fixes, a compact in-thread evidence trail is acceptable unless the user requests trace artifacts, context may be long-running, or the issue is ambiguous/high-risk. When artifacts are used, update state.md at phase boundaries with phase, completed gates, active hypothesis, selected fix, risks, and next action.
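As an illustration of the trace layout and the state.md update described above, here is a minimal sketch. The artifact file names come from the tree; the helper names (scaffold, write_state) and the one-line-per-field state.md format are assumptions, and the real templates live in references/evidence-artifacts.md.

```python
from pathlib import Path

# Artifact file names copied from the tree above (state.md is written separately).
ARTIFACTS = [
    "01-issue-summary.md", "02-reproduction.md", "03-localization-log.md",
    "04-root-cause.md", "05-fix-plan.md", "06-critic-review.md",
    "07-approved-plan.md", "08-test-results.md", "08b-implementation-review.md",
    "09-final-critic.md", "10-pr-body.md",
]

# Fields the text says to record at phase boundaries.
STATE_FIELDS = [
    "phase", "completed gates", "active hypothesis",
    "selected fix", "risks", "next action",
]


def scaffold(root: Path, slug: str) -> Path:
    """Create the trace directory with empty artifact files, keeping existing ones."""
    trace_dir = root / ".Codex" / "issue-traces" / slug
    trace_dir.mkdir(parents=True, exist_ok=True)
    for name in ARTIFACTS:
        (trace_dir / name).touch(exist_ok=True)
    return trace_dir


def write_state(trace_dir: Path, **fields: str) -> Path:
    """Overwrite state.md with one 'field: value' line per tracked field.

    Keyword names use underscores (next_action); missing fields become 'TBD'.
    The line format is an assumption, not the skill's template.
    """
    lines = [
        f"{name}: {fields.get(name.replace(' ', '_'), 'TBD')}"
        for name in STATE_FIELDS
    ]
    state = trace_dir / "state.md"
    state.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return state
```

Calling write_state at each phase boundary keeps the trace resumable: a later session can read state.md first and pick up from the recorded next action.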
Read the phase reference before using it:
- references/evidence-artifacts.md for artifact templates
- references/localization-playbook.md for root-cause localization
- references/critic-gate.md for independent or fallback critic review
- assets/pr-template.md for PR-ready closure text
Phase 0: Setup
- Parse the issue URL, number, PR link, failing command, or bug description.
- Identify repo root, branch, remotes, and worktree safety with git status --short.
- Inspect project instructions, manifests, test configs, CI configs, and relevant local skills.
- Create trace artifacts when warranted.
- Start or update an update_plan checklist for substantial work.
Gate: proceed only when the target, repo state, and applicable instructions are known or the missing inputs are documented.
Phase 1: Intake and Reproduction
- Fetch issue metadata with the GitHub app or:
gh issue view <id> --comments --json number,title,body,author,labels,state,comments,createdAt,updatedAt,url
- Read linked PRs, commits, logs, screenshots, discussions, and external docs referenced by the issue.
- If the input includes PR review feedback, refresh the live PR head or active branch before trusting any pasted claim.
- Extract observed behavior, expected behavior, exact errors, reproduction steps, environment, acceptance criteria, and ambiguities.
- Discover verification commands from actual repo files, not memory.
- Reproduce with the smallest faithful command or scenario.
- If direct reproduction is impossible, create or describe a minimal failing test/script/checklist that targets the reported behavior.
Gate: continue only when the issue is reproduced, a faithful failing regression exists, or non-reproducibility is documented with missing inputs.
Phase 2: Localization
Use references/localization-playbook.md.
- Build candidate locations from traces, failing tests, UI/API/CLI names, labels, linked PRs, and recent commits.
- Search with rg and read every referenced file before making claims.
- Follow call chains from entry point to failure and backward from failure to origin.
- Track hypotheses, evidence for/against, ruled-out paths, and files read.
- For each surviving candidate, write a one-paragraph bug-specific explanation of why that exact symbol/line could produce the symptom under the triggering conditions, and rank candidates by causal explanation strength plus direct evidence (trace/test agreement, data-flow reachability, recent diffs). Surface similarity ("this file looks related") is not a ranking.
- Stop only when root cause is localized to file, symbol, and line/condition, with broken contract and triggering conditions; do not propose a patch until the fault is justified at the line/condition level.
- For high-risk faults (security, isolation, IPC, auth, data integrity, concurrency) or when the top two candidates are close, run a second independent localization pass that does not read the first pass's conclusion, then reconcile before choosing.
Gate: at least two plausible hypotheses are considered unless the trace uniquely identifies the fault; each retained candidate has a written bug-specific explanation; selected root cause is localized to the line/condition level with direct code and command/test evidence.
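The ranking rule in this phase (written explanation required, then causal strength plus direct evidence) can be sketched as follows. This is an illustration under stated assumptions: the Candidate fields, the 0-3 explanation_strength score assigned by the investigator, and the additive scoring are all hypothetical, and references/localization-playbook.md remains the authority on the actual procedure.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    location: str                  # file, symbol, and line/condition
    explanation: str               # bug-specific causal paragraph; empty if none written
    explanation_strength: int = 0  # assumed 0-3 score assigned by the investigator
    evidence: list[str] = field(default_factory=list)  # e.g. "trace", "failing-test"


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Drop candidates with no written explanation, then sort strongest first.

    The score is explanation strength plus the number of distinct direct
    evidence kinds; surface similarity never contributes.
    """
    retained = [c for c in candidates if c.explanation.strip()]
    return sorted(
        retained,
        key=lambda c: c.explanation_strength + len(set(c.evidence)),
        reverse=True,
    )
```

A candidate that merely "looks related" has no explanation text and is filtered out before ranking, which matches the rule that surface similarity is not a ranking.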
Phase 3: Plan and Critic Review
Use references/critic-gate.md.
- Generate candidate fixes when realistic; for trivial defects include the selected fix and at least one rejected alternative.
- Rank by correctness, minimality, regression risk, API compatibility, architectural fit, testability, and rollback simplicity.
- Analyze callers/importers, config, docs, UI/API/CLI paths, persistence, concurrency, retry, cancellation, security, and privacy where relevant.
- Write a fix plan with exact files, functions, test plan, unwired-functionality checklist, risks, and rollback.
- Run an independent critic only when a separate subagent/delegation mechanism is available and the user/session has authorized it.
- If no independent critic is available, run the full fallback self-critic and label it exactly: Fallback self-critic: independent critic unavailable.
- Revise until critic blockers are resolved or explicitly escalated.
- High-risk or close-call fixes: draft 2-3 concrete candidate patches and choose between them by which makes the reproduction test pass while keeping the regression suite green and the diff minimal; on a tie, prefer the smallest contract-preserving patch and record why the alternatives were rejected. Select a patch on evidence, not first-draft intuition.
Gate: implementation may begin only when mode permits it and critic blockers are resolved. In plan-only or plan-then-approval, stop for user approval.
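For the high-risk or close-call case, the selection rule (reproduction test passes, regression suite stays green, diff minimal, ties broken toward the contract-preserving patch) can be sketched like this. PatchResult, its fields, and the use of changed-line count as the minimality measure are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatchResult:
    name: str
    repro_passes: bool       # reproduction test passes with the patch applied
    regression_green: bool   # regression suite stays green
    diff_lines: int          # changed lines, used as a rough minimality measure
    preserves_contract: bool = True


def select_patch(results: list[PatchResult]) -> Optional[PatchResult]:
    """Return the smallest viable patch, or None when no candidate is viable."""
    viable = [r for r in results if r.repro_passes and r.regression_green]
    if not viable:
        return None
    # Smaller diffs win; on equal size, a contract-preserving patch wins.
    return min(viable, key=lambda r: (r.diff_lines, not r.preserves_contract))
```

Returning None rather than the "least bad" candidate keeps the choice evidence-based: if nothing passes both checks, the plan goes back to revision instead of forward to implementation.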
Phase 4: Implementation
- Re-check git status --short.
- Protect unrelated user changes. Do not revert or overwrite them.
- Write/update the failing regression first when feasible, and confirm it fails for the expected reason.
- Apply the minimal fix with apply_patch.
- Re-read changed files and verify all runtime entry points are wired.
- Run focused regression tests, impacted tests, and repo-required checks. For opencode-swarm, use shell commands for repo validation; do not use broad OpenCode test_runner scopes.
- When broad local suites are noisy, host-specific, or plausibly pre-existing, compare the failing path against a clean origin/main worktree and document the result. Use remote CI as the final cross-platform publish signal when local host behavior is not authoritative.
- Record commands, exit codes, and meaningful output. Every "passed"/"validated" claim must cite the exact command and its captured output; never assert success you did not observe.
Gate: changed behavior matches the reviewed plan or the deviation is documented; regression protection exists or infeasibility is justified; impacted checks pass with commands and output recorded, or unrelated failures are proven on clean origin/main; a written correctness justification explains why the patch fixes the root cause and not merely the test (plausible != correct).
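One way to make "commands, exit codes, and meaningful output" concrete is to capture them in a single record at the moment the command runs. The sketch below uses only the Python standard library; the Evidence class, run_and_record, and the rendered entry format are hypothetical, and the stand-in commands would be replaced by the repo's own test commands.

```python
import shlex
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    command: list[str]
    exit_code: int
    output: str

    def as_entry(self) -> str:
        """Render a plain-text entry for pasting into 08-test-results.md (assumed format)."""
        return (
            f"command: {shlex.join(self.command)}\n"
            f"exit code: {self.exit_code}\n"
            f"output:\n{self.output}"
        )


def run_and_record(command: list[str]) -> Evidence:
    """Run a command and capture its exit code plus combined stdout/stderr."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr so failures are not lost
        text=True,
    )
    return Evidence(command, proc.returncode, proc.stdout)


# Stand-in commands for illustration only.
ok = run_and_record([sys.executable, "-c", "print('1 passed')"])
bad = run_and_record([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Because the record is built from the observed process result, a "passed" claim can only be written when exit_code is actually 0 and the output is on hand to cite.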
Phase 4.5: Independent Implementation Review
Have a fresh, independent context try to refute the implemented patch before it is presented as done. This challenges the actual diff and its evidence; it is distinct from the Phase 3 plan critic. The context that wrote the patch must not be the only one that approves it.
- If a separate subagent/delegation mechanism is available and authorized, run the reviewer with references/critic-gate.md (Implementation Review section), given only the diff, 08-test-results.md, and the trace artifacts.
- Otherwise run the fallback adversarial self-review in a clean pass and label it: Fallback self-review: independent reviewer unavailable.
- The reviewer's mandate is adversarial: find a concrete input/environment/caller/sequence where the patch is wrong, incomplete, overfits the regression test, leaves a runtime path unwired, or regresses a contract, verifying against real code and captured output, not the implementer's narrative.
- Record the verdict (APPROVE/NEEDS_REVISION/BLOCKED) and responses in 08b-implementation-review.md; resolve every blocker with a code or evidence change, then re-review. For high-risk work (security, isolation, IPC, auth, payments, migrations, data integrity), this review is mandatory before closure, consistent with ../commit-pr/SKILL.md Step 9.
- If subagent delegation is available and the user/session has authorized issue-tracer or swarm work, independent implementation review is mandatory for any code, test, docs, package metadata, release note, or skill-file edit. Fallback self-review is allowed only when no independent context is available, and that limitation must be disclosed.
- Any edit after reviewer approval invalidates that approval. Re-run the review on the latest diff and evidence.
Gate: 08b-implementation-review.md exists with a verdict; the review ran on the real diff and captured evidence; every blocker is resolved or explicitly escalated; reviewer unavailability is disclosed if it occurred; the latest edit happened before the latest reviewer approval.
Phase 4.6: Final Critic Gate
After implementation review approval, have a separate critic challenge the whole completion claim: current diff, validation evidence, implementation-review artifact, docs/release/package claims, and no-gap checklist.
- If subagent delegation is available, launch a critic with references/critic-gate.md (Final Critic section), giving it the current diff, 08-test-results.md, 08b-implementation-review.md, and trace artifacts.
- If no independent critic is available, run the fallback adversarial critic pass and label it: Fallback final critic: independent critic unavailable.
- Write 09-final-critic.md with verdict APPROVE, NEEDS_REVISION, or BLOCKED.
- Resolve every NEEDS_REVISION/BLOCKED item by changing code, docs, tests, or evidence, then re-run implementation review when the fix changes the diff and re-run final critic.
- Any edit after final critic approval invalidates that approval.
Gate: 09-final-critic.md exists with verdict APPROVE; the critic reviewed the latest diff after implementation reviewer approval; every reviewer/critic blocker is resolved and re-reviewed; no edit occurred after the latest reviewer and critic approvals.
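The ordering requirement in this gate (last edit, then reviewer approval, then final critic approval, with nothing edited afterwards) can be checked mechanically. The sketch below assumes a hypothetical event log with monotonically increasing sequence numbers; the Event type and the kind labels are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str   # assumed labels: "edit", "review_approve", "critic_approve"
    order: int  # monotonically increasing sequence number


def approvals_current(events: list[Event]) -> bool:
    """True when the latest review approval follows the latest edit and the
    latest critic approval follows that review approval."""

    def last(kind: str) -> int:
        # -1 means the event kind never occurred.
        return max((e.order for e in events if e.kind == kind), default=-1)

    return last("edit") < last("review_approve") < last("critic_approve")
```

Any later edit raises the latest edit's sequence number above both approvals, so the check fails until implementation review and final critic are re-run in that order.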
Phase 5: Closure
- Inspect git diff --stat, git diff, and git diff --check.
- Verify no unrelated files changed.
- Draft PR text with assets/pr-template.md.
- Include root cause with file/line references, change summary, tests/checks, regression coverage, invariant audit evidence for touched areas, and residual risks.
- If this is review follow-up work, refresh the existing PR body and validation summary when pass counts, caveats, or invariant evidence changed.
- Publication is governed by the single source of truth, .claude/skills/commit-pr/SKILL.md (via the .agents/skills/commit-pr/SKILL.md adapter). When the user explicitly asks to commit, push, or open/update a PR, switch to that skill and follow it for the PR title, the PR body contract (Closes #, ## Summary, ## Invariant audit, ## Test plan), the release fragment, the invariant audit, the issue comment, and CI closeout. assets/pr-template.md is a drafting aid; the published PR must satisfy the commit-pr contract, which the pr-standards check enforces.
No-Gap Checklist
- Reported symptom is reproduced or non-reproducibility is proven.
- Root cause is localized to exact code and triggering conditions.
- Fix addresses the root cause, not only the visible symptom.
- Every changed path is wired into the actual runtime path.
- Public API, CLI, UI, persistence, config, docs, and generated surfaces are checked where relevant.
- Positive, negative, boundary, and adversarial cases are covered or explicitly ruled out.
- Regression test fails before the fix and passes after the fix when feasible.
- Impacted tests and quality checks are run.
- Suspected pre-existing or host-specific failures are compared against clean origin/main or explicitly documented as unverified.
- Plan critic review completed before approval or implementation gate.
- Independent implementation review (Phase 4.5) completed on the real diff and evidence; blockers resolved.
- Final critic review (Phase 4.6) approved the latest diff and evidence after implementation review.
- No edit occurred after the latest reviewer and critic approvals.
- A written correctness justification distinguishes "tests green" from "root cause fixed."
- Every "passed"/"validated" claim cites the exact command and its captured output.
- Publication (commit/push/PR) followed .claude/skills/commit-pr/SKILL.md (the single source of truth).
- PR-ready summary is complete.
Method Provenance (state of the art)
These methods are grounded in current agentic-repair and agent-reliability research: hierarchical file -> function -> line localization, multi-sample candidate patches, and validate-then-select repair (Agentless, https://arxiv.org/abs/2407.01489); reasoning-guided, explanation-ranked localization (RGFL, https://arxiv.org/pdf/2601.18044; AutoCodeRover, https://arxiv.org/abs/2404.05427); "tests passing is plausible, not correct" / patch overfitting (https://dl.acm.org/doi/10.1145/3702972); self-consistency across independent passes (https://arxiv.org/abs/2203.11171); refutation of the result by a fresh independent context, plus evidence-grounded reporting (Anthropic, https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents); and plan -> implement -> review separation (Anthropic, https://www.anthropic.com/research/building-effective-agents).