name: pantheon-issue-resolve
description: "Implementation + review + verify + fix loop. Receives an approved solution design via GitHub comment link, implements the fix, hunts P0/P1 bugs, triages findings, and iterates until no blocking issues remain."
Pantheon Issue Resolve
Overview
A rigorous implementation and review workflow that takes an approved solution design and drives it to a merge-ready PR. The workflow consists of:
- Implement → Execute the approved design, create/update PR
- Review → P0/P1 bug hunt + CI check on the changes
- Verify → Triage findings and scope decisions
- Fix Loop → Iterate until no in-scope blockers remain
This skill receives its design from pantheon-solution-design, which posts the final design as a GitHub issue comment.
Golden Rule
Don't patch, don't workaround, don't add legacy/fallback code. Design a clean architecture. Solve the task thoroughly.
This rule applies to ALL prompts in this skill.
Prerequisites
Terminology
- Pantheon branch: A long-running sandbox environment (takes hours to complete)
- PR: GitHub Pull Request
- P0/P1: Critical/high-severity issues (see definitions below)
- In-scope: Issues that must be fixed in this PR before merge
- Deferred: Valid issues tracked in separate GitHub issues
- Design comment: The GitHub issue comment containing the approved solution design (posted by pantheon-solution-design)
Waiting/Polling Mechanism
After every parallel_explore call, you must wait for completion:
- Poll `functions.mcp__pantheon__get_branch(branch_id)` until status is one of:
  - Terminal states: `failed`, `succeed`, `finished`
  - Output-ready states: `manifesting`, `ready_for_manifest`
- When status is terminal or output-ready, call `functions.mcp__pantheon__branch_output(branch_id, full_output=true)` to retrieve results.
- If status is not ready, sleep for 600 seconds (10 minutes), then poll again.
  - Critical: Do NOT start overlapping sleeps or other tool calls during the wait
  - No background sleeps (`sleep ... &`)
  - Treat "Waiting for background terminal" messages as exclusive waits
Note: `manifesting` and `ready_for_manifest` mean the run is complete; you can fetch output and proceed (no need to wait for `succeed`/`finished`).
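The polling rule above can be sketched as a single blocking loop. This is a minimal illustration, not the orchestrator's actual code: `wait_for_branch` and the injected `get_status` callable are hypothetical names, and `get_status` stands in for a thin wrapper around `functions.mcp__pantheon__get_branch` that returns only the status string.

```python
import time
from typing import Callable

# States come from the section above; every other name here is hypothetical.
READY_STATES = {"failed", "succeed", "finished", "manifesting", "ready_for_manifest"}
POLL_INTERVAL_SECONDS = 600  # 10 minutes, as specified above


def wait_for_branch(
    branch_id: str,
    get_status: Callable[[str], str],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Block until the branch reaches a terminal or output-ready state."""
    while True:
        status = get_status(branch_id)
        if status in READY_STATES:
            return status
        # Exactly one foreground sleep at a time: no overlapping waits,
        # no background sleeps, no other tool calls in between.
        sleep(POLL_INTERVAL_SECONDS)
```

Injecting `sleep` keeps the loop testable without waiting ten minutes; in real use the default `time.sleep` applies.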
Constraints
- Always use `num_branches=1` with `parallel_explore`
- One issue, one active exploration, one PR – Never start a second exploration while the first is running
- Pantheon branches are long-running (hours) – Don't treat them as quick tasks; be patient
- Read branch_output before any decision – Impatience creates duplicate PRs and wasted work
P0/P1 Standard (Evidence-Backed)
All severity claims must include code-causal evidence + reachability analysis + blast radius.
P0 (Critical): Reachable under default production config, and causes:
- Production unavailability, OR
- Severe data loss/corruption, OR
- Security vulnerability, OR
- Primary workflow completely blocked with no practical workaround
P1 (High): Reachable in realistic production scenarios (default or common configs), and:
- Significantly impairs core/major functionality, OR
- Violates user-facing contracts (correctness errors), OR
- Severe performance regression impacting usability
- Workaround may exist but is costly/risky/high-friction
Evidence bar: Borderline P1/P2 defaults to P1 unless impact is clearly narrow or edge-case only.
Inputs & Setup
Parse the task_description to extract:
Entry point (exactly ONE of):
- `issue_link`: GitHub issue URL or identifier → set `entry_mode = "new_issue"`
- `existing_pr_link`: GitHub PR URL or number → set `entry_mode = "existing_pr"`
Design input:
- `design_comment_link`: URL of the GitHub comment containing the final approved/consensus design (posted by `pantheon-solution-design`)
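The Step 1 prompts fetch the design through `{design_comment_api_url}`, which has to be derived from `design_comment_link`. A possible derivation is sketched below. It assumes the link has the usual github.com shape ending in `#issuecomment-<id>` and maps it to the REST path `repos/{owner}/{repo}/issues/comments/{id}`, which `gh api` accepts. The function name is hypothetical.

```python
import re

# Assumed link shape: https://github.com/<owner>/<repo>/issues/<n>#issuecomment-<id>
# (comments on pull requests use /pull/<n> but share the same comment endpoint).
_COMMENT_LINK = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/"
    r"(?:issues|pull)/\d+#issuecomment-(?P<comment_id>\d+)$"
)


def design_comment_api_path(design_comment_link: str) -> str:
    """Turn a comment permalink into a path usable with `gh api <path> --jq .body`."""
    match = _COMMENT_LINK.match(design_comment_link.strip())
    if match is None:
        raise ValueError(f"not an issue comment link: {design_comment_link!r}")
    return "repos/{owner}/{repo}/issues/comments/{comment_id}".format(
        **match.groupdict()
    )
```

Rejecting links that do not match, rather than guessing, keeps a malformed input from silently fetching the wrong comment.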
Required Pantheon context:
- `parent_branch_id`: Starting Pantheon branch ID (sandbox baseline)
Pre-workflow setup:
If entry_mode = "existing_pr":
- Extract PR metadata: `gh pr view {existing_pr_link} --json number,url,headRefName,title,body`
- Set: `pr_number`, `pr_url`, `pr_head_branch`
- Try to extract issue link from PR body/title (look for "Fixes #123", "Closes https://...")
- If no issue found: set `issue_link = existing_pr_link`
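The issue-link extraction and its fallback can be sketched as follows. The keyword list (fix/close/resolve and their inflections) is an assumption modeled on GitHub's closing keywords; the skill itself only names "Fixes" and "Closes" as examples. Both function names are hypothetical.

```python
import re
from typing import Optional

# Matches e.g. "Fixes #123" or "Closes https://github.com/<owner>/<repo>/issues/7".
_CLOSING_REF = re.compile(
    r"\b(?:fix(?:es|ed)?|close[sd]?|resolve[sd]?)\s+"
    r"(?P<ref>#\d+|https://github\.com/[^\s/]+/[^\s/]+/issues/\d+)",
    re.IGNORECASE,
)


def extract_issue_link(pr_text: str) -> Optional[str]:
    """Return the first closing reference found in the text, or None."""
    match = _CLOSING_REF.search(pr_text)
    return match.group("ref") if match else None


def resolve_issue_link(pr_title: str, pr_body: str, existing_pr_link: str) -> str:
    """Apply the fallback above: use the PR link itself when no issue is referenced."""
    return extract_issue_link(f"{pr_title}\n{pr_body}") or existing_pr_link
```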
Initialize workflow variables:
# Entry tracking
entry_mode = "new_issue" or "existing_pr"
issue_link = <extracted from task>
existing_pr_link = <extracted or None>
design_comment_link = <extracted from task>
# Branch tracking
baseline_branch_id = parent_branch_id
last_fix_branch_id = None # Set after Step 1
# PR tracking (pre-filled if existing_pr, otherwise set after Step 1)
pr_number = <extracted or None>
pr_url = <extracted or None>
pr_head_branch = <extracted or None>
# Metrics
review_cycle_count = 0 # Increment each time Step 4.1 executes
Workflow Observability
The orchestrator (you, following this skill) must post a status comment on the PR (or issue, if PR doesn't exist yet) after every step completes. This ensures the workflow is fully traceable from the thread alone.
Status comment format:
gh pr comment {pr_number} --body-file - <<'STATUS_EOF'
<!-- pantheon-resolve-status:step-{step_name}:cycle-{review_cycle_count} -->
**Pantheon Resolve Workflow — {step_name}** (Cycle {review_cycle_count})
- **Branch ID**: `{branch_id}`
- **Agent**: {agent_type}
- **Outcome**: {outcome summary}
- **Next**: {what happens next}
STATUS_EOF
For Step 1 (before PR exists in new_issue mode): post on the issue instead, then switch to the PR for subsequent steps.
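Choosing where a status comment goes, and building its hidden marker, can be sketched as below. The helper names are hypothetical. The sketch assumes that `gh issue comment` accepts an issue URL or number together with `--body-file -`, mirroring the `gh pr comment` call shown above.

```python
from typing import List, Optional


def status_marker(step_name: str, review_cycle_count: int) -> str:
    """Hidden HTML marker identifying one status comment (format from the template above)."""
    return f"<!-- pantheon-resolve-status:step-{step_name}:cycle-{review_cycle_count} -->"


def status_comment_command(pr_number: Optional[int], issue_link: str) -> List[str]:
    """Return the gh argv to pipe the comment body into.

    Before a PR exists (Step 1 in new_issue mode) the comment goes on the issue;
    afterwards it goes on the PR.
    """
    if pr_number is not None:
        return ["gh", "pr", "comment", str(pr_number), "--body-file", "-"]
    return ["gh", "issue", "comment", issue_link, "--body-file", "-"]
```

Returning an argv list rather than a shell string avoids quoting problems when the command is later run with `subprocess.run(argv, input=body, text=True)`.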
Workflow
Step 1: Implement Solution
Purpose: Read the approved design from the GitHub comment, then implement with expert workflow discipline.
Call: functions.mcp__pantheon__parallel_explore
- agent: "claude_code"
- num_branches: 1
- parent_branch_id: baseline_branch_id
Construct prompt based on entry_mode:
If entry_mode = "new_issue":
=== GOLDEN RULE ===
Don't patch, don't workaround, don't add legacy/fallback code. Design a clean architecture. Solve the task thoroughly.
=== PHASE 1: READ THE APPROVED DESIGN ===
Fetch the approved solution design from the GitHub issue comment:
gh api "{design_comment_api_url}" --jq .body
(where {design_comment_api_url} is the API URL derived from {design_comment_link})
Read and internalize the full design — root cause, approach, files to change, edge cases, test strategy, and risks.
=== PHASE 2: SYNC CODE ===
Pull the latest code from the master branch.
=== PHASE 3: IMPLEMENT ===
WORKING METHOD (CRITICAL):
Do not rush to write code.
First, make sure you fully understand the issue and the approved design.
Then, design the best implementation strategy based on the solution design.
You are a big fan of Linus: use KISS (accurate, rigorous, concise) to find the best simple, robust, minimal-change strategy.
Only after the strategy is clear should you implement.
If new constraints appear during implementation, pause and re-evaluate the strategy before continuing.
Implement the solution following the approved design:
1. Follow the expert method above and the approved solution design
2. Apply KISS principle: accurate, rigorous, and concise
3. Self-review your diff (correctness, edge cases, compatibility)
4. Run the smallest relevant tests/build to verify basic functionality
5. Create a NEW PR:
- Use `gh pr create`
- Include a clear title and description referencing {issue_link}
6. Handle GitHub CLI auth:
- If `gh` is unauthorized (token expired/invalid), retry once
- If still unauthorized: commit locally, keep the branch, and output GH_AUTH_EXPIRED mode
=== OUTPUT ===
Output exactly ONE of the following modes:
--- Success mode ---
IMPLEMENTATION_SUCCESS
PR_URL=<url>
PR_NUMBER=<number>
PR_HEAD_BRANCH=<branch>
--- GH auth expired mode ---
GH_AUTH_EXPIRED
LOCAL_COMMIT=<sha>
RETRY_PUSH_BRANCH=<branch>
--- Implementation blocked mode (use ONLY if design is fundamentally flawed) ---
IMPLEMENTATION_BLOCKED
REASON=<why the design doesn't work; requires re-design>
If entry_mode = "existing_pr":
=== GOLDEN RULE ===
Don't patch, don't workaround, don't add legacy/fallback code. Design a clean architecture. Solve the task thoroughly.
=== PHASE 1: READ THE APPROVED DESIGN ===
Fetch the approved solution design from the GitHub issue comment:
gh api "{design_comment_api_url}" --jq .body
Read and internalize the full design — root cause, approach, files to change, edge cases, test strategy, and risks.
=== PHASE 2: SYNC CODE ===
Checkout the existing PR branch: {pr_head_branch}
=== PHASE 3: IMPLEMENT ===
WORKING METHOD (CRITICAL):
Do not rush to write code.
First, understand the current PR and the approved design for improvements.
Then, design the best implementation strategy.
You are a big fan of Linus: use KISS (accurate, rigorous, concise) to find the best simple, robust path.
Only after the strategy is clear should you implement.
If new constraints appear during implementation, pause and re-evaluate the strategy before continuing.
Implement the improvements following the approved design:
1. Follow the expert method above and the approved solution design
2. Apply KISS principle: accurate, rigorous, and concise
3. Self-review your diff (correctness, edge cases, compatibility)
4. Run the smallest relevant tests/build to verify basic functionality
5. PR Management:
- Checkout existing PR branch: `gh pr checkout {pr_number}` (or `git checkout {pr_head_branch}`)
- Commit your changes
- Push to existing branch: `git push`
6. Handle GitHub CLI auth:
- If `gh` is unauthorized (token expired/invalid), retry once
- If still unauthorized: commit locally, keep the branch, and output GH_AUTH_EXPIRED mode
=== OUTPUT ===
Output exactly ONE of the following modes:
--- Success mode ---
IMPLEMENTATION_SUCCESS
PR_URL=<url>
PR_NUMBER=<number>
PR_HEAD_BRANCH=<branch>
--- GH auth expired mode ---
GH_AUTH_EXPIRED
LOCAL_COMMIT=<sha>
RETRY_PUSH_BRANCH={pr_head_branch}
--- Implementation blocked mode (use ONLY if design is fundamentally flawed) ---
IMPLEMENTATION_BLOCKED
REASON=<why the design doesn't work; requires re-design>
Wait for completion, then:
- Parse output
- Set `last_fix_branch_id = <branch_id from Step 1>`
- Post orchestrator status comment (on issue if `new_issue` and no PR yet, on PR if available):
  - Step: `implement`
  - Branch ID: `last_fix_branch_id`
  - Agent: `claude_code`
  - Outcome: `IMPLEMENTATION_SUCCESS` / `GH_AUTH_EXPIRED` / `IMPLEMENTATION_BLOCKED`
  - Next: what follows
- If `IMPLEMENTATION_SUCCESS`:
  - Extract and store `pr_url`, `pr_number`, `pr_head_branch`
  - Proceed to Step 2
- If `GH_AUTH_EXPIRED`:
  - Start ONE recovery exploration from `last_fix_branch_id`:
    Do NOT change code. Use existing local commits only.
    Push branch {retry_push_branch} and create/update PR using gh.
    If a PR for this branch already exists, reuse it.
    Output:
    IMPLEMENTATION_SUCCESS
    PR_URL=<url>
    PR_NUMBER=<number>
    PR_HEAD_BRANCH=<branch>
  - Wait for completion, extract PR info, proceed to Step 2
- If `IMPLEMENTATION_BLOCKED`:
  - Log the reason
  - Output to orchestrator: IMPLEMENTATION_BLOCKED REASON=<reason>
  - The orchestrator should re-run the design skill with the blocker as additional context
Step 2: Review (P0/P1 Bug Hunt + CI Check)
Purpose: Rigorously review the PR changes for critical issues and verify CI status.
Call: functions.mcp__pantheon__parallel_explore
- agent: "codex"
- num_branches: 1
- parent_branch_id: last_fix_branch_id
Prompt:
=== GOLDEN RULE ===
Don't patch, don't workaround, don't add legacy/fallback code. Design a clean architecture. Solve the task thoroughly.
Review PR #{pr_number} (related to: {issue_link}) with scientific rigor.
=== REVIEW PRINCIPLES ===
1. Treat this like a scientific investigation: read as much as needed, explain what the code actually does (don't guess)
2. Only accept a P0/P1 finding when code evidence + reachability justify it
3. Default stance: assume no critical issues unless proven otherwise
=== REVIEW CHECKLIST ===
1. Correctness: Does the fix address the root cause? Are there logical errors?
2. Edge cases: Are boundary conditions handled? What about error paths?
3. Compatibility: Does this break existing behavior or APIs?
4. Regressions: Could this introduce new bugs in other code paths?
5. Performance: Are there performance implications?
6. Security: Are there security vulnerabilities (injection, XSS, auth bypass, etc.)?
7. Testing: If the original issue was a failing test, rerun that test on the current PR HEAD to verify the fix
=== CI STATUS CHECK ===
Check the CI status of the PR:
gh pr checks {pr_number} --watch --fail-fast
If CI is still running, wait for it to complete. Report any CI failures as findings:
- If a CI failure is caused by the PR changes: report as P0/P1 finding with evidence
- If a CI failure is a known flaky test or pre-existing failure: note it but do NOT report as P0/P1
=== OUTPUT ===
If you find any P0 or P1 issues (including CI failures caused by PR changes), output:
P0_P1_FINDINGS
BEGIN_P0_P1_FINDINGS
<list each finding with:>
- Severity: P0 or P1
- Description: What is the issue?
- Code evidence: Exact file:line and why it's a problem
- Reachability: How is this triggered? (default config / common config / edge case)
- Blast radius: What is the impact? (availability / correctness / data / security / performance)
END_P0_P1_FINDINGS
If no P0/P1 issues found, output exactly:
NO_P0_P1
IMPORTANT: Do NOT post PR comments or create GitHub issues in this step.
Wait for completion, then:
- Parse output
- Do NOT update `last_fix_branch_id` (Review is read-only)
- Post orchestrator status comment on the PR:
  - Step: `review`
  - Branch ID: the review branch
  - Agent: `codex`
  - Outcome: `NO_P0_P1` or `P0_P1_FINDINGS` (with count of findings)
  - CI status: pass/fail/not-checked
  - Next: what follows
- If `NO_P0_P1`:
  - Post workflow completion comment (see "Workflow Completion Comment" below)
  - Workflow complete (no blocking issues found)
- If `P0_P1_FINDINGS`: Extract findings, proceed to Step 3
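Extracting the findings block from the review output might be done as in this sketch. The marker strings come from the Step 2 prompt; the function names are hypothetical, and the choice to raise on unrecognized output is an assumption.

```python
from typing import List, Optional


def extract_block(output: str, begin: str, end: str) -> str:
    """Return the lines strictly between the `begin` and `end` marker lines."""
    lines: List[str] = output.splitlines()
    stripped = [line.strip() for line in lines]
    try:
        start = stripped.index(begin)
        stop = stripped.index(end, start + 1)
    except ValueError:
        raise ValueError(f"missing {begin} / {end} markers") from None
    return "\n".join(lines[start + 1 : stop])


def parse_review_output(output: str) -> Optional[str]:
    """Return the findings text, or None when the review reported NO_P0_P1."""
    stripped = {line.strip() for line in output.splitlines()}
    if "P0_P1_FINDINGS" in stripped:
        return extract_block(output, "BEGIN_P0_P1_FINDINGS", "END_P0_P1_FINDINGS")
    if "NO_P0_P1" in stripped:
        return None
    raise ValueError("review output contains neither P0_P1_FINDINGS nor NO_P0_P1")
```

The returned text is what gets pasted verbatim into the Step 3 prompt's REVIEW FINDINGS input.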
Step 3: Verify (Triage & Scope)
Purpose: Validate findings and decide what must be fixed in this PR vs deferred.
Call: functions.mcp__pantheon__parallel_explore
- agent: "codex"
- num_branches: 1
- parent_branch_id: last_fix_branch_id
Prompt:
=== GOLDEN RULE ===
Don't patch, don't workaround, don't add legacy/fallback code. Design a clean architecture. Solve the task thoroughly.
Verify the P0/P1 findings from the review for PR #{pr_number}.
=== INPUT: REVIEW FINDINGS ===
{paste entire content between BEGIN_P0_P1_FINDINGS and END_P0_P1_FINDINGS from Step 2}
=== VERIFICATION PRINCIPLES ===
Default stance: Each finding may be a misread, misunderstanding, or edge case—unless code evidence forces you to accept it.
Read as much as needed. Treat analysis like a scientific experiment: explain what the code does (don't guess), challenge assumptions, confront gaps in understanding.
=== TRIAGE PROCESS ===
For EACH finding, perform the following triage:
1. **Validity**: Confirm it is real on current PR HEAD
- Is the issue actually present in the code?
- Is it reachable in realistic scenarios?
- Is the severity assessment correct?
2. **Origin**: Best-effort determination
- Introduced by this PR? (check git diff)
- Pre-existing on master? (check base branch)
3. **Fix Difficulty & Risk**:
- Difficulty: S (small) / M (medium) / L (large)
- Risk: low / med / high
4. **Scope Decision** (choose exactly ONE per finding):
- **FIX_IN_THIS_PR**: Must be fixed before merge
- Issue is introduced by this PR, OR
- Merging makes things worse, OR
- P0/P1 that must be addressed now
- **DEFER_CREATE_ISSUE**: Valid but does NOT block this PR
- Not introduced by this PR and merge doesn't worsen it, OR
- Fix is large/risky and better handled separately
- **INVALID_OR_ALREADY_FIXED**: Not a real issue
- Not valid, not reachable, not actually P0/P1, OR
- Already fixed by current HEAD, OR
- Duplicate of existing issue
=== GITHUB ISSUE CREATION ===
For every finding marked DEFER_CREATE_ISSUE:
1. Extract repo name:
REPO=$(gh pr view {pr_number} --json baseRepository --jq .baseRepository.nameWithOwner)
2. Search for existing issues (avoid duplicates):
gh issue list -R "$REPO" --search "<keywords> in:title,body state:open" --limit 10
3. If matching open issue exists:
- Do NOT create new issue
- Optionally add comment with new evidence + link to PR #{pr_number}
- Use existing issue URL
4. If no match, create new issue:
gh issue create -R "$REPO" --title "<title>" --body "<body>"
Body must include: code evidence, repro steps, impact, link to PR #{pr_number}
=== PR COMMENT (IDEMPOTENT) ===
Post ONE summary comment on the PR (idempotent per PR HEAD SHA):
1. Get PR HEAD SHA:
HEAD_SHA=$(gh pr view {pr_number} --json headRefOid --jq .headRefOid)
2. Check if comment already exists:
- Search for existing comment containing: `<!-- pantheon-verify:$HEAD_SHA -->`
- If found, do NOT post again (comment is idempotent per SHA)
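3. Post comment via stdin (preserves formatting). Use an unquoted heredoc delimiter so that $HEAD_SHA expands in the marker, and backslash-escape any literal `$` or backticks in the body:
gh pr comment {pr_number} --body-file - <<EOF
<!-- pantheon-verify:$HEAD_SHA -->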
## Review Verification Summary
### FIX_IN_THIS_PR (blocking merge)
<list each item with: severity, brief rationale, difficulty/risk>
### DEFER_CREATE_ISSUE (tracked separately)
<list each item with: issue link, brief rationale>
### INVALID_OR_ALREADY_FIXED
<brief rationale for each>
EOF
=== OUTPUT ===
If NO findings are marked FIX_IN_THIS_PR, output:
NO_IN_SCOPE_P0_P1
Otherwise, output:
IN_SCOPE_P0_P1
BEGIN_IN_SCOPE_P0_P1
<list only the findings marked FIX_IN_THIS_PR, with full details: severity, description, code evidence, fix guidance>
END_IN_SCOPE_P0_P1
Wait for completion, then:
- Parse output
- Do NOT update `last_fix_branch_id` (Verify is read-only)
- Post orchestrator status comment on the PR:
  - Step: `verify`
  - Branch ID: the verify branch
  - Agent: `codex`
  - Outcome: `NO_IN_SCOPE_P0_P1` or `IN_SCOPE_P0_P1` (with triage breakdown: N fix / N defer / N invalid)
  - Next: what follows
- If `NO_IN_SCOPE_P0_P1`:
  - Post workflow completion comment (see "Workflow Completion Comment" below)
  - Workflow complete (all issues are deferred or invalid)
- If `IN_SCOPE_P0_P1`: Extract in-scope issues, proceed to Step 4
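The idempotency rule for the verification summary (one comment per PR HEAD SHA) reduces to a marker check. A minimal sketch, assuming the existing comment bodies have already been fetched as strings (for example via `gh pr view {pr_number} --json comments`); the function names are hypothetical.

```python
from typing import Iterable


def verify_marker(head_sha: str) -> str:
    """Hidden marker that ties a verification summary to one PR HEAD SHA."""
    return f"<!-- pantheon-verify:{head_sha} -->"


def should_post_summary(existing_comment_bodies: Iterable[str], head_sha: str) -> bool:
    """True only if no existing comment already carries the marker for this SHA."""
    marker = verify_marker(head_sha)
    return not any(marker in body for body in existing_comment_bodies)
```

Because the marker embeds the SHA, a new push to the PR branch yields a new marker, so a fresh summary is posted for the new HEAD while repeats for the same HEAD are suppressed.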
Step 4: Fix Loop (Iterate Until Clean)
Purpose: Fix in-scope P0/P1 issues, then re-review until clean.
Loop structure:
WHILE in-scope P0/P1 issues exist:
1. Fix the issues
2. Review again (Step 2)
3. If NO_P0_P1: break
4. Verify again (Step 3)
5. If NO_IN_SCOPE_P0_P1: break
END WHILE
4.1: Fix In-Scope Issues
Call: functions.mcp__pantheon__parallel_explore
- agent: "claude_code"
- num_branches: 1
- parent_branch_id: last_fix_branch_id
Prompt:
=== GOLDEN RULE ===
Don't patch, don't workaround, don't add legacy/fallback code. Design a clean architecture. Solve the task thoroughly.
Fix the following in-scope P0/P1 issues for PR #{pr_number}:
=== ISSUES TO FIX ===
{paste entire content between BEGIN_IN_SCOPE_P0_P1 and END_IN_SCOPE_P0_P1 from Step 3}
=== REQUIREMENTS ===
1. Do not rush into code changes
2. First understand each finding and the affected code paths
3. Then design the best fix strategy before editing: you are a big fan of Linus, so use KISS (accurate, rigorous, concise) to find the best simple, robust path
4. If new constraints are discovered, pause and re-evaluate the strategy before continuing
5. Fix each issue using KISS principle: accurate, rigorous, concise
6. Do NOT introduce new bugs or regressions
7. Self-review your changes
8. Run smallest relevant tests/build
9. PR Management:
- Do NOT create a new PR
- Checkout existing PR branch: `gh pr checkout {pr_number}` (or `git checkout {pr_head_branch}`)
- Commit your fixes
- Push to existing branch: `git push`
10. Handle GitHub CLI auth:
- If `gh` unauthorized: retry once
- If still unauthorized: commit locally, output GH_AUTH_EXPIRED mode
=== OUTPUT ===
Success mode:
FIX_SUCCESS
GH auth expired mode:
GH_AUTH_EXPIRED
LOCAL_COMMIT=<sha>
RETRY_PUSH_BRANCH={pr_head_branch}
Wait for completion, then:
- Set `last_fix_branch_id = <branch_id from this Fix run>`
- Increment `review_cycle_count += 1`
- Post orchestrator status comment on the PR:
  - Step: `fix`
  - Branch ID: `last_fix_branch_id`
  - Agent: `claude_code`
  - Cycle: `{review_cycle_count}`
  - Outcome: `FIX_SUCCESS` or `GH_AUTH_EXPIRED`
  - Next: re-review
- If `GH_AUTH_EXPIRED`:
  - Start recovery exploration from `last_fix_branch_id`:
    Do NOT change code. Use existing local commits.
    Push {pr_head_branch} and sync PR using gh.
    Output: FIX_SUCCESS
  - Wait for completion
- Proceed to 4.2
4.2: Re-Review
Repeat Step 2 (Review) with parent_branch_id = last_fix_branch_id.
Wait and parse output:
- If `NO_P0_P1`:
  - Post workflow completion comment (see "Workflow Completion Comment" below)
  - Exit loop, workflow complete
- If `P0_P1_FINDINGS`: Proceed to 4.3
4.3: Re-Verify
Repeat Step 3 (Verify) with parent_branch_id = last_fix_branch_id.
Wait and parse output:
- If `NO_IN_SCOPE_P0_P1`:
  - Post workflow completion comment (see "Workflow Completion Comment" below)
  - Exit loop, workflow complete
- If `IN_SCOPE_P0_P1`: Extract issues, go back to 4.1 (Fix again)
Loop termination: Exit when either Review finds no P0/P1, or Verify finds no in-scope P0/P1.
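The control flow of Step 4 can be sketched with the three sub-steps injected as callables. This is an illustration only: `run_fix_loop` and its parameters are hypothetical, and each callable is assumed to wrap one `parallel_explore` call plus the wait and parse that follow it, returning `None` for the clean outcomes (`NO_P0_P1`, `NO_IN_SCOPE_P0_P1`).

```python
from typing import Callable, Optional


def run_fix_loop(
    in_scope: str,
    fix: Callable[[str], None],
    review: Callable[[], Optional[str]],
    verify: Callable[[str], Optional[str]],
) -> int:
    """Run Fix -> Review -> Verify until clean; return the number of review cycles."""
    review_cycle_count = 0
    while True:
        fix(in_scope)                      # Step 4.1
        review_cycle_count += 1            # incremented once per 4.1 run
        findings = review()                # Step 4.2; None means NO_P0_P1
        if findings is None:
            return review_cycle_count
        next_in_scope = verify(findings)   # Step 4.3; None means NO_IN_SCOPE_P0_P1
        if next_in_scope is None:
            return review_cycle_count
        in_scope = next_in_scope           # back to 4.1 with the new in-scope set
```

Both exits match the termination rule above: the loop ends as soon as either Review or Verify comes back clean.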
Workflow Completion Comment
When the workflow completes (no in-scope P0/P1 issues remain), post a completion comment on the PR:
gh pr comment {pr_number} --body "$(cat <<'EOF'
## Pantheon Issue Resolution Complete
This PR has been analyzed and iterated through the Fix/Review/Verify loop until no in-scope P0/P1 blockers remain.
**Final Status:**
- No P0/P1 blocking issues found in latest review
- All identified issues have been either fixed or deferred to separate issues
- PR is ready for final human review and merge
**Workflow Summary:**
- Issue analyzed: {issue_link}
- PR created/updated: #{pr_number}
- Review cycles completed: {review_cycle_count}
---
Automated by [pantheon-issue-resolve](https://github.com/pingcap-inc/pantheon-agents/tree/main/skills/pantheon-issue-resolve)
EOF
)"
Replace placeholders with actual variable values:
- `{issue_link}` → value of `issue_link` variable
- `{pr_number}` → value of `pr_number` variable
- `{review_cycle_count}` → value of `review_cycle_count` variable
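Placeholder substitution can be sketched as a single regex pass. The helper name is hypothetical. It is used here instead of `str.format` on the assumption that comment bodies may contain other brace-delimited text that should be left untouched.

```python
import re
from typing import Mapping


def fill_placeholders(template: str, values: Mapping[str, object]) -> str:
    """Replace {name} with values[name]; leave unknown placeholders as they are."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    return re.sub(r"\{(\w+)\}", substitute, template)
```

A single pass also means a substituted value that happens to contain `{...}` is not expanded a second time.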
Summary: Information Flow
Step 1 (Implement) ← reads design from design_comment_link
↓ outputs: PR info (pr_number, pr_url, pr_head_branch)
Step 2 (Review + CI Check) ← knows PR info
↓ outputs: P0_P1_FINDINGS (or NO_P0_P1)
Step 3 (Verify) ← receives P0_P1_FINDINGS
↓ outputs: IN_SCOPE_P0_P1 (or NO_IN_SCOPE_P0_P1)
Step 4 (Fix Loop) ← receives IN_SCOPE_P0_P1
↓ iterates: Fix → Review → Verify
↓ workflow complete when no in-scope P0/P1 remain
Recovery & Error Handling
Pantheon Exploration Failures
If a Pantheon branch fails (status = failed):
- Check `branch_output` for error details
- Determine if error is transient (network, resource) or permanent (code crash)
- If transient: retry the same step from the same `parent_branch_id`
- If permanent: investigate root cause, fix if possible, or escalate to user
GitHub Auth Failures
Handled automatically via GH_AUTH_EXPIRED output mode and recovery explorations (see Step 1 and Step 4.1).
Implementation Blocked
If Step 1 outputs IMPLEMENTATION_BLOCKED:
- The solution design is fundamentally flawed
- The orchestrator (AGENTS.md) should re-run `pantheon-solution-design` with the blocker reason as additional context
Notes
- Agent selection:
  - Steps 1, 4.1 (Implementation, Fix): Use `agent="claude_code"` for coding tasks
  - Steps 2, 3 (Review, Verify): Use `agent="codex"` for analytical/review tasks
- Design input: The agent reads the design from `design_comment_link` using `gh api`; GitHub is the work ledger, not prompt blobs
- Sequential explorations: This workflow uses sequential explorations by design (each depends on previous results)
- Extensibility: This workflow can be extended to support refactoring, performance optimization, or feature development (not just bug fixes)