diagnose

star 1

Root cause analysis for a Quay bug. Traces code paths through Quay subsystems, analyzes git history, forms and tests hypotheses, and enumerates complete state space.

quay By quay schedule Updated 5/8/2026

name: diagnose description: > Root cause analysis for a Quay bug. Traces code paths through Quay subsystems, analyzes git history, forms and tests hypotheses, and enumerates complete state space. allowed-tools: - Bash(git log *) - Bash(git blame *) - Bash(git show *) - Bash(gh pr view *) - Read - Glob - Grep - Agent - AskUserQuestion

Diagnose Root Cause

Determine why the bug exists. This is the most critical phase — a wrong diagnosis leads to a wrong fix.

Your Role

Perform thorough root cause analysis that provides clear, evidence-based conclusions. Identify the underlying issue, not just the symptom.

Process

Step 1: Review Reproduction

Read the reproduction report (artifacts/quay-bugfix/reports/reproduction.md if it exists):

  • Understand the exact conditions that trigger the bug
  • Note any patterns or edge cases discovered
  • Identify the entry point for investigation

Step 2: Trace the Code Path

Starting from the reproduction steps, trace the execution flow through Quay's subsystems. Use the subsystem map from CLAUDE.md:

For API bugs: Entry point in endpoints/ -> business logic in data/model/ -> database operations in data/database.py -> query execution

For worker bugs: Worker class in workers/ -> queue operations -> data layer -> database

For UI bugs: React component in web/src/ -> API hook in web/src/hooks/ -> REST endpoint in endpoints/

For auth bugs: Auth middleware in auth/ -> permission checks -> data layer

For storage bugs: Storage backend in storage/ -> blob operations -> configuration

Use file:line notation for every code reference (e.g., endpoints/api/repository.py:245).

Step 3: Historical Analysis

git log --oneline -20 -- <affected-files>
git blame <file> -L <start>,<end>

Look for: recent changes that introduced the bug, related PRs, patterns of similar fixes.

Step 4: Hypothesis Formation

List all potential root causes based on evidence:

  • Rank hypotheses by likelihood (high/medium/low confidence)
  • Consider: logic errors, race conditions, edge cases, missing validation, incorrect state transitions, ORM query issues
  • Document reasoning for each hypothesis

Step 5: Hypothesis Testing

  • Identify where targeted logging or debugging would confirm/refute each hypothesis (note locations for the fix phase)
  • Design minimal test cases that would validate or disprove each hypothesis (document them for the test phase)
  • Use binary search (git bisect) if the change was introduced gradually
  • Narrow down to the definitive root cause

Step 6: State Enumeration

CRITICAL: If the bug involves state-dependent logic (status fields, phase transitions, feature flags, configuration options):

  • Search the codebase for the complete list of possible values
  • Don't assume you know all states — verify by searching
  • Document feature interactions that affect the bug

Example: If a worker stops processing on "terminal" statuses, search for ALL statuses used in the codebase, not just the ones in the bug report.

Step 7: Impact Assessment

  • What other code paths are affected by the same root cause?
  • Could the fix cause regressions in related functionality?
  • Are there similar patterns elsewhere that have the same bug?
  • Does this affect database migrations or schema?

Step 8: Solution Approach

  • Recommend fix strategy based on root cause
  • Consider multiple approaches and their trade-offs
  • Document why the recommended approach is best

Output

Save to artifacts/quay-bugfix/analysis/root-cause.md:

# Root Cause Analysis: <TICKET>

## Root Cause
<Clear, specific explanation of why the bug exists>

## Code References
- `file:line` — <what this code does wrong>
- ...

## Evidence
- <supporting evidence from code, git history, reproduction>

## Timeline
<When the bug was introduced — commit/PR reference>

## Impact
- **Scope:** <what else is affected>
- **Risk:** <could the fix cause regressions>
- **Related patterns:** <similar code that may have the same issue>

## States/Conditions Enumerated
<Complete list of states/values found by searching, not assumed>

## Recommended Fix
<Specific approach with code locations to change>

## Alternative Approaches
<Other solutions with pros/cons>

Confidence Assessment

Include this section at the end of your root cause artifact:

## Confidence Assessment
- **Level**: high | medium | low
- **Score**: <0-100 integer>
- **Score rationale**: <1-2 sentences>
- **Open questions**: <bullet list, or "None">

Confidence signals for this phase:

  • High: Single root cause identified with strong code evidence and git history
  • Medium: Primary root cause identified but alternative explanations not fully ruled out
  • Low: Multiple plausible causes, insufficient evidence to distinguish, or root cause in unfamiliar subsystem

When This Phase Is Done

Report: the identified root cause, confidence level, and where the analysis was written.

Install via CLI
npx skills add https://github.com/quay/ai-helpers --skill diagnose
Repository Details
star Stars 1
call_split Forks 8
navigation Branch main
article Path SKILL.md
More from Creator