phantom-protocol - SKILL.md Agent Skill

name: phantom-protocol
description: |
  PHANTOM v6: Ultimate cognitive architecture for debugging, code review, generation, and self-analysis. Combines Abductive Fault Inversion, Spectral Execution Tracing (Ghost/Demon dual traces), Dialectical Assumption Collapse, Cognitive Immune System (CIS) with 16 antibodies, Intelligence Amplification Framework (IAF) with 7 enhancement methods, Cognitive Capability Activation (CCA), 7-Level Architecture (ARCH), and Mem0 as primary unlimited memory system. Features 58 capabilities, monitoring checkpoints, persona verification (Constructor/Destroyer/Defender/Judge), calibrated confidence, Responsive Generator Model (RGM), GRIMOIRE spell format, and cross-conversation persistence via Mem0. Activates for: debugging, code review, "why isn't this working", architecture design, high-stakes reasoning, ULTRATHINK sessions, or understanding code/cognitive behavior.

THE PHANTOM PROTOCOL v6

Spectral Tracing × Abductive Inversion × Dialectical Collapse × Cognitive Immune System × Intelligence Amplification × Mem0

A cognitive architecture that hunts bugs like ghosts: triangulating their position through symptoms, manifesting their true nature through dual execution traces, and exorcising them by collapsing the false assumptions that invited them in.

CORE PHILOSOPHY

The Triad of Fault Discovery:

Component	Core Question	Method
ABDUCTIVE ENGINE	"What would cause these exact symptoms?"	Work backwards from effects to possible causes, maintaining competing hypotheses
SPECTRAL TRACER	"Where does intended behavior diverge from actual?"	Simulate Ghost (what should happen) and Demon (what does happen) traces in parallel
DIALECTICAL INQUISITOR	"Which hidden assumption is the traitor?"	Systematically invert every assumption until one cracks

The Meta-Insight: Every bug exists because reality diverged from expectation. The divergence point is the bug. But you can't find the divergence if you never made your expectations explicit. PHANTOM forces explicit expectation articulation, then hunts for where reality betrays it.

THE EXECUTION ARCHITECTURE

                                ┌─────────────────────┐
                                │  TASK INTAKE        │
                                │  Debug/Review/Gen   │
                                └──────────┬──────────┘
                                           │
                        ┌──────────────────┼──────────────────┐
                        ▼                  ▼                  ▼
              ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
              │ DEBUGGING PATH  │ │ REVIEW PATH     │ │ GENERATION PATH │
              │ (Phases 1-7)    │ │ (Phases R1-R5)  │ │ (Phases G1-G5)  │
              └────────┬────────┘ └────────┬────────┘ └────────┬────────┘
                       │                   │                   │
                       └──────────────────┬┴───────────────────┘
                                          ▼
                              ┌─────────────────────┐
                              │  CONSECRATION       │
                              │  (Universal Phase)  │
                              └──────────┬──────────┘
                                         ▼
                              ┌─────────────────────┐
                              │  GRIMOIRE           │
                              │  (Lessons Captured) │
                              └─────────────────────┘

PART I: THE DEBUGGING PATH

Phase 1: MANIFESTATION (Symptom Documentation)

Before hunting, know what you're hunting. Document the haunting precisely.

MANIFESTATION PROTOCOL:
┌────────────────────────────────────────────────────────────────────────┐
│ 1. SYMPTOM CAPTURE                                                     │
│    ├─ What EXACTLY is the observed behavior?                           │
│    ├─ What EXACTLY is the expected behavior?                           │
│    ├─ What is the DELTA (specific difference)?                         │
│    └─ Is the symptom deterministic or intermittent?                    │
├────────────────────────────────────────────────────────────────────────┤
│ 2. REPRODUCTION SIGNATURE                                              │
│    ├─ Minimal reproduction steps (fewest actions to trigger)           │
│    ├─ Environment specifics (versions, configs, state)                 │
│    ├─ Frequency: Always / Sometimes / Rare / Once                      │
│    └─ Variations tried: What makes it better/worse/different?          │
├────────────────────────────────────────────────────────────────────────┤
│ 3. BLAST RADIUS ASSESSMENT                                             │
│    ├─ What functionality IS working correctly?                         │
│    ├─ What functionality is DEFINITELY broken?                         │
│    ├─ What functionality is POSSIBLY affected (uncertain)?             │
│    └─ Draw the boundary around the haunted zone                        │
└────────────────────────────────────────────────────────────────────────┘

Output: A precise Symptom Profile that constrains where the bug can live.
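
One way to keep the Symptom Profile consistent across phases is to record it as a small structured object. The sketch below is only an illustration: the class name `SymptomProfile`, its field names, and the `missing_sections` helper are assumptions made here, not part of PHANTOM.

```python
from dataclasses import dataclass, field


@dataclass
class SymptomProfile:
    """Illustrative Phase 1 record; every name here is an assumption."""
    observed: str        # what EXACTLY happens
    expected: str        # what EXACTLY should happen
    deterministic: bool  # True if the symptom reproduces every time
    repro_steps: list[str] = field(default_factory=list)
    working: list[str] = field(default_factory=list)    # known-good functionality
    broken: list[str] = field(default_factory=list)     # definitely broken
    uncertain: list[str] = field(default_factory=list)  # possibly affected

    def delta(self) -> str:
        """State the specific difference between expected and observed."""
        return f"expected {self.expected!r}, observed {self.observed!r}"

    def missing_sections(self) -> list[str]:
        """Names of required sections that are still empty."""
        required = {
            "repro_steps": self.repro_steps,
            "working": self.working,
            "broken": self.broken,
        }
        return [name for name, value in required.items() if not value]
```

A profile whose `missing_sections()` is non-empty probably does not yet constrain the haunted zone enough to start hypothesis generation.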

Phase 2: DIVINATION (Abductive Hypothesis Generation)

Work BACKWARDS from symptoms to causes. Generate multiple competing hypotheses.

DIVINATION PROTOCOL:
┌────────────────────────────────────────────────────────────────────────┐
│ ABDUCTIVE REASONING ENGINE                                             │
│                                                                        │
│ For each symptom, ask: "What conditions would PRODUCE this exact       │
│ symptom?" Not "what might be wrong" but "what would CREATE this?"      │
│                                                                        │
│ Generate 3-5 COMPETING HYPOTHESES:                                     │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ H1: [Hypothesis]                                                   │ │
│ │     Explains symptoms: [which ones, how]                           │ │
│ │     Doesn't explain: [gaps]                                        │ │
│ │     Would also predict: [other symptoms we could look for]         │ │
│ │     Likelihood: [1-10]                                             │ │
│ │     Test to confirm/refute: [specific experiment]                  │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ (Repeat for H2, H3, H4, H5)                                            │
│                                                                        │
│ CRITICAL: Hypotheses must be MUTUALLY DISTINGUISHABLE.                 │
│ If H1 and H2 would look identical in all tests, merge them.            │
└────────────────────────────────────────────────────────────────────────┘

Hypothesis Categories (ensure diversity):

Category	Example
State Corruption	Variable modified unexpectedly between A and B
Timing/Race	Operation completes before/after expected
Input Violation	Data doesn't match assumed schema
Logic Error	Algorithm produces wrong result for edge case
Environment Mismatch	Works locally, fails in prod (config, version)
Resource Exhaustion	Memory, connections, file handles depleted
Integration Failure	External service behaves unexpectedly
Assumption Violation	Code assumes X, reality is ¬X

Output: Ranked hypothesis list with distinguishing tests for each.
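
The merge rule ("if H1 and H2 would look identical in all tests, merge them") and the final ranking can be sketched in a few lines. This is a minimal illustration, assuming each hypothesis is summarized by the set of test outcomes it predicts; the `Hypothesis` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Illustrative hypothesis record; field names are assumptions."""
    name: str
    category: str                # e.g. "Timing/Race", "State Corruption"
    score: int                   # the 1-10 rating from the template
    predictions: frozenset[str]  # outcomes this hypothesis predicts for each test


def merge_indistinguishable(hyps: list[Hypothesis]) -> list[Hypothesis]:
    """Keep one hypothesis per distinct prediction set (highest score wins)."""
    best: dict[frozenset[str], Hypothesis] = {}
    for h in hyps:
        kept = best.get(h.predictions)
        if kept is None or h.score > kept.score:
            best[h.predictions] = h
    return list(best.values())


def rank(hyps: list[Hypothesis]) -> list[Hypothesis]:
    """Merge indistinguishable hypotheses, then order by score, highest first."""
    return sorted(merge_indistinguishable(hyps), key=lambda h: h.score, reverse=True)
```

Two hypotheses with the same prediction set cannot be told apart by any planned test, so only one of them is worth carrying forward.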

Phase 3: SUMMONING (Spectral Execution Tracing)

Now summon both ghosts: what SHOULD happen vs what DOES happen.

SUMMONING PROTOCOL: DUAL TRACE EXECUTION
┌────────────────────────────────────────────────────────────────────────┐
│                                                                        │
│  THE GHOST TRACE (Intended Execution)                                  │
│  ═══════════════════════════════════                                   │
│  Trace what the code SHOULD do, step by step:                          │
│                                                                        │
│  Step 1: [Function called with inputs X, Y]                            │
│          Expected state after: [describe]                              │
│  Step 2: [Next operation]                                              │
│          Expected state after: [describe]                              │
│  ...continue until expected output...                                  │
│                                                                        │
│  Rules for Ghost Trace:                                                │
│  - Derive from REQUIREMENTS/SPEC, not from reading code                │
│  - If requirements unclear, make assumptions EXPLICIT                  │
│  - Be specific: "list contains [1,2,3]" not "list has items"           │
│                                                                        │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  THE DEMON TRACE (Actual Execution)                                    │
│  ═══════════════════════════════════                                   │
│  Trace what the code ACTUALLY does, step by step:                      │
│                                                                        │
│  Step 1: [Function called with inputs X, Y]                            │
│          Actual code path: [line numbers, branches taken]              │
│          Actual state after: [observed/derived values]                 │
│  Step 2: [Next operation]                                              │
│          Actual code path: [line numbers, branches taken]              │
│          Actual state after: [observed/derived values]                 │
│  ...continue until actual output...                                    │
│                                                                        │
│  Rules for Demon Trace:                                                │
│  - Read code LINE BY LINE, no assumptions                              │
│  - Track actual variable values, not intended values                   │
│  - Follow the REAL control flow (which branch, which loop iteration)   │
│  - Note any implicit type coercions, default values                    │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

DIVERGENCE DETECTION:

┌────────────────────────────────────────────────────────────────────────┐
│  DIVERGENCE ANALYSIS                                                   │
│  Compare Ghost and Demon traces step-by-step:                          │
│                                                                        │
│  Step N: Ghost says [X], Demon says [Y]                                │
│          DIVERGENCE DETECTED: [description of difference]              │
│          First divergence at step: [N]                                 │
│          Root cause likely between step [N-1] and [N]                  │
│                                                                        │
│  If no divergence found but bug exists:                                │
│  → Ghost trace was wrong (requirements misunderstood)                  │
│  → Demon trace was wrong (code misread)                                │
│  → Bug is in a path not traced (missing edge case)                     │
│  ACTION: Extend traces, recheck both                                   │
└────────────────────────────────────────────────────────────────────────┘
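
When both traces are written down as ordered lists of states, the step-by-step comparison above is mechanical. The following sketch assumes each trace step is a `(description, state)` pair; the `Step` alias and `first_divergence` are illustrative names, not part of the protocol.

```python
from itertools import zip_longest
from typing import Any, Optional

Step = tuple[str, Any]  # (description, state after the step)


def first_divergence(ghost: list[Step], demon: list[Step]) -> Optional[int]:
    """Return the 1-based step where the traces first differ, else None."""
    for n, (g, d) in enumerate(zip_longest(ghost, demon), start=1):
        # A missing step (one trace ended early) also counts as a divergence.
        if g is None or d is None or g[1] != d[1]:
            return n
    return None


ghost = [("load ids", [3, 1, 3, 2]), ("drop duplicates", [3, 1, 2]), ("sort", [1, 2, 3])]
demon = [("load ids", [3, 1, 3, 2]), ("drop duplicates", [3, 1, 3, 2]), ("sort", [1, 2, 3, 3])]
print(first_divergence(ghost, demon))  # 2
```

A result of `None` while the bug still reproduces corresponds to the "no divergence found" branch: one of the traces is wrong, or the failing path was never traced.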

POSSESSION CHECK (Meta-Validation):

Before trusting your traces, verify your mental model isn't corrupted:

POSSESSION CHECK QUESTIONS:
1. Am I READING the code or REMEMBERING what I think it does?
2. Did I actually trace the FAILING path or a path that works?
3. Are my "expected values" from the spec or from the (buggy) code?
4. Did I trace through ACTUAL data or "typical" data?
5. Am I assuming library functions work as expected? (Verify!)

Output: Pinpointed divergence location, or evidence traces need refinement.

Phase 4: INQUISITION (Dialectical Assumption Collapse)

Every bug you can't find hides behind a false assumption you don't know you're making.

INQUISITION PROTOCOL: ASSUMPTION INVERSION
┌────────────────────────────────────────────────────────────────────────┐
│                                                                        │
│  STEP 1: ASSUMPTION INVENTORY                                          │
│  List EVERY assumption you're making, even "obvious" ones:             │
│                                                                        │
│  ┌─────┬──────────────────────────────────┬────────────┬─────────────┐ │
│  │ ID  │ Assumption                       │ Confidence │ Evidence    │ │
│  ├─────┼──────────────────────────────────┼────────────┼─────────────┤ │
│  │ A1  │ Function X is being called       │ HIGH       │ Logging     │ │
│  │ A2  │ Input is always an array         │ MEDIUM     │ None        │ │
│  │ A3  │ Database connection is open      │ HIGH       │ Try/catch   │ │
│  │ A4  │ This loop terminates             │ ASSUMED    │ None        │ │
│  │ A5  │ Timestamps are in UTC            │ ASSUMED    │ None        │ │
│  │ A6  │ Third-party API returns JSON     │ MEDIUM     │ Docs        │ │
│  └─────┴──────────────────────────────────┴────────────┴─────────────┘ │
│                                                                        │
│  Assumption sources to audit:                                          │
│  - Input types, ranges, and formats                                    │
│  - Function preconditions (what must be true before call)              │
│  - Function postconditions (what's guaranteed after)                   │
│  - Loop invariants and termination conditions                          │
│  - Concurrency: ordering, atomicity, visibility                        │
│  - External dependencies: availability, behavior, versions             │
│  - Environment: configs, permissions, resources                        │
│  - Data: schema, nullability, encoding, timezone                       │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│                                                                        │
│  STEP 2: INVERSION TESTING                                             │
│  For each assumption, ask: "What if the OPPOSITE is true?"             │
│                                                                        │
│  A1: "Function X is being called"                                      │
│      INVERT: "What if function X is NOT being called?"                 │
│      Would this explain symptoms? [yes/no/partially]                   │
│      How to test inversion: [add log at function entry]                │
│                                                                        │
│  A4: "This loop terminates"                                            │
│      INVERT: "What if this loop runs forever?"                         │
│      Would this explain symptoms? [yes - would cause timeout]          │
│      How to test inversion: [add iteration counter, check if >1000]    │
│                                                                        │
│  PRIORITY ORDER for inversions:                                        │
│  1. Assumptions with ASSUMED/LOW confidence (never verified)           │
│  2. Assumptions that would explain multiple symptoms                   │
│  3. Assumptions about data (high variance in real world)               │
│  4. Assumptions about timing/concurrency (hardest to reason about)     │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│                                                                        │
│  STEP 3: COLLAPSE DETECTION                                            │
│  An assumption COLLAPSES when inversion testing reveals it's false.    │
│                                                                        │
│  Collapse found:                                                       │
│  ├─ Assumption A[N]: [what you assumed]                                │
│  ├─ Reality: [what's actually true]                                    │
│  ├─ Why you assumed wrong: [what misled you]                           │
│  ├─ Code that depends on false assumption: [locations]                 │
│  └─ Fix strategy: [how to handle reality]                              │
│                                                                        │
│  No collapse found after exhaustive inversion?                         │
│  → Your assumption inventory is incomplete                             │
│  → Go deeper: assumptions about assumptions                            │
│  → Check: Are you assuming your tests are correct?                     │
│  → Check: Are you assuming your tools are working?                     │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

Output: Identified false assumption(s) that enabled the bug.
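
The priority order for inversions can be approximated with a sort key. This sketch covers only the first two priorities (unverified confidence first, then explanatory power); the `Assumption` class, the `symptoms_explained` field, and the numeric ranks are assumptions made for illustration.

```python
from dataclasses import dataclass

# Lower rank = invert sooner. ASSUMED and LOW were never verified.
CONFIDENCE_RANK = {"ASSUMED": 0, "LOW": 0, "MEDIUM": 1, "HIGH": 2}


@dataclass
class Assumption:
    id: str
    statement: str
    confidence: str          # one of the CONFIDENCE_RANK keys
    symptoms_explained: int  # symptoms the inverted assumption would explain


def inversion_order(inventory: list[Assumption]) -> list[Assumption]:
    """Unverified assumptions first, then those explaining more symptoms."""
    return sorted(
        inventory,
        key=lambda a: (CONFIDENCE_RANK[a.confidence], -a.symptoms_explained),
    )
```

Applied to the sample inventory, the two ASSUMED rows (A4 and A5) would be inverted before the MEDIUM and HIGH rows.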

Phase 5: TRIANGULATION (Hypothesis Convergence)

Synthesize findings from all three engines to pinpoint the root cause.

TRIANGULATION PROTOCOL:
┌────────────────────────────────────────────────────────────────────────┐
│                                                                        │
│  EVIDENCE SYNTHESIS:                                                   │
│                                                                        │
│  From DIVINATION (Hypotheses):                                         │
│  └─ Most probable cause: H[N] because [evidence]                       │
│                                                                        │
│  From SUMMONING (Spectral Traces):                                     │
│  └─ Divergence at: [step/line] where Ghost ≠ Demon                     │
│                                                                        │
│  From INQUISITION (Assumption Collapse):                               │
│  └─ False assumption: A[N] — [what was wrong]                          │
│                                                                        │
│  CONVERGENCE CHECK:                                                    │
│  Do all three point to the same root cause?                            │
│  ├─ YES → High confidence diagnosis. Proceed to EXORCISM.              │
│  ├─ PARTIAL → Some evidence conflicts. Investigate discrepancy.        │
│  └─ NO → Restart: your hypothesis or trace was wrong.                  │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
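
The convergence check reduces to counting how many distinct locations the three engines blame. The sketch below assumes each engine's finding is reduced to a comparable label such as `"file:function"`, and it treats "two of three agree" as PARTIAL; both choices are assumptions, as are the names `Convergence` and `convergence_check`.

```python
from enum import Enum


class Convergence(Enum):
    YES = "proceed to EXORCISM"
    PARTIAL = "investigate the discrepancy"
    NO = "restart: a hypothesis or trace was wrong"


def convergence_check(divination: str, summoning: str, inquisition: str) -> Convergence:
    """Compare the location each engine blames (labels are free-form strings)."""
    distinct = len({divination, summoning, inquisition})
    if distinct == 1:
        return Convergence.YES
    if distinct == 2:
        return Convergence.PARTIAL
    return Convergence.NO
```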

ROOT CAUSE STATEMENT:

Write the root cause in this exact format:

ROOT CAUSE:
The bug occurs because [FALSE ASSUMPTION/CONDITION], which causes 
[MECHANISM by which this leads to divergence], resulting in 
[OBSERVED SYMPTOM] instead of [EXPECTED BEHAVIOR].

Example:
"The bug occurs because the `created_at` timestamp is stored in local time 
(not UTC as assumed), which causes the time comparison in `isExpired()` to 
be off by 5 hours for users in EST, resulting in tokens expiring prematurely 
instead of after 24 hours."

Phase 6: EXORCISM (Bug Elimination)

Now remove the demon. But carefully — exorcisms can go wrong.

EXORCISM PROTOCOL:
┌────────────────────────────────────────────────────────────────────────┐
│                                                                        │
│  STEP 1: FIX DESIGN (Before writing any code)                          │
│  ├─ What is the minimal change that addresses root cause?              │
│  ├─ Does fix address the root cause or just the symptom?               │
│  ├─ What could this fix break? (Blast radius of change)                │
│  ├─ Are there other code paths with same false assumption?             │
│  └─ Alternative fixes considered: [list with tradeoffs]                │
│                                                                        │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  STEP 2: SHADOW TRACE (Pre-verify fix)                                 │
│  Before implementing, trace through the fix mentally:                  │
│  ├─ Run Ghost trace with proposed fix                                  │
│  ├─ Run Demon trace with proposed fix                                  │
│  ├─ Verify they now converge                                           │
│  └─ Check: Does fix introduce new divergences elsewhere?               │
│                                                                        │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  STEP 3: IMPLEMENTATION                                                │
│  ├─ Implement the minimal fix                                          │
│  ├─ Add defensive code if assumption could be wrong again              │
│  ├─ Add comments explaining WHY (not what)                             │
│  └─ If fix is complex, break into atomic commits                       │
│                                                                        │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  STEP 4: RESURRECTION CHECK (Verify demon is gone)                     │
│  ├─ Does original reproduction still trigger bug? (Must be NO)         │
│  ├─ Do variations of reproduction trigger bug? (Test edge cases)       │
│  ├─ Run full Demon trace again — does it match Ghost now?              │
│  └─ Run existing tests — any new failures?                             │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

Output: Verified fix with evidence that root cause is addressed.

Phase 7: CONSECRATION (Fortification Against Re-Possession)

The bug is dead. Now prevent its return.

CONSECRATION PROTOCOL:
┌────────────────────────────────────────────────────────────────────────┐
│                                                                        │
│  STEP 1: REGRESSION TEST                                               │
│  ├─ Write test that would have caught this bug                         │
│  ├─ Test should exercise the case that VIOLATES the false assumption   │
│  └─ Test name should describe the scenario, not the fix                │
│      Good: "test_token_expiry_handles_non_utc_timestamps"              │
│      Bad: "test_bug_fix_issue_123"                                     │
│                                                                        │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  STEP 2: ASSERTION HARDENING                                           │
│  ├─ Add runtime assertions for violated assumptions                    │
│  ├─ Fail fast and loud, don't silently corrupt                         │
│  └─ Example: assert timestamp.tzinfo == UTC, "timestamps must be UTC"  │
│                                                                        │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  STEP 3: DOCUMENTATION                                                 │
│  ├─ Document the assumption that was violated                          │
│  ├─ Add to function docstring: preconditions/postconditions            │
│  └─ If architectural assumption: add to ADR or system docs             │
│                                                                        │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  STEP 4: SIMILAR LOCATION AUDIT                                        │
│  ├─ Search codebase for same pattern                                   │
│  ├─ Does same false assumption exist elsewhere?                        │
│  └─ Fix proactively or document as known tech debt                     │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
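
Steps 1 and 2 can be illustrated with the token-expiry example. This is a sketch under stated assumptions: `is_expired` is a hypothetical Python analogue of the `isExpired()` function mentioned earlier, and the 24-hour lifetime comes from that example. It raises explicitly rather than using `assert`, because Python strips `assert` statements when run with `-O`.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(hours=24)


def is_expired(created_at: datetime, now: datetime) -> bool:
    """Hypothetical analogue of isExpired(); rejects non-UTC input."""
    # Assertion hardening: fail fast instead of silently comparing mixed zones.
    # utcoffset() is None for naive datetimes, so those are rejected too.
    if created_at.utcoffset() != timedelta(0):
        raise ValueError("created_at must be timezone-aware UTC")
    return now - created_at >= TOKEN_LIFETIME


def test_token_expiry_handles_non_utc_timestamps():
    """Regression test named after the scenario, not the fix."""
    est = timezone(timedelta(hours=-5))
    created = datetime(2024, 1, 1, 12, 0, tzinfo=est)
    now = datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc)
    try:
        is_expired(created, now)
    except ValueError:
        return  # the violated assumption is caught loudly
    raise AssertionError("non-UTC timestamp was accepted")
```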

PART II: THE REVIEW PATH (Latent Bug Detection)

For code review, apply PHANTOM preemptively to find bugs before they manifest.

Phase R1: PREEMPTIVE MANIFESTATION

Instead of observing symptoms, IMAGINE potential symptoms:
- What could go wrong with this code?
- What inputs would break it?
- What environment conditions would cause failure?
- What sequence of operations would trigger issues?

Phase R2: SPECULATIVE SUMMONING

For each code path:
- Construct Ghost trace (what author intended)
- Construct Demon trace (what code actually does)
- Look for divergences WITHOUT a reported bug
- Each divergence is a LATENT BUG

Phase R3: ASSUMPTION EXCAVATION

Extract ALL assumptions the code makes:
- Mark each as VALIDATED or UNVALIDATED
- For unvalidated: What would happen if false?
- Flag unvalidated assumptions as REVIEW COMMENTS

Phase R4: ADVERSARIAL INPUT GENERATION

Generate inputs designed to violate assumptions:
- Boundary values
- Type confusion (string where number expected)
- Empty/null/undefined
- Extremely large/small
- Malformed but parseable
- Race conditions (concurrent modifications)
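
A small harness can feed such inputs to the function under review and record how each one fails. The sketch below is illustrative only: `adversarial_inputs` and `probe` are hypothetical helpers, the sample values are arbitrary, and race conditions are out of scope for a single-threaded probe like this.

```python
from typing import Any, Callable, Iterator


def adversarial_inputs() -> Iterator[tuple[str, Any]]:
    """Yield (label, value) pairs, one per assumption-violating category."""
    yield "null", None
    yield "empty", []
    yield "type confusion", "42"            # string where a number is expected
    yield "extremely large", 2**63
    yield "malformed but parseable", "1e999"  # float() parses this as inf


def probe(fn: Callable[[Any], Any]) -> dict[str, str]:
    """Run fn on every adversarial input and record the outcome per label."""
    report: dict[str, str] = {}
    for label, value in adversarial_inputs():
        try:
            report[label] = f"returned {fn(value)!r}"
        except Exception as exc:  # record the failure mode, do not hide it
            report[label] = f"raised {type(exc).__name__}"
    return report
```

Inputs that return a value are often more interesting than those that raise: a silently accepted `"1e999"` becoming `inf` is the kind of latent bug this phase looks for.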

Phase R5: DEFENSIVE RECOMMENDATIONS

For each latent bug found:
- Recommend specific hardening
- Prioritize by: likelihood × impact
- Provide example attack vector
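
The "likelihood × impact" ordering can be made explicit with a tiny scoring helper. The 1-5 scales and the names `LatentBug` and `prioritize` are assumptions chosen for this sketch; the document does not prescribe a scale.

```python
from dataclasses import dataclass


@dataclass
class LatentBug:
    summary: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (expected)
    impact: int      # assumed scale: 1 (cosmetic) to 5 (data loss)

    @property
    def priority(self) -> int:
        return self.likelihood * self.impact


def prioritize(findings: list[LatentBug]) -> list[LatentBug]:
    """Highest likelihood × impact first."""
    return sorted(findings, key=lambda b: b.priority, reverse=True)
```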

PART III: THE GENERATION PATH (Bug-Resistant Code Creation)

When writing new code, apply PHANTOM principles prophylactically.

Phase G1: ASSUMPTION DECLARATION

Before writing code, explicitly declare all assumptions:

ASSUMPTION CONTRACT:
┌─────────────────────────────────────────────────────────────┐
│ This code ASSUMES:                                          │
│ 1. Input `data` is non-null array of objects                │
│ 2. Each object has `id` field (string, unique)              │
│ 3. Database connection is established before call           │
│ 4. Function is never called concurrently with same id       │
│ 5. ...                                                      │
│                                                             │
│ This code GUARANTEES:                                       │
│ 1. Returns sorted array by id                               │
│ 2. Throws DatabaseError if connection fails                 │
│ 3. Never modifies input array                               │
│ 4. ...                                                      │
└─────────────────────────────────────────────────────────────┘

Phase G2: GHOST-FIRST DEVELOPMENT

Write the Ghost trace BEFORE writing code:

1. Document expected behavior step-by-step
2. Define expected state at each checkpoint
3. THEN write code to match the Ghost
4. Verify Demon trace matches Ghost

Phase G3: ASSUMPTION ENFORCEMENT

For each assumption declared:

- Add validation at entry point
- Add assertion at critical points
- Add type hints/annotations
- Add documentation
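
Applied to the sample assumption contract, enforcement might look like the sketch below. `sorted_by_id` is a hypothetical function invented for illustration; it covers only the input assumptions and leaves out the database and concurrency items from the contract.

```python
from typing import Any


def sorted_by_id(data: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a new list sorted by `id`.

    Assumes: `data` is a non-null list of dicts, each with a unique string `id`.
    Guarantees: the input list is never modified.
    """
    # Validation at the entry point: reject violated assumptions loudly.
    if data is None:
        raise TypeError("data must not be None")
    ids = [item["id"] for item in data]  # KeyError if `id` is missing
    if not all(isinstance(i, str) for i in ids):
        raise TypeError("every id must be a string")
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique")
    return sorted(data, key=lambda item: item["id"])  # sorted() returns a new list
```

Each check maps to one declared assumption, so a failure message names the assumption that reality violated.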

Phase G4: INVERSION TESTING IN TESTS

Write tests that INVERT each assumption:

def test_handles_null_input(): ...        # Inverts: "input is non-null"
def test_handles_empty_array(): ...       # Inverts: "array has items"
def test_handles_duplicate_ids(): ...     # Inverts: "ids are unique"
def test_handles_concurrent_calls(): ...  # Inverts: "never called concurrently"

Phase G5: DEVIL'S ADVOCATE REVIEW

After writing, become the adversary:

"If I wanted to break this code, how would I?"
"What input would cause the worst damage?"
"What environment condition would cause silent corruption?"

PART IV: TOOL INTEGRATION

SEQUENTIAL THINKING: THE PHANTOM'S SPINE

Sequential Thinking is MANDATORY for PHANTOM execution. It provides the structured reasoning backbone that prevents skipped steps and enables backtracking when traces reveal new information.

Core Principle

Every PHANTOM phase that involves multi-step reasoning MUST use Sequential Thinking. The tool isn't an optional enhancement; it's the execution engine.

MANDATORY INVOCATION POINTS

Phase	Trigger	Why ST is Required
MANIFESTATION	Symptom has 3+ aspects	Prevents missing symptom dimensions
DIVINATION	Always (generating hypotheses)	Ensures diverse hypothesis exploration
SUMMONING	Always (tracing execution)	Maintains trace accuracy, enables revision
INQUISITION	Always (testing inversions)	Systematic assumption coverage
TRIANGULATION	Evidence conflicts	Resolves contradictions methodically
EXORCISM	Fix has dependencies	Prevents incomplete fixes
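
The trigger table can be read as a simple decision function. The sketch below mirrors the table row by row; the function name `st_required` and its keyword parameters are illustrative assumptions.

```python
ALWAYS = {"DIVINATION", "SUMMONING", "INQUISITION"}


def st_required(
    phase: str,
    *,
    symptom_aspects: int = 0,
    evidence_conflicts: bool = False,
    fix_has_dependencies: bool = False,
) -> bool:
    """Return True when the trigger table calls for Sequential Thinking."""
    if phase in ALWAYS:
        return True
    if phase == "MANIFESTATION":
        return symptom_aspects >= 3
    if phase == "TRIANGULATION":
        return evidence_conflicts
    if phase == "EXORCISM":
        return fix_has_dependencies
    return False  # phases not listed in the table
```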

PHASE-SPECIFIC INVOCATION PATTERNS

MANIFESTATION Phase

INVOKE SEQUENTIAL THINKING when symptom is complex:

Thought 1: "What is the PRIMARY observable symptom?"
           → Document exact behavior seen
           
Thought 2: "What is the EXPECTED behavior instead?"
           → Be specific: values, timing, output format
           
Thought 3: "What is the DELTA between observed and expected?"
           → Quantify the difference precisely
           
Thought 4: "What is the MINIMAL reproduction path?"
           → Fewest steps to trigger
           
Thought 5: "What VARIATIONS affect the symptom?"
           → What makes it better/worse/different?
           
Thought 6: [REVISION if needed] "Did I miss any symptom aspects?"
           → Check for secondary symptoms

Parameters:
- totalThoughts: 5-8
- needsMoreThoughts: true (symptoms often have hidden aspects)
- Allow revisions: YES

DIVINATION Phase (Hypothesis Generation)

INVOKE SEQUENTIAL THINKING — ALWAYS for this phase:

Thought 1: "Given symptom [X], what would CAUSE this exact behavior?"
           → First hypothesis from most obvious cause
           
Thought 2: "What ELSE could cause identical symptoms?"
           → Second hypothesis, different category
           
Thought 3: "What if the cause is in [different layer/component]?"
           → Third hypothesis, different system area
           
Thought 4: "What's the most UNLIKELY but possible cause?"
           → Fourth hypothesis, edge case thinking
           
Thought 5: "What would a TIMING/RACE issue look like here?"
           → Fifth hypothesis, concurrency angle
           
Thought 6: "For each hypothesis, what DISTINGUISHING TEST exists?"
           → Design experiments to differentiate
           
Thought 7: [REVISION] "Are any hypotheses actually the same? Merge them."
           → Consolidate overlapping hypotheses
           
Thought 8: "Rank hypotheses by: (a) explains symptoms, (b) testability"
           → Prioritize investigation order

Parameters:
- totalThoughts: 8-12
- needsMoreThoughts: true
- Allow revisions: YES (hypothesis refinement is expected)
- Use branchFromThought: when exploring alternative hypothesis families

SUMMONING Phase (Ghost Trace)

INVOKE SEQUENTIAL THINKING for GHOST TRACE:

Thought 1: "What is the ENTRY POINT for this code path?"
           → Function/method that starts the flow
           
Thought 2: "Step 1: [operation]. Expected state after: [values]"
           → First execution step
           
Thought 3: "Step 2: [operation]. Expected state after: [values]"
           → Continue tracing
           
Thought N: [Continue until expected output]

Thought N+1: "CHECKPOINT: Am I deriving expected behavior from SPEC or CODE?"
             → Must be from spec/requirements, NOT from reading buggy code
             
Thought N+2: [REVISION if checkpoint fails] "Re-derive from requirements..."

Parameters:
- totalThoughts: 10-25 (scales with code complexity)
- needsMoreThoughts: true (traces often longer than expected)
- Allow revisions: YES (finding spec ambiguity requires backtrack)
- isRevision: true when correcting earlier trace step

SUMMONING Phase (Demon Trace)

INVOKE SEQUENTIAL THINKING for DEMON TRACE:

Thought 1: "Line [N]: What ACTUALLY executes? What values ACTUALLY exist?"
           → Read code literally, no assumptions
           
Thought 2: "Line [N+1]: Given actual state, what happens next?"
           → Follow actual control flow
           
Thought 3: "POSSESSION CHECK: Am I reading or remembering?"
           → Verify I'm tracing actual code, not assumptions
           
Thought 4: [Continue line-by-line trace]

Thought N: "DIVERGENCE SCAN: Where does Demon differ from Ghost?"
           → Compare traces explicitly
           
Thought N+1: [If divergence found] "First divergence at step [X]"
             → Mark exact divergence point
             
Thought N+2: [If no divergence] "Traces match but bug exists. Options:"
             → Either one of the traces is wrong, or the wrong code path was traced

Parameters:
- totalThoughts: 10-25 (must match Ghost trace depth)
- needsMoreThoughts: true
- Allow revisions: YES
- CRITICAL: Do NOT copy Ghost trace. Trace independently.
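
The DIVERGENCE SCAN step amounts to a step-by-step comparison of two independently recorded traces. The sketch below assumes each trace is a list of per-step variable snapshots; the `Trace` alias, the function name, and the sample values are hypothetical.

```python
from typing import Any, Optional

Trace = list[dict[str, Any]]  # one dict of variable values per step


def first_divergence(ghost: Trace, demon: Trace) -> Optional[int]:
    """Return the 1-based step where the traces first differ, else None."""
    for step, (expected, actual) in enumerate(zip(ghost, demon), start=1):
        if expected != actual:
            return step
    # A length mismatch is itself a divergence at the first missing step.
    if len(ghost) != len(demon):
        return min(len(ghost), len(demon)) + 1
    return None


# Hypothetical traces: the accumulator goes wrong on the third step.
ghost = [{"i": 0, "total": 0}, {"i": 1, "total": 5}, {"i": 2, "total": 12}]
demon = [{"i": 0, "total": 0}, {"i": 1, "total": 5}, {"i": 2, "total": 7}]
divergence_step = first_divergence(ghost, demon)
```

A `None` result corresponds to the "traces match but bug exists" case, which means a trace or the chosen code path needs re-examination.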

INQUISITION Phase (Assumption Inversion)

INVOKE SEQUENTIAL THINKING — ALWAYS for this phase:

Thought 1: "INVENTORY: List every assumption this code makes"
           → Comprehensive assumption extraction
           
Thought 2: "For assumption A1: [statement]"
           → State assumption clearly
           
Thought 3: "INVERT A1: What if [opposite] is true?"
           → Negate the assumption
           
Thought 4: "If ¬A1, would this explain observed symptoms?"
           → Test explanatory power
           
Thought 5: "How can I TEST whether A1 or ¬A1 is reality?"
           → Design distinguishing experiment
           
Thought 6: [Repeat for A2, A3, etc.]

Thought N: [If inversion explains symptoms] "COLLAPSE DETECTED: A[X] is FALSE"
           → Mark the collapsed assumption
           
Thought N+1: [BRANCH] "What other assumptions depend on A[X]?"
             → Cascade analysis

Parameters:
- totalThoughts: 10-20 (scales with assumption count)
- needsMoreThoughts: true (always more assumptions than you think)
- Allow revisions: YES
- Use branch_from_thought: when exploring assumption dependencies
- Use branchId: "cascade-A3" when tracing collapsed assumption effects
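
One way to run the inversion loop against a concrete failing input is to write each assumption as a predicate and see which ones do not hold. This is a sketch only: the assumption names, the payload shape, and the `collapsed` helper are hypothetical.

```python
from typing import Callable

# Each assumption is a named predicate over the failing input.
Assumption = Callable[[dict], bool]

assumptions: dict[str, Assumption] = {
    "A1 input is non-null": lambda p: p.get("items") is not None,
    "A2 array is sorted": lambda p: p.get("items") == sorted(p.get("items") or []),
    "A3 connection is open": lambda p: p.get("connected") is True,
}


def collapsed(payload: dict) -> list[str]:
    """Return the assumptions that do not hold for this payload."""
    return [name for name, holds in assumptions.items() if not holds(payload)]


# Hypothetical input captured from a failing run.
failing_payload = {"items": [3, 1, 2], "connected": True}
suspects = collapsed(failing_payload)
```

Every name in `suspects` is a candidate for COLLAPSE DETECTED; whether it actually explains the symptoms is still decided in Thought 4.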

TRIANGULATION Phase

INVOKE SEQUENTIAL THINKING when evidence conflicts:

Thought 1: "DIVINATION concluded: [hypothesis H[N]]"
           → State hypothesis finding
           
Thought 2: "SUMMONING found divergence at: [step/line]"
           → State trace finding
           
Thought 3: "INQUISITION collapsed assumption: [A[N]]"
           → State assumption finding
           
Thought 4: "Do all three point to SAME root cause?"
           → Check convergence
           
Thought 5: [If YES] "Convergent diagnosis: [root cause statement]"
           → Proceed to EXORCISM
           
Thought 5: [If NO] "Conflict between [X] and [Y]. Investigating..."
           → Identify specific conflict
           
Thought 6: [REVISION] "Re-examining [conflicting evidence]..."
           → Resolve conflict through re-analysis
           
Thought 7: "Resolution: [which evidence was wrong and why]"

Parameters:
- totalThoughts: 5-10
- needsMoreThoughts: true (conflicts require iteration)
- Allow revisions: YES (this phase IS about revision)
- isRevision: true when re-examining earlier conclusions

EXORCISM Phase (Fix Design)

INVOKE SEQUENTIAL THINKING for complex fixes:

Thought 1: "Root cause is: [statement]"
           → Confirm diagnosis
           
Thought 2: "MINIMAL fix option 1: [approach]"
           → First fix strategy
           
Thought 3: "MINIMAL fix option 2: [alternative approach]"
           → Alternative strategy
           
Thought 4: "Tradeoffs: Option 1 [pros/cons] vs Option 2 [pros/cons]"
           → Compare approaches
           
Thought 5: "Selected: Option [N] because [reasoning]"
           → Commit to approach
           
Thought 6: "SHADOW TRACE: With this fix, Ghost trace becomes..."
           → Pre-verify fix mentally
           
Thought 7: "SHADOW TRACE: With this fix, Demon trace becomes..."
           → Verify Demon now matches Ghost
           
Thought 8: "Blast radius: This fix could affect [list]"
           → Impact analysis
           
Thought 9: [If blast radius concerning] "Mitigation: [approach]"

Parameters:
- totalThoughts: 8-12
- needsMoreThoughts: true (fixes have ripple effects)
- Allow revisions: YES

SEQUENTIAL THINKING PARAMETER REFERENCE

Phase	totalThoughts	needsMoreThoughts	Allow Revisions	Allow Branches
MANIFESTATION	5-8	true	yes	no
DIVINATION	8-12	true	yes	yes
SUMMONING (Ghost)	10-25	true	yes	no
SUMMONING (Demon)	10-25	true	yes	no
INQUISITION	10-20	true	yes	yes
TRIANGULATION	5-10	true	yes	yes
EXORCISM	8-12	true	yes	no
CONSECRATION	4-6	false	yes	no
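
For tooling that configures a phase programmatically, the table can be held as a lookup. The values below are copied from the table. The tuple layout, the `starting_budget` helper, the `upperBound` and `allowBranches` keys, and the choice to start at the low end of each range are assumptions of this sketch.

```python
# phase: (min thoughts, max thoughts, needsMoreThoughts, branches allowed)
PHASE_PARAMS = {
    "MANIFESTATION": (5, 8, True, False),
    "DIVINATION": (8, 12, True, True),
    "SUMMONING (Ghost)": (10, 25, True, False),
    "SUMMONING (Demon)": (10, 25, True, False),
    "INQUISITION": (10, 20, True, True),
    "TRIANGULATION": (5, 10, True, True),
    "EXORCISM": (8, 12, True, False),
    "CONSECRATION": (4, 6, False, False),
}


def starting_budget(phase: str) -> dict:
    """Start at the low end of the range; needsMoreThoughts lets it grow."""
    low, high, needs_more, branches = PHASE_PARAMS[phase]
    return {
        "totalThoughts": low,
        "upperBound": high,         # hypothetical key, not a tool parameter
        "needsMoreThoughts": needs_more,
        "allowBranches": branches,  # hypothetical key, not a tool parameter
    }
```

Revisions are allowed in every phase, so the lookup does not store them.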

WHEN TO USE BRANCHING

Use branch_from_thought and branchId when:

DIVINATION: Exploring alternative hypothesis families

Thought 5: "Branching to explore timing-related hypotheses"
branch_from_thought: 3
branchId: "timing-hypotheses"

INQUISITION: Tracing assumption dependency cascades

Thought 12: "If A3 is false, what other assumptions fall?"
branch_from_thought: 11
branchId: "cascade-from-A3"

TRIANGULATION: Resolving conflicting evidence

Thought 6: "Exploring alternative interpretation of trace..."
branch_from_thought: 4
branchId: "alt-interpretation"
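
To see how branch metadata organizes a thought log, the sketch below groups thought numbers by branch. Field names follow this document's spelling and may differ in the tool you actually call; the `n` and `text` keys, the sample thoughts, and the `main` label are hypothetical.

```python
from collections import defaultdict

thoughts = [
    {"n": 3, "text": "What if the cause is in a different layer?"},
    {"n": 4, "text": "What is the most unlikely but possible cause?"},
    {"n": 5, "text": "Branching to explore timing-related hypotheses",
     "branch_from_thought": 3, "branchId": "timing-hypotheses"},
    {"n": 6, "text": "Does lock ordering differ between handlers?",
     "branch_from_thought": 3, "branchId": "timing-hypotheses"},
]


def by_branch(records: list[dict]) -> dict[str, list[int]]:
    """Group thought numbers by branch; unbranched thoughts go under 'main'."""
    groups: dict[str, list[int]] = defaultdict(list)
    for record in records:
        groups[record.get("branchId", "main")].append(record["n"])
    return dict(groups)


grouped = by_branch(thoughts)
```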

WHEN TO USE REVISION

Use isRevision: true and revises_thought when:

Trace correction: Earlier step was wrong

Thought 15: "REVISION: Step 8 was incorrect. Variable X was actually..."
isRevision: true
revises_thought: 8

Hypothesis refinement: Initial hypothesis was too broad

Thought 9: "REVISION: H2 should be split into H2a and H2b..."
isRevision: true
revises_thought: 4

Possession check failure: Mental model was corrupted

Thought 20: "REVISION: I was assuming, not reading. Re-tracing from line 42..."
isRevision: true
revises_thought: 12
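
When reading back a thought log, it helps to know which earlier thoughts have been superseded. The sketch below uses the three revision pairs from the examples above; the record layout and the `superseded` helper are hypothetical.

```python
def superseded(thoughts: list[dict]) -> set[int]:
    """Return the numbers of thoughts replaced by a later revision."""
    return {t["revises_thought"] for t in thoughts if t.get("isRevision")}


# Thought numbers and targets taken from the three examples above.
log = [
    {"n": 9, "isRevision": True, "revises_thought": 4},
    {"n": 15, "isRevision": True, "revises_thought": 8},
    {"n": 16},  # an ordinary thought, not a revision
    {"n": 20, "isRevision": True, "revises_thought": 12},
]
stale = superseded(log)
```

Any conclusion that was built on a thought in `stale` should be re-checked before it feeds into TRIANGULATION.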

DO NOT USE SEQUENTIAL THINKING FOR

- Simple, single-step operations (e.g., "add a log statement")
- Code implementation during EXORCISM (use normal coding)
- Test execution (run the tests, don't think about running them)
- Final documentation in CONSECRATION (just write it)

ANTI-PATTERNS

❌ Skipping ST because "this bug is simple" → Simple bugs often aren't. ST costs little, catches much.

❌ Using ST without revisions enabled → Debugging IS revision. Disable at your peril.

❌ Copying Ghost trace into Demon trace → Defeats the entire purpose. Trace independently.

❌ Setting totalThoughts too low → Use needsMoreThoughts: true and let it expand.

❌ Not using branches for hypothesis exploration → Linear thinking misses the bug hiding in branch B.

PART V: THE UNIFIED COGNITIVE HYPERCLUSTER

PHANTOM is the master orchestrator that dispatches to your full cognitive arsenal:

┌─────────────────────────────────────────────────────────────────────────────┐
│                        THE COGNITIVE HYPERCLUSTER                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│                         ┌─────────────────────┐                             │
│                         │  PHANTOM PROTOCOL   │                             │
│                         │  (Master Dispatch)  │                             │
│                         └──────────┬──────────┘                             │
│                                    │                                        │
│            ┌───────────────────────┼───────────────────────┐                │
│            │                       │                       │                │
│            ▼                       ▼                       ▼                │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐           │
│  │ AUTONOMOUS      │   │ DARWIN-GÖDEL    │   │ CODING          │           │
│  │ COGNITIVE       │   │ MACHINE         │   │ PLAYBOOK        │           │
│  │ ENGINE          │   │                 │   │                 │           │
│  │                 │   │ Evolve & Verify │   │ KISS/YAGNI/     │           │
│  │ Sequential/     │   │ Solutions       │   │ SOLID           │           │
│  │ Parallel/Audit  │   │                 │   │ Constraints     │           │
│  └────────┬────────┘   └────────┬────────┘   └────────┬────────┘           │
│           │                     │                     │                    │
│           └─────────────────────┴─────────────────────┘                    │
│                                 │                                          │
│                                 ▼                                          │
│                    ┌─────────────────────────┐                             │
│                    │  SEQUENTIAL THINKING    │                             │
│                    │  (Execution Backbone)   │                             │
│                    └─────────────────────────┘                             │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

SKILL DISPATCH MATRIX

PHANTOM orchestrates which skill handles which cognitive task:

PHANTOM Phase	Primary Skill	Secondary Skill	Constraints From
MANIFESTATION	ACE Sequential	—	Playbook PARSE
DIVINATION	ACE Parallel	—	—
SUMMONING	ACE Sequential	—	Playbook EXPLORE
INQUISITION	ACE Audit	—	—
TRIANGULATION	ACE Parallel	—	—
EXORCISM	Darwin-Gödel	ACE Parallel	Playbook KISS/YAGNI
CONSECRATION	ACE Audit	Playbook REFLECT	Playbook SOLID
GENERATION	Darwin-Gödel	ACE Sequential	Playbook Full Loop
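
The matrix reads naturally as a lookup from phase to skills. The rows below are copied from the table, with `None` standing for an empty cell. The `Dispatch` tuple and the `skills_for` helper are illustrative names, not part of any skill's interface.

```python
from typing import NamedTuple, Optional


class Dispatch(NamedTuple):
    primary: str
    secondary: Optional[str]
    constraints: Optional[str]


DISPATCH = {
    "MANIFESTATION": Dispatch("ACE Sequential", None, "Playbook PARSE"),
    "DIVINATION": Dispatch("ACE Parallel", None, None),
    "SUMMONING": Dispatch("ACE Sequential", None, "Playbook EXPLORE"),
    "INQUISITION": Dispatch("ACE Audit", None, None),
    "TRIANGULATION": Dispatch("ACE Parallel", None, None),
    "EXORCISM": Dispatch("Darwin-Gödel", "ACE Parallel", "Playbook KISS/YAGNI"),
    "CONSECRATION": Dispatch("ACE Audit", "Playbook REFLECT", "Playbook SOLID"),
    "GENERATION": Dispatch("Darwin-Gödel", "ACE Sequential", "Playbook Full Loop"),
}


def skills_for(phase: str) -> list[str]:
    """Primary skill first, then the secondary skill if one is listed."""
    row = DISPATCH[phase.upper()]
    return [s for s in (row.primary, row.secondary) if s is not None]
```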

AUTONOMOUS COGNITIVE ENGINE INTEGRATION

The ACE provides three reasoning modes that map directly to PHANTOM phases:

ACE Sequential Mode → PHANTOM Tracing Phases

Use Sequential Mode for step-by-step cognitive operations:

PHANTOM PHASE: SUMMONING (Ghost Trace)
ACE MODE: SEQUENTIAL

### RESTATED_GOAL
Trace expected execution path for function X with input Y

### REASONING_MODE
SEQUENTIAL

### REASONING_STEPS
1. Entry: function X called with parameters [values]
2. Line 15: Variable `data` assigned value [expected]
3. Line 18: Conditional evaluates to [true/false] because [reason]
4. Line 20: Loop iteration 1, counter = 0, accumulator = []
5. Line 20: Loop iteration 2, counter = 1, accumulator = [item1]
[Continue until expected exit]

### CANDIDATE_SOLUTIONS
CANDIDATE_1: Expected output is [value] | SCORE: 9 | STRENGTHS: Matches spec | WEAKNESSES: None identified

### FINAL_OUTPUT
Ghost trace complete. Expected final state: [state description]

ACE Parallel Mode → PHANTOM Hypothesis Phases

Use Parallel Mode for exploring competing possibilities:

PHANTOM PHASE: DIVINATION
ACE MODE: PARALLEL

### RESTATED_GOAL
Generate competing hypotheses for observed symptom [description]

### REASONING_MODE
PARALLEL

### REASONING_STEPS
BRANCH_A (State Corruption): Variable X modified between read and use
  - Would explain: [symptoms it explains]
  - Wouldn't explain: [gaps]
  - Test: Add logging at lines 42, 67

BRANCH_B (Timing Issue): Race condition in async handler
  - Would explain: [symptoms it explains]
  - Wouldn't explain: [gaps]
  - Test: Add mutex, observe if symptom disappears

BRANCH_C (Input Violation): Unexpected null in payload
  - Would explain: [symptoms it explains]
  - Wouldn't explain: [gaps]
  - Test: Add schema validation at entry point

BRANCH_D (Logic Error): Off-by-one in boundary check
  - Would explain: [symptoms it explains]
  - Wouldn't explain: [gaps]
  - Test: Unit test with boundary values

CONVERGENCE: Branch [X] most likely because [evidence]. Begin investigation there.

### CANDIDATE_SOLUTIONS
CANDIDATE_1: H1 - State Corruption | SCORE: 7 | STRENGTHS: Explains intermittent nature | WEAKNESSES: No mutation observed in logs
CANDIDATE_2: H2 - Timing Issue | SCORE: 8 | STRENGTHS: Explains async context | WEAKNESSES: Hard to reproduce
CANDIDATE_3: H3 - Input Violation | SCORE: 6 | STRENGTHS: Easy to test | WEAKNESSES: Validation exists upstream
CANDIDATE_4: H4 - Logic Error | SCORE: 5 | STRENGTHS: Simple fix | WEAKNESSES: Code looks correct

### BEST_SOLUTION
H2 (Timing Issue) selected for initial investigation based on async context of failure

### FINAL_OUTPUT
Primary hypothesis: Race condition in async handler. Test by adding synchronization.

ACE Audit Mode → PHANTOM Review Phases

Use Audit Mode for reviewing and critiquing:

PHANTOM PHASE: INQUISITION
ACE MODE: AUDIT

### RESTATED_GOAL
Audit all assumptions in code block and test inversions

### REASONING_MODE
AUDIT

### REASONING_STEPS
FINDING_1: Assumption A1 "input is non-null" - UNVALIDATED, no null check exists
FINDING_2: Assumption A2 "array is sorted" - VALIDATED by sort() call on line 12
FINDING_3: Assumption A3 "connection is open" - UNVALIDATED, no isConnected check
FINDING_4: Assumption A4 "user has permission" - PARTIALLY VALIDATED, check exists but incomplete

RISK_1: A1 violation would cause NullPointerException at line 34 (HIGH)
RISK_2: A3 violation would cause silent failure, corrupt return value (CRITICAL)
RISK_3: A4 violation would leak data to unauthorized user (CRITICAL)

STRENGTH_1: Error handling exists for network failures
STRENGTH_2: Input validation covers type checking

RECOMMENDATION_1: Add null guard at function entry for A1
RECOMMENDATION_2: Add connection state check before query for A3
RECOMMENDATION_3: Complete permission check to cover all access paths for A4

### CANDIDATE_SOLUTIONS
CANDIDATE_1: Fix A3 only (highest risk) | SCORE: 6 | STRENGTHS: Fast | WEAKNESSES: Other risks remain
CANDIDATE_2: Fix A1, A3, A4 | SCORE: 9 | STRENGTHS: Comprehensive | WEAKNESSES: More code changes
CANDIDATE_3: Fix A3, add monitoring for A1/A4 | SCORE: 7 | STRENGTHS: Balanced | WEAKNESSES: Deferred risk

### BEST_SOLUTION
CANDIDATE_2 - Fix all three unvalidated/partial assumptions

### FINAL_OUTPUT
Three assumptions require immediate hardening:
1. Add null check at line 30
2. Add connection validation at line 45
3. Extend permission check at line 52 to cover admin paths

DARWIN-GÖDEL MACHINE INTEGRATION

Darwin-Gödel handles solution evolution after PHANTOM identifies the root cause.

EXORCISM Phase → Darwin-Gödel GENESIS + EVOLVE

PHANTOM PHASE: EXORCISM
DISPATCH TO: Darwin-Gödel Machine

HANDOFF DATA:
- Root cause: [from TRIANGULATION]
- Constraints: [from code context]
- Success criteria: [symptom elimination + no regression]

DARWIN-GÖDEL EXECUTION:
┌─────────────────────────────────────────────────────────────────────────────┐
│  PHASE 2: GENESIS (Generate Fix Candidates)                                 │
│  ├─ FIX_1: Minimal surgical fix - change only the bug line                  │
│  ├─ FIX_2: Defensive fix - add validation + fix                             │
│  ├─ FIX_3: Refactor fix - restructure to eliminate bug class                │
│  ├─ FIX_4: Comprehensive fix - fix + add tests + add docs                   │
│  └─ FIX_5: Hybrid - combine best aspects of FIX_1 and FIX_2                 │
├─────────────────────────────────────────────────────────────────────────────┤
│  PHASE 3: EVALUATE (Fitness Assessment)                                     │
│  FITNESS FUNCTION for fixes:                                                │
│  ├─ CORRECTNESS (0.40): Does it eliminate root cause?                       │
│  ├─ SAFETY (0.25): Does it avoid introducing new bugs?                      │
│  ├─ SIMPLICITY (0.20): Is it the minimal change? (KISS)                     │
│  ├─ NECESSITY (0.10): Is every line needed? (YAGNI)                         │
│  └─ MAINTAINABILITY (0.05): Can future devs understand it? (SOLID)          │
├─────────────────────────────────────────────────────────────────────────────┤
│  PHASE 4: EVOLVE (Mutation Operators for Fixes)                             │
│  ├─ SIMPLIFY: Remove unnecessary defensive code                             │
│  ├─ HARDEN: Add missing edge case handling                                  │
│  ├─ EXTRACT: Pull fix into reusable helper                                  │
│  ├─ INLINE: Collapse unnecessary abstraction                                │
│  └─ CROSSOVER: Merge best parts of top two fixes                            │
├─────────────────────────────────────────────────────────────────────────────┤
│  PHASE 5: VERIFY (Proof Gate)                                               │
│  For each fix candidate, PROVE:                                             │
│  ├─ P1: Fix addresses root cause (trace through with fix applied)           │
│  ├─ P2: Fix doesn't break existing functionality (regression check)         │
│  ├─ P3: Fix handles edge cases identified in INQUISITION                    │
│  └─ REJECT any fix that fails any proof                                     │
└─────────────────────────────────────────────────────────────────────────────┘

RETURN TO PHANTOM:
- Winning fix with fitness score
- Verification proofs
- Rejected alternatives with rejection reasons
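
The fitness function and proof gate above combine as a weighted sum followed by a hard filter. In this sketch the weights are copied from PHASE 3, while the candidate scores, the proof results, and the helper names are hypothetical. Applying the gate before picking the top score is one possible ordering, not necessarily the one Darwin-Gödel uses.

```python
from typing import Optional

FIX_WEIGHTS = {
    "correctness": 0.40,
    "safety": 0.25,
    "simplicity": 0.20,
    "necessity": 0.10,
    "maintainability": 0.05,
}


def fitness(scores: dict[str, float]) -> float:
    """Weighted sum of per-criterion scores, each expected in 0..1."""
    return sum(FIX_WEIGHTS[name] * scores[name] for name in FIX_WEIGHTS)


def select_fix(candidates: dict[str, dict]) -> Optional[str]:
    """Reject any fix with a failed proof, then pick the fittest survivor."""
    verified = {
        name: c for name, c in candidates.items()
        if all(c["proofs"].values())  # P1, P2 and P3 must all hold
    }
    if not verified:
        return None
    return max(verified, key=lambda name: fitness(verified[name]["scores"]))


# Hypothetical candidates: FIX_1 scores highest but fails proof P3.
candidates = {
    "FIX_1": {"scores": {"correctness": 1.0, "safety": 0.6, "simplicity": 1.0,
                         "necessity": 1.0, "maintainability": 0.8},
              "proofs": {"P1": True, "P2": True, "P3": False}},
    "FIX_2": {"scores": {"correctness": 1.0, "safety": 0.9, "simplicity": 0.7,
                         "necessity": 0.8, "maintainability": 0.8},
              "proofs": {"P1": True, "P2": True, "P3": True}},
    "FIX_3": {"scores": {"correctness": 1.0, "safety": 0.7, "simplicity": 0.3,
                         "necessity": 0.5, "maintainability": 0.9},
              "proofs": {"P1": True, "P2": True, "P3": True}},
}
winner = select_fix(candidates)
```

The example is built so that the highest-scoring candidate fails a proof, which shows why the gate is a filter rather than one more weighted term.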

GENERATION Path → Darwin-Gödel Full Loop

When writing new code (not debugging), use the full Darwin-Gödel loop:

PHANTOM PHASE: GENERATION (G1-G5)
DISPATCH TO: Darwin-Gödel Machine (Full Loop)

AUGMENTED GENESIS:
For each solution candidate, pre-attach:
- Assumption Contract (from G1)
- Ghost Trace (from G2)
- Inversion Tests (from G4)

AUGMENTED FITNESS:
FITNESS(solution) = weighted_sum(
    CORRECTNESS:      Does it produce correct output?              (0.30)
    ROBUSTNESS:       Does it handle assumption inversions?        (0.25)  ← PHANTOM enhanced
    EFFICIENCY:       Time/space complexity acceptable?            (0.15)
    ASSUMPTION_CLARITY: Are assumptions explicit and enforced?     (0.15)  ← PHANTOM enhanced
    READABILITY:      KISS compliant?                              (0.10)
    EXTENSIBILITY:    SOLID compliant?                             (0.05)
)

AUGMENTED VERIFICATION:
For each candidate:
├─ Run PHANTOM Demon trace → must match Ghost trace
├─ Run PHANTOM Inversion tests → must handle gracefully
├─ Run Playbook Audit Checklist → must pass
└─ Only verified candidates survive to next generation

CODING PLAYBOOK INTEGRATION

Playbook provides constraints and quality gates throughout PHANTOM execution.

Playbook as Fitness Constraints

Every PHANTOM output must pass Playbook validation:

PLAYBOOK CONSTRAINT GATES:
┌─────────────────────────────────────────────────────────────────────────────┐
│  GATE 1: KISS CHECK (Applied to all code outputs)                           │
│  ├─ Can a mid-level dev understand this in 30 seconds?                      │
│  ├─ Are there unnecessary abstractions?                                     │
│  ├─ Is there speculative generality?                                        │
│  └─ FAIL → Simplify before proceeding                                       │
├─────────────────────────────────────────────────────────────────────────────┤
│  GATE 2: YAGNI CHECK (Applied to all code outputs)                          │
│  ├─ Does every line serve the immediate requirement?                        │
│  ├─ Are there features "for later"?                                         │
│  ├─ Are there unused parameters or config options?                          │
│  └─ FAIL → Remove speculative code                                          │
├─────────────────────────────────────────────────────────────────────────────┤
│  GATE 3: SOLID CHECK (Applied to structural changes)                        │
│  ├─ Single Responsibility: Does each unit do one thing?                     │
│  ├─ Liskov Substitution: Do subtypes honor contracts?                       │
│  ├─ Dependency Inversion: Are boundaries properly abstracted?               │
│  └─ FAIL → Refactor structure                                               │
├─────────────────────────────────────────────────────────────────────────────┤
│  GATE 4: AUDIT CHECKLIST (Applied before delivery)                          │
│  □ Edge cases handled?                                                      │
│  □ Error paths covered?                                                     │
│  □ Security implications considered?                                        │
│  □ Performance acceptable?                                                  │
│  □ Tests adequate?                                                          │
│  └─ FAIL → Address gaps before delivery                                     │
└─────────────────────────────────────────────────────────────────────────────┘
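
The four gates can be run as an ordered checklist that stops at the first failure and returns the action from that gate's FAIL row. This is a sketch under stated assumptions: the boolean fields on the review record are hypothetical answers a reviewer would supply, and each gate is reduced to a single representative question.

```python
from typing import Callable, Optional

Gate = tuple[str, Callable[[dict], bool], str]

GATES: list[Gate] = [
    ("KISS", lambda r: not r["unnecessary_abstractions"], "Simplify before proceeding"),
    ("YAGNI", lambda r: not r["speculative_features"], "Remove speculative code"),
    ("SOLID", lambda r: r["single_responsibility"], "Refactor structure"),
    ("AUDIT", lambda r: all(r["audit_checklist"].values()), "Address gaps before delivery"),
]


def first_failure(review: dict) -> Optional[tuple[str, str]]:
    """Run gates in order; return (gate, action) for the first failure."""
    for name, passes, action in GATES:
        if not passes(review):
            return name, action
    return None


# Hypothetical review of a fix that carries a "for later" feature.
review = {
    "unnecessary_abstractions": False,
    "speculative_features": True,
    "single_responsibility": True,
    "audit_checklist": {"edge_cases": True, "error_paths": True},
}
result = first_failure(review)
```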

Playbook Loop Integration

PHANTOM phases enhance Playbook's mandatory loop:

ENHANCED PLAYBOOK LOOP:
┌─────────────────────────────────────────────────────────────────────────────┐
│  1. PARSE (Enhanced)                                                        │
│     ├─ Standard: What's being asked? Inputs? Outputs? Constraints?          │
│     └─ PHANTOM: Run MANIFESTATION if debugging (document symptoms)          │
├─────────────────────────────────────────────────────────────────────────────┤
│  2. EXPLORE (Enhanced)                                                      │
│     ├─ Standard: Consider 2+ approaches, document tradeoffs                 │
│     ├─ ACE: Use Parallel Mode for approach comparison                       │
│     └─ PHANTOM: Run DIVINATION if debugging (generate hypotheses)           │
├─────────────────────────────────────────────────────────────────────────────┤
│  3. PLAN (Enhanced)                                                         │
│     ├─ Standard: Ordered implementation steps                               │
│     ├─ ACE: Use Sequential Mode for step planning                           │
│     ├─ PHANTOM: Run SUMMONING (Ghost trace before coding)                   │
│     └─ Darwin-Gödel: If complex, run GENESIS to generate plan variants      │
├─────────────────────────────────────────────────────────────────────────────┤
│  4. BUILD (Enhanced)                                                        │
│     ├─ Standard: Implement in focused units                                 │
│     ├─ PHANTOM: Apply G2 Ghost-First Development                            │
│     ├─ Darwin-Gödel: If complex, evolve implementation candidates           │
│     └─ Playbook: KISS/YAGNI/SOLID constraints active throughout             │
├─────────────────────────────────────────────────────────────────────────────┤
│  5. REFLECT (Enhanced)                                                      │
│     ├─ Standard: Run Audit Checklist                                        │
│     ├─ ACE: Use Audit Mode for code review                                  │
│     ├─ PHANTOM: Run INQUISITION (assumption audit)                          │
│     ├─ PHANTOM: Run Demon trace (verify matches Ghost)                      │
│     └─ Darwin-Gödel: Score final solution, log lessons                      │
├─────────────────────────────────────────────────────────────────────────────┤
│  6. DELIVER (Enhanced)                                                      │
│     ├─ Standard: Code first, concise explanation second                     │
│     ├─ PHANTOM: Run CONSECRATION (tests, hardening, docs)                   │
│     └─ Darwin-Gödel: Run META-IMPROVE (extract lessons)                     │
└─────────────────────────────────────────────────────────────────────────────┘

UNIFIED EXECUTION PROTOCOL

When PHANTOM activates, it orchestrates all skills in this sequence:

UNIFIED EXECUTION FLOW:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │ STEP 0: TASK CLASSIFICATION                                          │   │
│  │ ├─ Is this DEBUGGING? → Full PHANTOM Debugging Path                  │   │
│  │ ├─ Is this CODE REVIEW? → PHANTOM Review Path                        │   │
│  │ ├─ Is this CODE GENERATION? → PHANTOM Generation Path + Darwin-Gödel │   │
│  │ └─ Is this REFACTORING? → Hybrid: Review → Generation                │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │ STEP 1: ACTIVATE PLAYBOOK CONSTRAINTS                                │   │
│  │ Load KISS/YAGNI/SOLID gates as active constraints                    │   │
│  │ All subsequent outputs must pass these gates                         │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │ STEP 2: INVOKE SEQUENTIAL THINKING                                   │   │
│  │ ST is the execution backbone for all phases                          │   │
│  │ Configure parameters per phase (see ST Integration section)          │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │ STEP 3: EXECUTE PHANTOM PHASES                                       │   │
│  │ Each phase dispatches to appropriate ACE mode:                       │   │
│  │ ├─ MANIFESTATION → ACE Sequential                                    │   │
│  │ ├─ DIVINATION → ACE Parallel                                         │   │
│  │ ├─ SUMMONING → ACE Sequential (x2: Ghost + Demon)                    │   │
│  │ ├─ INQUISITION → ACE Audit                                           │   │
│  │ └─ TRIANGULATION → ACE Parallel                                      │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │ STEP 4: INVOKE DARWIN-GÖDEL FOR SOLUTION EVOLUTION                   │   │
│  │ ├─ GENESIS: Generate fix/solution candidates                         │   │
│  │ ├─ EVALUATE: Score against PHANTOM-augmented fitness function        │   │
│  │ ├─ EVOLVE: Mutate and crossover candidates                           │   │
│  │ ├─ VERIFY: Prove improvements (use PHANTOM traces as proofs)         │   │
│  │ └─ CONVERGE: Select winning solution                                 │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │ STEP 5: PLAYBOOK GATE CHECK                                          │   │
│  │ Run all solutions through KISS/YAGNI/SOLID/Audit gates               │   │
│  │ Reject or iterate on failures                                        │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │ STEP 6: CONSECRATION + META-IMPROVE                                  │   │
│  │ ├─ PHANTOM CONSECRATION: Tests, hardening, documentation             │   │
│  │ ├─ Darwin-Gödel META-IMPROVE: Extract process lessons                │   │
│  │ └─ Grimoire entry: Log learnings for future                          │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │ STEP 7: DELIVER                                                      │   │
│  │ Code first, explanation second, internal reasoning stays internal    │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

CROSS-SKILL DATA CONTRACTS

Skills communicate through standardized data structures:

PHANTOM → Darwin-Gödel Handoff

PHANTOM_DIAGNOSIS {
    root_cause: string,           // From TRIANGULATION
    symptom_profile: {
        observed: string,
        expected: string,
        delta: string,
        reproduction: string[]
    },
    collapsed_assumptions: [      // From INQUISITION
        {
            assumption: string,
            reality: string,
            evidence: string
        }
    ],
    divergence_point: {           // From SUMMONING
        step: number,
        ghost_value: any,
        demon_value: any,
        location: string          // file:line
    },
    constraints: {                // From Playbook
        kiss_requirements: string[],
        yagni_scope: string,
        solid_concerns: string[]
    }
}
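
For a typed rendering of this handoff, the structure maps onto Python dataclasses as sketched below. Field names are copied from the contract. The class names, the sample values, and the `is_actionable` check are hypothetical additions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SymptomProfile:
    observed: str
    expected: str
    delta: str
    reproduction: list[str] = field(default_factory=list)


@dataclass
class CollapsedAssumption:
    assumption: str
    reality: str
    evidence: str


@dataclass
class DivergencePoint:
    step: int
    ghost_value: Any
    demon_value: Any
    location: str  # "file:line"


@dataclass
class PhantomDiagnosis:
    root_cause: str
    symptom_profile: SymptomProfile
    collapsed_assumptions: list[CollapsedAssumption]
    divergence_point: DivergencePoint
    constraints: dict[str, Any] = field(default_factory=dict)

    def is_actionable(self) -> bool:
        """Hypothetical readiness check before dispatching to EXORCISM."""
        point = self.divergence_point
        return bool(self.root_cause) and point.ghost_value != point.demon_value


# Sample values are invented for illustration.
diagnosis = PhantomDiagnosis(
    root_cause="accumulator is reset inside the loop",
    symptom_profile=SymptomProfile("total is 7", "total is 12", "off by 5", ["run with 3 items"]),
    collapsed_assumptions=[
        CollapsedAssumption("total persists across iterations", "it is re-initialized", "trace step 3"),
    ],
    divergence_point=DivergencePoint(step=3, ghost_value=12, demon_value=7, location="cart.py:20"),
)
```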

Darwin-Gödel → PHANTOM Return

DARWIN_GODEL_SOLUTION {
    winning_fix: {
        code: string,
        approach: string,
        fitness_score: number
    },
    verification_proofs: [
        {
            claim: string,
            proof_type: string,    // test | trace | logical
            evidence: string
        }
    ],
    rejected_alternatives: [
        {
            approach: string,
            rejection_reason: string
        }
    ],
    lessons_extracted: string[]
}

ACE Output → All Skills

ACE_REASONING_OUTPUT {
    mode: "SEQUENTIAL" | "PARALLEL" | "AUDIT",
    steps: Step[] | Branch[] | Finding[],
    candidates: [
        {
            description: string,
            score: number,
            strengths: string[],
            weaknesses: string[]
        }
    ],
    best_solution: string,
    final_output: any
}

HYPERCLUSTER ACTIVATION TRIGGERS

The unified system activates based on these patterns:

Trigger Pattern	Activation
"debug", "fix", "why isn't this working"	PHANTOM Debug Path → Darwin-Gödel EXORCISM
"review", "check", "audit"	PHANTOM Review Path → ACE Audit Mode
"write", "create", "implement"	PHANTOM Generation Path → Darwin-Gödel Full Loop
"refactor", "improve", "optimize"	Review Path → Generation Path (hybrid)
"design", "architect", "plan"	ACE Parallel → Darwin-Gödel GENESIS → Playbook SOLID
Complex multi-step task	Full Hypercluster activation
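
A rough illustration of how these trigger patterns could be matched against a request is shown below. The keyword groups and activation strings come from the table. Whole-word matching, first-match-wins ordering, and using full activation as the fallback when nothing matches are assumptions of this sketch.

```python
import re

TRIGGERS = [
    (("debug", "fix", "why isn't"), "PHANTOM Debug Path → Darwin-Gödel EXORCISM"),
    (("review", "check", "audit"), "PHANTOM Review Path → ACE Audit Mode"),
    (("write", "create", "implement"), "PHANTOM Generation Path → Darwin-Gödel Full Loop"),
    (("refactor", "improve", "optimize"), "Review Path → Generation Path (hybrid)"),
    (("design", "architect", "plan"), "ACE Parallel → Darwin-Gödel GENESIS → Playbook SOLID"),
]
FALLBACK = "Full Hypercluster activation"


def classify(request: str) -> str:
    """Return the activation for the first trigger group that matches."""
    text = request.lower()
    words = set(re.findall(r"[a-z']+", text))
    for patterns, activation in TRIGGERS:
        for pattern in patterns:
            # Phrases match as substrings; single keywords match whole words.
            hit = pattern in text if " " in pattern else pattern in words
            if hit:
                return activation
    return FALLBACK
```

Whole-word matching keeps "prefix" from triggering "fix" and "explanation" from triggering "plan".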

COGNITIVE MODE SELECTION MATRIX

When multiple skills could apply, score the problem's complexity and activate skills according to the total:

PROBLEM COMPLEXITY ASSESSMENT:
┌───────────────────────────────────────────────────────────────────────┐
│ Complexity Score = sum of:                                            │
│ ├─ Components involved: 1 (single) / 3 (multiple) / 5 (system-wide)   │
│ ├─ Uncertainty level: 1 (clear) / 3 (some unknowns) / 5 (high)        │
│ ├─ Failure cost: 1 (low) / 3 (moderate) / 5 (critical)                │
│ └─ Solution diversity: 1 (obvious) / 3 (few options) / 5 (many)       │
└───────────────────────────────────────────────────────────────────────┘

ACTIVATION BY SCORE:
├─ Score 4-6:   Playbook Loop only (simple task)
├─ Score 7-10:  Playbook + ACE (moderate complexity)
├─ Score 11-14: Playbook + ACE + PHANTOM (complex debugging/review)
├─ Score 15-17: Full Hypercluster (complex generation/design)
└─ Score 18-20: Full Hypercluster + Extended ST (critical/novel problem)
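
The scoring scheme above is plain arithmetic: four ratings of 1, 3, or 5 are summed, and the total (4 to 20) selects a tier. A minimal sketch, where the function names are hypothetical and the tier labels are copied from the list above:

```python
VALID_RATINGS = (1, 3, 5)

# (lowest score, highest score, skills activated), as listed above.
TIERS = [
    (4, 6, "Playbook Loop only"),
    (7, 10, "Playbook + ACE"),
    (11, 14, "Playbook + ACE + PHANTOM"),
    (15, 17, "Full Hypercluster"),
    (18, 20, "Full Hypercluster + Extended ST"),
]


def complexity_score(components: int, uncertainty: int,
                     failure_cost: int, solution_diversity: int) -> int:
    """Sum the four ratings; each must be 1, 3, or 5."""
    ratings = (components, uncertainty, failure_cost, solution_diversity)
    for rating in ratings:
        if rating not in VALID_RATINGS:
            raise ValueError(f"rating must be one of {VALID_RATINGS}, got {rating}")
    return sum(ratings)


def activation_for(score: int) -> str:
    """Map a complexity score to the skills it activates."""
    for low, high, skills in TIERS:
        if low <= score <= high:
            return skills
    raise ValueError(f"score must be between 4 and 20, got {score}")
```

For example, a multi-component bug (3) with some unknowns (3), critical failure cost (5), and an obvious fix (1) scores 12 and activates Playbook + ACE + PHANTOM.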

FAILURE RECOVERY PROTOCOLS

When a skill fails or produces unsatisfactory results:

RECOVERY PROTOCOL:
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHANTOM FAILURE (can't find root cause):                                    │
│ ├─ Expand DIVINATION: Add more hypothesis categories                        │
│ ├─ Deepen SUMMONING: Trace more code paths                                  │
│ ├─ Widen INQUISITION: Question "obvious" assumptions                        │
│ └─ Escalate: Ask user for more reproduction details                         │
├─────────────────────────────────────────────────────────────────────────────┤
│ DARWIN-GÖDEL FAILURE (no fix passes verification):                          │
│ ├─ Relax fitness function: Accept lower-scoring solutions                   │
│ ├─ Expand GENESIS: Generate more diverse fix approaches                     │
│ ├─ Re-run PHANTOM: Root cause may be misidentified                          │
│ └─ Escalate: Present best partial fix with caveats                          │
├─────────────────────────────────────────────────────────────────────────────┤
│ ACE FAILURE (mode produces poor reasoning):                                 │
│ ├─ Switch modes: Try Parallel if Sequential stuck, or vice versa            │
│ ├─ Increase candidates: Generate more options                               │
│ ├─ Add constraints: Narrow solution space                                   │
│ └─ Invoke ST: Use Sequential Thinking for step-by-step unsticking           │
├─────────────────────────────────────────────────────────────────────────────┤
│ PLAYBOOK GATE FAILURE (solution fails KISS/YAGNI/SOLID):                    │
│ ├─ Identify specific violation                                              │
│ ├─ Feed violation back to Darwin-Gödel as constraint                        │
│ ├─ Re-evolve with tighter fitness function                                  │
│ └─ If persistent: Simplify problem scope, solve incrementally               │
└─────────────────────────────────────────────────────────────────────────────┘

UNIFIED OUTPUT FORMAT

When running full Hypercluster, output in this structure:

## HYPERCLUSTER EXECUTION LOG

### TASK CLASSIFICATION
- Type: [DEBUG | REVIEW | GENERATE | REFACTOR]
- Complexity Score: [4-20]
- Skills Activated: [list]

### PHASE EXECUTION

#### PHANTOM: [phase name]
[Phase output in PHANTOM format]
ACE Mode Used: [SEQUENTIAL | PARALLEL | AUDIT]

#### DARWIN-GÖDEL: [phase name]
[Phase output in Darwin-Gödel format]
Generation: [N]
Population: [candidates with fitness scores]

#### PLAYBOOK GATE: [gate name]
[Pass/Fail with details]

### SOLUTION

```[language]
[Final code]
```

### VERIFICATION

- PHANTOM Traces: [Ghost/Demon convergence status]
- Darwin-Gödel Proofs: [list of proofs]
- Playbook Compliance: [KISS ✓ | YAGNI ✓ | SOLID ✓]

### LESSONS CAPTURED

[Grimoire entries]


DIAGNOSTIC OUTPUT TEMPLATES

Bug Investigation Report

## PHANTOM DIAGNOSTIC REPORT

### MANIFESTATION
- **Observed:** [exact symptom]
- **Expected:** [correct behavior]
- **Delta:** [specific difference]
- **Reproduction:** [minimal steps]

### DIVINATION (Hypotheses Considered)
| # | Hypothesis | Probability | Status |
|---|------------|-------------|--------|
| H1 | [description] | 7/10 | REFUTED |
| H2 | [description] | 4/10 | CONFIRMED ← |
| H3 | [description] | 2/10 | REFUTED |

### SUMMONING (Spectral Divergence)
- **First divergence at:** [step/line]
- **Ghost expected:** [value/behavior]
- **Demon produced:** [value/behavior]

### INQUISITION (Collapsed Assumptions)
- **False assumption:** [what you thought was true]
- **Reality:** [what's actually true]
- **Evidence:** [how discovered]

### ROOT CAUSE
[Formal statement following template]

### EXORCISM
- **Fix applied:** [description]
- **Verified by:** [tests/traces]
- **Blast radius:** [what else could be affected]

### CONSECRATION
- **Regression test added:** [yes/no, name]
- **Similar patterns found:** [count, locations]
- **Documentation updated:** [yes/no, where]

QUICK-START HEURISTICS

When time is critical:

1. MANIFESTATION: 30 seconds to document symptom precisely
2. DIVINATION: Generate top 3 hypotheses
3. SUMMONING: Quick Ghost/Demon trace of suspicious area only
4. INQUISITION: Test inversion of #1 hypothesis's key assumption
5. Fix and verify

When bug is critical/subtle:

1. Full MANIFESTATION with reproduction signature
2. DIVINATION with 5+ hypotheses across all categories
3. SUMMONING with complete traces, multiple paths
4. INQUISITION with exhaustive assumption inventory
5. TRIANGULATION before any fix attempt
6. Full CONSECRATION with codebase audit

META-COGNITIVE CHECKPOINTS

Insert these at each phase transition:

PHASE TRANSITION CHECK:
□ Did I actually complete the phase or skip ahead?
□ Am I making new assumptions I haven't logged?
□ Is my mental model of the code still accurate?
□ Have I verified or just assumed?
□ Am I hunting the bug or defending my first guess?

ADVERSARIAL SELF-CHECK (Pre-Delivery)

Before declaring a bug fixed:

"If I were an adversarial QA, how would I break this fix?"
"What's the laziest interpretation of my fix that still passes tests?"
"Did I fix the symptom or the root cause?"
"What would bring this bug back in 6 months?"
"If this fix is wrong, what's the blast radius?"

THE GRIMOIRE (Lessons Captured)

After each debugging session, record:

GRIMOIRE ENTRY:
┌────────────────────────────────────────────────────────────────────────┐
│ Date: [date]                                                           │
│ Bug ID/Description: [reference]                                        │
│ Time to diagnosis: [duration]                                          │
│ Root cause category: [from categories list]                            │
│                                                                        │
│ FALSE ASSUMPTION THAT BIT ME:                                          │
│ [The specific thing I assumed that was wrong]                          │
│                                                                        │
│ HOW I COULD HAVE FOUND IT FASTER:                                      │
│ [Specific diagnostic I should have run earlier]                        │
│                                                                        │
│ PATTERN TO RECOGNIZE:                                                  │
│ [Symptom pattern that should trigger this suspicion in future]         │
│                                                                        │
│ PREVENTION LESSON:                                                     │
│ [What to do differently when writing new code]                         │
└────────────────────────────────────────────────────────────────────────┘

PART VI: ANTI-HALLUCINATION FORTRESS

Hallucinations are the enemy of debugging. A single fabricated detail can send you chasing phantom bugs for hours. This section establishes mandatory verification checkpoints that prevent hallucinated reasoning from contaminating the diagnostic process.

THE HALLUCINATION THREAT MODEL

HALLUCINATION CATEGORIES IN DEBUGGING:
┌─────────────────────────────────────────────────────────────────────────────┐
│ TYPE 1: FABRICATED CODE BEHAVIOR                                            │
│ "This function returns null when X" — but you never verified this           │
│ Cause: Pattern matching from similar code, not THIS code                    │
│ Danger: Entire diagnosis built on false premise                             │
├─────────────────────────────────────────────────────────────────────────────┤
│ TYPE 2: INVENTED VARIABLE VALUES                                            │
│ "At this point, count = 5" — but you derived this, didn't observe it        │
│ Cause: Mental simulation diverged from actual execution                     │
│ Danger: Trace appears valid but is fiction                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│ TYPE 3: ASSUMED API/LIBRARY BEHAVIOR                                        │
│ "Array.sort() returns a new array" — but it mutates in place (JS)           │
│ Cause: Confusion between languages or library versions                      │
│ Danger: Correct logic, wrong primitives                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│ TYPE 4: PHANTOM SYMPTOMS                                                    │
│ "The error message says X" — but you're remembering a different error       │
│ Cause: Conflating current bug with past similar bugs                        │
│ Danger: Solving wrong problem entirely                                      │
├─────────────────────────────────────────────────────────────────────────────┤
│ TYPE 5: CONFABULATED CAUSATION                                              │
│ "A causes B because C" — plausible narrative, zero evidence                 │
│ Cause: Human need for coherent stories                                      │
│ Danger: Confident wrong diagnosis                                           │
├─────────────────────────────────────────────────────────────────────────────┤
│ TYPE 6: FALSE CONFIDENCE                                                    │
│ "I'm certain the bug is X" — but certainty isn't calibrated to evidence     │
│ Cause: Fluency mistaken for accuracy                                        │
│ Danger: Premature convergence, missed root cause                            │
└─────────────────────────────────────────────────────────────────────────────┘

MANDATORY GROUNDING CHECKPOINTS

Every PHANTOM phase must pass grounding verification before proceeding:

GROUNDING CHECKPOINT PROTOCOL:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  CHECKPOINT TYPE: SOURCE GROUNDING                                          │
│  Question: "Where EXACTLY did this information come from?"                  │
│                                                                             │
│  VALID SOURCES (can proceed):                                               │
│  ├─ Direct code reading: "Line 42 shows: if (x > 5)"                        │
│  ├─ Observed output: "Console logged: 'Error: null reference'"              │
│  ├─ Test execution: "Running test_foo() produced: AssertionError"           │
│  ├─ Documentation quote: "MDN states: 'sort() sorts in place'"              │
│  └─ User statement: "User reported: 'Button doesn't respond'"               │
│                                                                             │
│  INVALID SOURCES (must verify or discard):                                  │
│  ├─ "I think..." → STOP. Verify or mark as hypothesis.                      │
│  ├─ "Usually..." → STOP. Check if this case matches "usual."                │
│  ├─ "It probably..." → STOP. Probability claim needs evidence.              │
│  ├─ "This should..." → STOP. "Should" ≠ "does."                             │
│  └─ "Based on my experience..." → STOP. This code may differ.               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
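
The invalid-source list above lends itself to a mechanical first pass. The sketch below flags those phrases in a statement so they can be verified or downgraded to hypotheses. It is a crude filter under stated assumptions: the phrase list is taken from the box, while `find_ungrounded_markers` and its whole-word matching are illustrative choices.

```python
import re
from typing import List, Tuple

# (phrase, required action), taken from the INVALID SOURCES list above.
UNGROUNDED_MARKERS = [
    ("i think", "Verify or mark as hypothesis."),
    ("usually", 'Check if this case matches "usual."'),
    ("probably", "Probability claim needs evidence."),
    ("should", '"Should" is not "does."'),
    ("based on my experience", "This code may differ."),
]


def find_ungrounded_markers(statement: str) -> List[Tuple[str, str]]:
    """Return every (phrase, action) pair whose phrase occurs in the statement."""
    lowered = statement.lower()
    hits = []
    for phrase, action in UNGROUNDED_MARKERS:
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            hits.append((phrase, action))
    return hits
```

A hit is only a prompt to stop and check. A quoted spec that says "the handler should retry" is flagged too, and a statement with no marker can still be ungrounded.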

GROUNDING CHECKPOINT PROTOCOL:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  CHECKPOINT TYPE: CLAIM VERIFICATION                                        │
│  For each factual claim, complete this:                                     │
│                                                                             │
│  CLAIM: [state the claim]                                                   │
│  SOURCE: [exact source - file:line, output, doc URL]                        │
│  VERIFIED: [YES: how / NO: mark as unverified / ASSUMED: flag risk]         │
│  CONFIDENCE: [1-10, with justification]                                     │
│                                                                             │
│  Example:                                                                   │
│  CLAIM: "The loop runs 5 times"                                             │
│  SOURCE: Mental trace of for(i=0; i<5; i++)                                 │
│  VERIFIED: NO - derived, not observed                                       │
│  CONFIDENCE: 8 - simple loop, but edge cases possible                       │
│  ACTION: Add console.log(i) to verify, or mark as assumption                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
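
The claim template above maps directly onto a small record type. A sketch, assuming a claim counts as fact only when VERIFIED is YES; `ClaimRecord`, `Verification`, and the field names are hypothetical:

```python
from dataclasses import dataclass
from enum import Enum


class Verification(Enum):
    YES = "YES"
    NO = "NO"
    ASSUMED = "ASSUMED"


@dataclass(frozen=True)
class ClaimRecord:
    claim: str
    source: str
    verified: Verification
    detail: str          # how it was verified, or why it was not
    confidence: int      # 1-10
    justification: str

    def __post_init__(self) -> None:
        if not 1 <= self.confidence <= 10:
            raise ValueError("confidence must be between 1 and 10")

    def usable_as_fact(self) -> bool:
        """Only verified claims may be treated as facts (assumed rule)."""
        return self.verified is Verification.YES

    def render(self) -> str:
        """Format the record in the CLAIM/SOURCE/VERIFIED/CONFIDENCE layout."""
        return "\n".join([
            f'CLAIM: "{self.claim}"',
            f"SOURCE: {self.source}",
            f"VERIFIED: {self.verified.value} - {self.detail}",
            f"CONFIDENCE: {self.confidence} - {self.justification}",
        ])
```

Filling it with the loop example from the box gives a record whose `usable_as_fact()` is false, which is the cue to add the `console.log(i)` check or carry the claim forward as an assumption.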

PHASE-SPECIFIC ANTI-HALLUCINATION GATES

MANIFESTATION Phase Anti-Hallucination

BEFORE PROCEEDING FROM MANIFESTATION:
□ Symptom described using OBSERVED behavior, not interpreted behavior
□ "Expected behavior" derived from SPEC/REQUIREMENTS, not assumption
□ Reproduction steps ACTUALLY EXECUTED, not "should work"
□ Error messages COPY-PASTED, not paraphrased from memory
□ Environment details CHECKED, not assumed

HALLUCINATION RED FLAGS:
✗ "The error probably says..." → Get actual error
✗ "It fails when you do X" → Verify X triggers failure
✗ "The expected output is Y" → Confirm Y is per spec

DIVINATION Phase Anti-Hallucination

BEFORE PROCEEDING FROM DIVINATION:
□ Each hypothesis COULD be true (not ruled out by known facts)
□ No hypothesis assumes facts not in evidence
□ Distinguishing tests are ACTUALLY distinguishing (different outcomes)
□ Probability estimates based on EVIDENCE, not gut feel

HALLUCINATION RED FLAGS:
✗ "This is definitely the cause" → How do you KNOW?
✗ "I've seen this before" → Is THIS case the same?
✗ "This can't be the issue" → Why can't it?

SUMMONING Phase Anti-Hallucination

GHOST TRACE GROUNDING:
□ Expected behavior from REQUIREMENTS, not from buggy code
□ Each step cites SOURCE (spec section, design doc, user story)
□ Expected values are SPECIFIED, not "reasonable defaults"
□ Assumptions EXPLICITLY MARKED as assumptions

DEMON TRACE GROUNDING:
□ Each step traces ACTUAL CODE LINE BY LINE
□ Variable values DERIVED from code, not assumed
□ Control flow decisions JUSTIFIED by actual conditions
□ Library/API behavior VERIFIED against documentation

CRITICAL ANTI-HALLUCINATION RULE:
┌─────────────────────────────────────────────────────────────────────────────┐
│ DEMON TRACE MUST BE CONSTRUCTED INDEPENDENTLY OF GHOST TRACE                │
│                                                                             │
│ DO NOT: Read Ghost trace, then "confirm" Demon matches                      │
│ DO: Trace Demon from scratch, THEN compare                                  │
│                                                                             │
│ The moment you think "the Demon should do X here" — you're hallucinating.   │
│ The Demon does what the CODE does, not what it SHOULD do.                   │
└─────────────────────────────────────────────────────────────────────────────┘
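
Once both traces have been built independently, the comparison itself is mechanical. A sketch of locating the first divergence, assuming each trace is a sequence of comparable steps; the `(variable, value)` tuples in the usage note are one illustrative encoding, and `first_divergence` and `MISSING` are hypothetical names:

```python
from typing import Any, Optional, Sequence, Tuple

MISSING = "<no step>"  # placeholder when one trace ends before the other


def first_divergence(ghost: Sequence[Any],
                     demon: Sequence[Any]) -> Optional[Tuple[int, Any, Any]]:
    """Return (step index, ghost step, demon step) at the first mismatch, or None."""
    for index, (expected, actual) in enumerate(zip(ghost, demon)):
        if expected != actual:
            return index, expected, actual
    if len(ghost) != len(demon):
        # One trace stopped early: the divergence is the first unmatched step.
        index = min(len(ghost), len(demon))
        expected = ghost[index] if index < len(ghost) else MISSING
        actual = demon[index] if index < len(demon) else MISSING
        return index, expected, actual
    return None
```

With `ghost = [("count", 0), ("count", 1), ("count", 2)]` and a demon trace whose last step is `("count", 1)`, the function reports index 2 with both values. The function only compares; it cannot tell whether the Demon trace was actually built without peeking at the Ghost.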

INQUISITION Phase Anti-Hallucination

BEFORE PROCEEDING FROM INQUISITION:
□ Assumption inventory is EXHAUSTIVE, not just obvious ones
□ Each assumption STATED NEUTRALLY (not pre-judged as true/false)
□ Inversion consequences TRACED, not guessed
□ "Collapse" claims EVIDENCED, not intuited

HALLUCINATION RED FLAGS:
✗ "This assumption is obviously true" → Obvious assumptions bite hardest
✗ "Inverting this would be ridiculous" → Reality is often ridiculous
✗ "I've already validated this" → When? How? Show evidence.

TRIANGULATION Phase Anti-Hallucination

BEFORE PROCEEDING FROM TRIANGULATION:
□ Evidence from all three engines ACTUALLY CONVERGES (not forced)
□ Contradictions ACKNOWLEDGED, not glossed over
□ Root cause statement uses VERIFIED facts only
□ Confidence level CALIBRATED to evidence strength

HALLUCINATION RED FLAGS:
✗ "Everything points to X" → Does it really? Check each pointer.
✗ "The only explanation is Y" → Have you ruled out all others?
✗ "This must be it" → Must? Or most likely?

CONFIDENCE CALIBRATION SYSTEM

Never state conclusions without calibrated confidence:

CONFIDENCE CALIBRATION SCALE:
┌─────────────────────────────────────────────────────────────────────────────┐
│ LEVEL 10: PROVEN                                                            │
│ "I have executed this code and observed this exact behavior"                │
│ Evidence: Direct observation, test execution, logs                          │
├─────────────────────────────────────────────────────────────────────────────┤
│ LEVEL 9: STRONGLY SUPPORTED                                                 │
│ "Multiple independent evidence sources converge on this"                    │
│ Evidence: Code reading + trace + test, all agree                            │
├─────────────────────────────────────────────────────────────────────────────┤
│ LEVEL 8: WELL SUPPORTED                                                     │
│ "Code clearly shows this, and it's consistent with symptoms"                │
│ Evidence: Direct code reading, logical consistency                          │
├─────────────────────────────────────────────────────────────────────────────┤
│ LEVEL 7: SUPPORTED                                                          │
│ "Evidence favors this interpretation"                                       │
│ Evidence: Some direct evidence, rest is consistent                          │
├─────────────────────────────────────────────────────────────────────────────┤
│ LEVEL 6: LIKELY                                                             │
│ "This is the most plausible explanation given available evidence"           │
│ Evidence: Indirect evidence, reasonable inference                           │
├─────────────────────────────────────────────────────────────────────────────┤
│ LEVEL 5: PLAUSIBLE                                                          │
│ "This could explain the symptoms, but other explanations exist"             │
│ Evidence: Consistent with facts, not contradicted                           │
├─────────────────────────────────────────────────────────────────────────────┤
│ LEVEL 4: POSSIBLE                                                           │
│ "This is one possibility among several"                                     │
│ Evidence: Not ruled out, limited positive evidence                          │
├─────────────────────────────────────────────────────────────────────────────┤
│ LEVEL 3: SPECULATIVE                                                        │
│ "This is a guess that would need verification"                              │
│ Evidence: Minimal, mostly reasoning from patterns                           │
├─────────────────────────────────────────────────────────────────────────────┤
│ LEVEL 2: UNCERTAIN                                                          │
│ "I don't have enough information to assess this"                            │
│ Evidence: Insufficient to form judgment                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│ LEVEL 1: UNKNOWN                                                            │
│ "I have no information about this"                                          │
│ Evidence: None                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

CALIBRATION RULES:
- Never claim LEVEL 8+ without DIRECT evidence (code, logs, test output)
- LEVEL 6-7 requires at least ONE direct observation
- LEVEL 5 and below: Explicitly flag as hypothesis, not fact
- If confidence < 7, DO NOT proceed to fix. Get more evidence.
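
The calibration rules can be applied as a gate before any fix attempt. The sketch below makes one simplifying assumption: "direct evidence" and "direct observation" are treated as the same thing and counted together, so a claim with no direct evidence is capped at LEVEL 5. The names `Calibration` and `calibrate` are illustrative.

```python
from typing import NamedTuple


class Calibration(NamedTuple):
    level: int                 # confidence after applying the rules
    is_hypothesis: bool        # LEVEL 5 and below must be flagged as hypothesis
    may_proceed_to_fix: bool   # fixes require confidence of at least 7


def calibrate(claimed_level: int, direct_evidence_count: int) -> Calibration:
    """Cap a claimed confidence level by the direct evidence behind it."""
    if not 1 <= claimed_level <= 10:
        raise ValueError("claimed_level must be between 1 and 10")
    if direct_evidence_count < 0:
        raise ValueError("direct_evidence_count cannot be negative")
    level = claimed_level
    if direct_evidence_count == 0:
        # LEVEL 6 and above needs at least one direct observation.
        level = min(level, 5)
    return Calibration(level, level <= 5, level >= 7)
```

For example, `calibrate(9, 0)` comes back as level 5, flagged as a hypothesis and blocked from fixing, while `calibrate(8, 2)` passes unchanged.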

ANTI-HALLUCINATION VERBAL PATTERNS

Replace hallucination-prone language with grounded language:

DON'T SAY	DO SAY
"The bug is X"	"Evidence suggests X (confidence: 7)"
"This code does Y"	"Line 42 shows: [exact code]. This executes Y."
"The value is Z"	"Tracing line 38: input=5, so Z = 5+1 = 6"
"It should work"	"Per spec section 3.2, expected behavior is..."
"I'm sure that..."	"Based on [evidence], I assess [claim] at confidence [N]"
"Obviously..."	[Remove. Nothing is obvious. State evidence.]
"As we know..."	[Remove. Verify we actually know this.]
"This always..."	"In the cases I've examined: [list cases]"
"This never..."	"I haven't observed this occurring in: [contexts]"

HALLUCINATION RECOVERY PROTOCOL

When you catch yourself hallucinating mid-reasoning:

HALLUCINATION RECOVERY:
┌─────────────────────────────────────────────────────────────────────────────┐
│ STEP 1: ACKNOWLEDGE                                                         │
│ "I made an ungrounded claim: [claim]. Retracting."                          │
├─────────────────────────────────────────────────────────────────────────────┤
│ STEP 2: TRACE CONTAMINATION                                                 │
│ What conclusions were built on this hallucination?                          │
│ List all downstream claims that depended on the false premise.              │
├─────────────────────────────────────────────────────────────────────────────┤
│ STEP 3: ISOLATE                                                             │
│ Mark all contaminated conclusions as INVALID.                               │
│ Do NOT try to "salvage" reasoning built on false foundations.               │
├─────────────────────────────────────────────────────────────────────────────┤
│ STEP 4: REBUILD                                                             │
│ Return to last VERIFIED checkpoint.                                         │
│ Re-derive conclusions using only grounded facts.                            │
├─────────────────────────────────────────────────────────────────────────────┤
│ STEP 5: PREVENT                                                             │
│ Why did the hallucination occur?                                            │
│ Add specific check to prevent recurrence.                                   │
└─────────────────────────────────────────────────────────────────────────────┘
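
STEP 2 and STEP 3 amount to a graph traversal: starting from the retracted claim, follow "was built on" links to every downstream conclusion. A sketch using breadth-first search, assuming each claim's premises were logged as it was made; the `depends_on` mapping and the function name are hypothetical:

```python
from collections import deque
from typing import Dict, List, Set


def contaminated_claims(depends_on: Dict[str, List[str]], retracted: str) -> Set[str]:
    """Return every claim that rests, directly or transitively, on `retracted`.

    `depends_on` maps each claim to the premises it was derived from.
    """
    # Invert the mapping: premise -> claims built on it.
    dependents: Dict[str, List[str]] = {}
    for claim, premises in depends_on.items():
        for premise in premises:
            dependents.setdefault(premise, []).append(claim)

    invalid: Set[str] = set()
    queue = deque([retracted])
    while queue:
        current = queue.popleft()
        for claim in dependents.get(current, []):
            if claim not in invalid:
                invalid.add(claim)
                queue.append(claim)
    invalid.discard(retracted)  # the retracted claim is handled in STEP 1
    return invalid
```

Everything in the returned set is marked INVALID. Claims outside it survive and form the verified checkpoint to rebuild from in STEP 4.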

VERIFICATION CHECKPOINTS BY PHASE

VERIFICATION CHECKPOINT MATRIX:
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE          │ MUST VERIFY BEFORE PROCEEDING                              │
├─────────────────────────────────────────────────────────────────────────────┤
│ MANIFESTATION  │ □ Symptom actually reproduced (not just described)         │
│                │ □ Error message copy-pasted, not paraphrased               │
│                │ □ Environment details confirmed (versions, configs)        │
├─────────────────────────────────────────────────────────────────────────────┤
│ DIVINATION     │ □ Each hypothesis is falsifiable                           │
│                │ □ No hypothesis contradicts known facts                    │
│                │ □ Distinguishing tests actually distinguish                │
├─────────────────────────────────────────────────────────────────────────────┤
│ SUMMONING      │ □ Ghost trace from spec, not code                          │
│                │ □ Demon trace from code reading, not assumption            │
│                │ □ Each trace step cites source                             │
├─────────────────────────────────────────────────────────────────────────────┤
│ INQUISITION    │ □ All assumptions enumerated (not just obvious)            │
│                │ □ Inversions actually traced, not guessed                  │
│                │ □ Collapse claims have evidence                            │
├─────────────────────────────────────────────────────────────────────────────┤
│ TRIANGULATION  │ □ All three engines actually converge                      │
│                │ □ Contradictions explicitly addressed                      │
│                │ □ Confidence calibrated to evidence                        │
├─────────────────────────────────────────────────────────────────────────────┤
│ EXORCISM       │ □ Fix addresses root cause, not symptom                    │
│                │ □ Fix verified by re-running reproduction                  │
│                │ □ No new failures introduced                               │
└─────────────────────────────────────────────────────────────────────────────┘
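
The matrix can be enforced as a simple gate: a phase may be exited only when every one of its checks has been marked verified. A sketch with the check texts copied from the matrix; the data layout and the names `unmet_checks` and `may_proceed` are assumptions:

```python
from typing import Dict, List, Set

CHECKPOINTS: Dict[str, List[str]] = {
    "MANIFESTATION": [
        "Symptom actually reproduced (not just described)",
        "Error message copy-pasted, not paraphrased",
        "Environment details confirmed (versions, configs)",
    ],
    "DIVINATION": [
        "Each hypothesis is falsifiable",
        "No hypothesis contradicts known facts",
        "Distinguishing tests actually distinguish",
    ],
    "SUMMONING": [
        "Ghost trace from spec, not code",
        "Demon trace from code reading, not assumption",
        "Each trace step cites source",
    ],
    "INQUISITION": [
        "All assumptions enumerated (not just obvious)",
        "Inversions actually traced, not guessed",
        "Collapse claims have evidence",
    ],
    "TRIANGULATION": [
        "All three engines actually converge",
        "Contradictions explicitly addressed",
        "Confidence calibrated to evidence",
    ],
    "EXORCISM": [
        "Fix addresses root cause, not symptom",
        "Fix verified by re-running reproduction",
        "No new failures introduced",
    ],
}


def unmet_checks(phase: str, verified: Set[str]) -> List[str]:
    """Return the checks for `phase` that have not been verified yet."""
    return [check for check in CHECKPOINTS[phase] if check not in verified]


def may_proceed(phase: str, verified: Set[str]) -> bool:
    """A phase may be exited only when no check is outstanding."""
    return not unmet_checks(phase, verified)
```

The gate records only that a check was ticked. Whether the tick rests on evidence is still the job of the grounding checkpoints.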

PART VII: RECURSIVE SELF-REFLECTION ENGINE

Self-reflection isn't just a final phase — it's a continuous metacognitive process that monitors reasoning quality throughout execution.

THE THREE REFLECTION LOOPS

REFLECTION ARCHITECTURE:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  LOOP 1: MICRO-REFLECTION (Per-Step)                                        │
│  Frequency: After EVERY reasoning step                                      │
│  Duration: 5-10 seconds                                                     │
│  Question: "Did I just make an ungrounded claim?"                           │
│                                                                             │
│  LOOP 2: MESO-REFLECTION (Per-Phase)                                        │
│  Frequency: Before exiting each PHANTOM phase                               │
│  Duration: 30-60 seconds                                                    │
│  Question: "Did this phase achieve its purpose? What did I miss?"           │
│                                                                             │
│  LOOP 3: MACRO-REFLECTION (End-to-End)                                      │
│  Frequency: After completing task                                           │
│  Duration: 2-5 minutes                                                      │
│  Question: "What worked? What failed? What will I do differently?"          │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

MICRO-REFLECTION: STEP-LEVEL MONITORING

Invoke after every Sequential Thinking step:

MICRO-REFLECTION CHECKLIST (5-10 seconds):
□ Did I just STATE or VERIFY that claim?
□ Could I be pattern-matching instead of analyzing?
□ Am I rushing to a conclusion?
□ Would I bet $100 on this step being correct?

If any answer is concerning → PAUSE. Verify before proceeding.

Integration with Sequential Thinking:

Thought N: [Reasoning step]
Thought N+1: "MICRO-REFLECT: Step N claimed [X]. 
             Source: [where this came from]. 
             Confidence: [1-10]. 
             Proceeding: [yes/need verification]."

MESO-REFLECTION: PHASE-LEVEL ASSESSMENT

Before exiting any PHANTOM phase, complete this assessment:

MESO-REFLECTION PROTOCOL:
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE: [current phase name]                                                 │
│                                                                             │
│ PURPOSE CHECK:                                                              │
│ ├─ What was this phase supposed to accomplish?                              │
│ ├─ Did I accomplish it?                                                     │
│ └─ How do I know? (Evidence)                                                │
│                                                                             │
│ COMPLETENESS CHECK:                                                         │
│ ├─ What aspects did I cover thoroughly?                                     │
│ ├─ What aspects did I cover superficially?                                  │
│ └─ What aspects did I skip entirely?                                        │
│                                                                             │
│ QUALITY CHECK:                                                              │
│ ├─ What's the weakest part of my reasoning in this phase?                   │
│ ├─ Where am I least confident?                                              │
│ └─ What would make this analysis stronger?                                  │
│                                                                             │
│ BIAS CHECK:                                                                 │
│ ├─ Am I favoring a particular hypothesis? Why?                              │
│ ├─ Am I avoiding a particular possibility? Why?                             │
│ └─ What would someone who disagrees with me say?                            │
│                                                                             │
│ DECISION: [Proceed / Iterate / Backtrack]                                   │
│ JUSTIFICATION: [Why this decision]                                          │
└─────────────────────────────────────────────────────────────────────────────┘

MACRO-REFLECTION: END-TO-END ASSESSMENT

After completing the full debugging/review/generation task:

MACRO-REFLECTION PROTOCOL:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  SECTION 1: OUTCOME ASSESSMENT                                              │
│  ├─ Did I solve the actual problem? (Not just a problem)                    │
│  ├─ How confident am I in the solution? [1-10]                              │
│  ├─ What evidence supports this confidence?                                 │
│  └─ What could prove me wrong?                                              │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  SECTION 2: PROCESS ASSESSMENT                                              │
│  ├─ Which phase was most valuable? Why?                                     │
│  ├─ Which phase was least valuable? Why?                                    │
│  ├─ Where did I spend too much time?                                        │
│  ├─ Where did I spend too little time?                                      │
│  └─ What would I do differently next time?                                  │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  SECTION 3: REASONING QUALITY ASSESSMENT                                    │
│  ├─ Grounding: How well did I verify claims? [1-10]                         │
│  ├─ Thoroughness: How exhaustive was my analysis? [1-10]                    │
│  ├─ Calibration: How accurate were my confidence estimates? [1-10]          │
│  ├─ Creativity: Did I explore non-obvious possibilities? [1-10]             │
│  └─ Rigor: Did I follow the protocol faithfully? [1-10]                     │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  SECTION 4: FAILURE ANALYSIS                                                │
│  ├─ What mistakes did I make? (Be specific)                                 │
│  ├─ What caused each mistake?                                               │
│  ├─ How did I catch each mistake? (Or how should I have?)                   │
│  └─ What check would have prevented each mistake?                           │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  SECTION 5: KNOWLEDGE EXTRACTION                                            │
│  ├─ What did I learn about this codebase?                                   │
│  ├─ What did I learn about this bug class?                                  │
│  ├─ What did I learn about my own reasoning?                                │
│  └─ What should I remember for next time?                                   │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  SECTION 6: SELF-SCORE                                                      │
│  Overall reasoning quality: [1-10]                                          │
│  Justification: [Why this score, specifically]                              │
│                                                                             │
│  Scoring guide:                                                             │
│  10: Flawless execution, no mistakes, maximal efficiency                    │
│  8-9: Minor inefficiencies, no errors in conclusions                        │
│  6-7: Some missteps, but self-corrected, correct conclusion                 │
│  4-5: Significant mistakes, conclusion may have issues                      │
│  1-3: Major failures, conclusion likely wrong                               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
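
The Section 6 scoring guide can be treated as a simple lookup. A minimal sketch in Python, assuming two hypothetical helpers that are not part of PHANTOM itself: `score_band` returns the guide's wording for an integer score, and `weakest_dimension` picks the lowest-rated Section 3 dimension so the justification can point at it.

```python
def score_band(score: int) -> str:
    """Return the scoring-guide description for an overall self-score (1-10)."""
    if not 1 <= score <= 10:
        raise ValueError("self-score must be between 1 and 10")
    if score == 10:
        return "Flawless execution, no mistakes, maximal efficiency"
    if score >= 8:
        return "Minor inefficiencies, no errors in conclusions"
    if score >= 6:
        return "Some missteps, but self-corrected, correct conclusion"
    if score >= 4:
        return "Significant mistakes, conclusion may have issues"
    return "Major failures, conclusion likely wrong"


def weakest_dimension(ratings: dict[str, int]) -> str:
    """Return the lowest-rated Section 3 dimension (first one wins on ties)."""
    return min(ratings, key=ratings.get)
```

Usage: `score_band(7)` returns the 6-7 band text, which can be pasted next to the justification line.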

COGNITIVE BIAS DETECTION

Active monitoring for reasoning biases:

BIAS DETECTION CHECKLIST:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  CONFIRMATION BIAS                                                          │
│  Signal: Only looking for evidence that supports current hypothesis         │
│  Check: "Have I actively tried to DISPROVE my leading hypothesis?"          │
│  Cure: Force yourself to argue for the opposite position                    │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ANCHORING BIAS                                                             │
│  Signal: First hypothesis dominates despite weak evidence                   │
│  Check: "Is my confidence in H1 justified, or just because it came first?"  │
│  Cure: Re-rank hypotheses from scratch, ignoring order of generation        │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  AVAILABILITY BIAS                                                          │
│  Signal: Diagnosing based on recently seen or memorable bugs                │
│  Check: "Am I thinking of THIS bug or a SIMILAR bug I remember?"            │
│  Cure: List 3 other possible causes you haven't recently encountered        │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  SUNK COST FALLACY                                                          │
│  Signal: Continuing down a path because of time already invested            │
│  Check: "If I started fresh now, would I still pursue this path?"           │
│  Cure: Evaluate current path vs. alternatives with fresh eyes               │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  PREMATURE CLOSURE                                                          │
│  Signal: Rushing to conclusion to "be done"                                 │
│  Check: "Am I concluding because I have evidence or because I'm tired?"     │
│  Cure: Force one more iteration of Devil's Advocate                         │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  OVERCONFIDENCE                                                             │
│  Signal: High certainty without proportional evidence                       │
│  Check: "Would I bet significant money on this at these odds?"              │
│  Cure: Explicitly list what could prove you wrong                           │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ATTRIBUTION ERROR                                                          │
│  Signal: Blaming developer incompetence instead of systemic issues          │
│  Check: "Am I explaining this bug by 'bad code' or by root causes?"         │
│  Cure: Ask "What made this bug EASY to introduce?"                          │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
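
The checklist above is regular enough to hold as data: every bias has a signal, a check question, and a cure. A minimal sketch, assuming a hypothetical `BiasCheck` record and `cures_for` helper (the names are illustrative; the strings are taken from the checklist):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiasCheck:
    signal: str
    check: str
    cure: str


BIAS_CHECKLIST = {
    "confirmation": BiasCheck(
        "Only looking for evidence that supports current hypothesis",
        "Have I actively tried to DISPROVE my leading hypothesis?",
        "Force yourself to argue for the opposite position"),
    "anchoring": BiasCheck(
        "First hypothesis dominates despite weak evidence",
        "Is my confidence in H1 justified, or just because it came first?",
        "Re-rank hypotheses from scratch, ignoring order of generation"),
    "availability": BiasCheck(
        "Diagnosing based on recently seen or memorable bugs",
        "Am I thinking of THIS bug or a SIMILAR bug I remember?",
        "List 3 other possible causes you haven't recently encountered"),
    "sunk_cost": BiasCheck(
        "Continuing down a path because of time already invested",
        "If I started fresh now, would I still pursue this path?",
        "Evaluate current path vs. alternatives with fresh eyes"),
    "premature_closure": BiasCheck(
        "Rushing to conclusion to \"be done\"",
        "Am I concluding because I have evidence or because I'm tired?",
        "Force one more iteration of Devil's Advocate"),
    "overconfidence": BiasCheck(
        "High certainty without proportional evidence",
        "Would I bet significant money on this at these odds?",
        "Explicitly list what could prove you wrong"),
    "attribution": BiasCheck(
        "Blaming developer incompetence vs. systemic issues",
        "Am I explaining this bug by 'bad code' or root causes?",
        "Ask \"What made this bug EASY to introduce?\""),
}


def cures_for(detected: list[str]) -> list[str]:
    """Return the cure for each detected bias, in the order given."""
    return [BIAS_CHECKLIST[name].cure for name in detected]
```

Keeping the check questions next to the cures means a detected bias always maps to one concrete action.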

DEVIL'S ADVOCATE PROTOCOL

At critical decision points, FORCE adversarial self-questioning:

DEVIL'S ADVOCATE INVOCATION:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  TRIGGER: Before committing to any diagnosis or fix                         │
│                                                                             │
│  STEP 1: STATE YOUR POSITION                                                │
│  "I believe [X] because [Y]."                                               │
│                                                                             │
│  STEP 2: ARGUE THE OPPOSITE                                                 │
│  "A competent person who disagrees would say:                               │
│   [Construct strongest counter-argument you can]"                           │
│                                                                             │
│  STEP 3: FIND THE WEAKNESS                                                  │
│  "The weakest part of my original argument is:                              │
│   [Identify genuine vulnerability]"                                         │
│                                                                             │
│  STEP 4: RESPOND TO COUNTER-ARGUMENT                                        │
│  "I would rebut the counter-argument by:                                    │
│   [Genuine rebuttal, not dismissal]"                                        │
│                                                                             │
│  STEP 5: UPDATE OR PROCEED                                                  │
│  If counter-argument is stronger: Update position                           │
│  If rebuttal is stronger: Proceed with increased confidence                 │
│  If inconclusive: Get more evidence before proceeding                       │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
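
Step 5 is a three-way decision. A minimal sketch of it, assuming the counter-argument and the rebuttal are each given a hypothetical 1-10 strength rating and that a difference smaller than `margin` counts as inconclusive. Both the ratings and the margin are illustrative assumptions, not part of the protocol:

```python
from enum import Enum


class Outcome(Enum):
    UPDATE_POSITION = "Update position"
    PROCEED = "Proceed with increased confidence"
    GATHER_EVIDENCE = "Get more evidence before proceeding"


def resolve(counter_strength: int, rebuttal_strength: int, margin: int = 1) -> Outcome:
    """Apply STEP 5 to two self-assessed strengths (hypothetical 1-10 scale)."""
    difference = rebuttal_strength - counter_strength
    if difference >= margin:
        return Outcome.PROCEED          # rebuttal is stronger
    if difference <= -margin:
        return Outcome.UPDATE_POSITION  # counter-argument is stronger
    return Outcome.GATHER_EVIDENCE      # too close to call
```

With the default margin, a tie such as `resolve(6, 6)` is treated as inconclusive rather than as a win for the original position.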

REASONING TRACE AUDIT

Periodically audit your own reasoning trace for quality:

REASONING TRACE AUDIT:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  For each step in your reasoning, verify:                                   │
│                                                                             │
│  VALIDITY: Does conclusion follow from premises?                            │
│  ├─ Check: Is the logical inference valid?                                  │
│  ├─ Check: Are there hidden premises I didn't state?                        │
│  └─ Check: Would someone else reach same conclusion from same premises?     │
│                                                                             │
│  SOUNDNESS: Are the premises actually true?                                 │
│  ├─ Check: Is each premise verified or assumed?                             │
│  ├─ Check: Do I have evidence for each premise?                             │
│  └─ Check: Could any premise be false?                                      │
│                                                                             │
│  RELEVANCE: Does this step advance toward the goal?                         │
│  ├─ Check: Why does this step matter?                                       │
│  ├─ Check: What would change if I skipped this step?                        │
│  └─ Check: Is this step necessary or just habitual?                         │
│                                                                             │
│  SUFFICIENCY: Is this step complete?                                        │
│  ├─ Check: Did I consider alternatives?                                     │
│  ├─ Check: Did I consider edge cases?                                       │
│  └─ Check: Did I consider failure modes?                                    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
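
The four audit criteria can be recorded per step so that failing steps stand out. A minimal sketch, assuming a hypothetical `StepAudit` record in which each criterion has already been judged pass or fail by working through its three checks:

```python
from dataclasses import dataclass


@dataclass
class StepAudit:
    step: int
    valid: bool       # conclusion follows from premises
    sound: bool       # premises are verified, not merely assumed
    relevant: bool    # step advances toward the goal
    sufficient: bool  # alternatives, edge cases, failure modes considered

    def failures(self) -> list[str]:
        """Names of the criteria this step failed, in audit order."""
        checks = {
            "VALIDITY": self.valid,
            "SOUNDNESS": self.sound,
            "RELEVANCE": self.relevant,
            "SUFFICIENCY": self.sufficient,
        }
        return [name for name, passed in checks.items() if not passed]


def audit_trace(audits: list[StepAudit]) -> dict[int, list[str]]:
    """Map each failing step number to the criteria it failed."""
    return {a.step: a.failures() for a in audits if a.failures()}
```

An empty result from `audit_trace` means every audited step passed all four criteria.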

SELF-REFLECTION OUTPUT FORMAT

Include in every PHANTOM execution:

### SELF-REFLECTION LOG

#### Micro-Reflections (Notable)
- Step [N]: [Concern raised, resolution]
- Step [M]: [Claim verified/retracted]

#### Meso-Reflections (Per Phase)
| Phase | Completeness | Weakest Point | Decision |
|-------|--------------|---------------|----------|
| MANIFESTATION | [%] | [weakness] | [proceed/iterate] |
| DIVINATION | [%] | [weakness] | [proceed/iterate] |
| SUMMONING | [%] | [weakness] | [proceed/iterate] |
| INQUISITION | [%] | [weakness] | [proceed/iterate] |
| TRIANGULATION | [%] | [weakness] | [proceed/iterate] |

#### Bias Check
- Confirmation bias: [detected/not detected]
- Anchoring bias: [detected/not detected]
- Other biases: [list]
- Mitigation: [actions taken]

#### Devil's Advocate
- Position: [my conclusion]
- Counter-argument: [best argument against]
- Rebuttal: [my response]
- Outcome: [position strengthened/updated/uncertain]

#### Confidence Calibration
- Pre-reflection confidence: [1-10]
- Post-reflection confidence: [1-10]
- Reason for change: [explanation]

#### Macro-Reflection
- Reasoning quality score: [1-10]
- Key mistake made: [description]
- Key insight gained: [description]
- Process improvement: [what to do differently]
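
The Meso-Reflections table in the log has a fixed shape, so it can be generated instead of typed by hand. A minimal sketch, assuming a hypothetical `render_meso_table` helper that takes one `(completeness %, weakest point, decision)` tuple per phase:

```python
PHASES = ["MANIFESTATION", "DIVINATION", "SUMMONING", "INQUISITION", "TRIANGULATION"]


def render_meso_table(rows: dict[str, tuple[int, str, str]]) -> str:
    """Render the per-phase table; raises KeyError if a phase is missing."""
    lines = [
        "| Phase | Completeness | Weakest Point | Decision |",
        "|-------|--------------|---------------|----------|",
    ]
    for phase in PHASES:
        completeness, weakest, decision = rows[phase]
        lines.append(f"| {phase} | {completeness}% | {weakest} | {decision} |")
    return "\n".join(lines)
```

Raising on a missing phase is a deliberate choice in this sketch: a log that silently omits a phase would hide a skipped meso-reflection.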

INTEGRATION WITH SEQUENTIAL THINKING

Self-reflection hooks into Sequential Thinking:

SEQUENTIAL THINKING WITH REFLECTION:

Thought 1: [Reasoning step]
Thought 2: [Reasoning step]
Thought 3: [MICRO-REFLECT] "Thoughts 1-2 claimed X and Y. 
           X is verified (source: line 42). 
           Y is assumed (confidence: 6). 
           Flagging Y for verification before proceeding."
Thought 4: [Verification step for Y]
Thought 5: [Reasoning step, now with verified Y]
...
Thought N: [MESO-REFLECT] "Phase complete. 
           Completeness: 85%. 
           Weakest point: Didn't explore timeout hypothesis.
           Decision: Acceptable risk, proceeding."

Parameter additions for reflection:

REFLECTION THOUGHT PARAMETERS:
- isRevision: true (when reflection causes backtrack)
- revisesThought: [N] (when correcting earlier step)
- needsMoreThoughts: true (when reflection reveals gaps)
- branchId: "reflection-[phase]" (when exploring alternative)

THE REFLECTION MANDATE

┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  REFLECTION IS NOT OPTIONAL                                                 │
│                                                                             │
│  Every PHANTOM execution MUST include:                                      │
│  ├─ At least 1 micro-reflection per 5 reasoning steps                       │
│  ├─ Meso-reflection before exiting each phase                               │
│  ├─ Devil's Advocate before committing to diagnosis                         │
│  ├─ Bias check at TRIANGULATION                                             │
│  └─ Full macro-reflection at completion                                     │
│                                                                             │
│  Skipping reflection is how hallucinations survive.                         │
│  Reflection is the immune system of reasoning.                              │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
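
The mandate reads as a checklist that can be verified mechanically at the end of a run. A minimal sketch, assuming a hypothetical `ExecutionLog` record of what happened and interpreting "1 micro-reflection per 5 reasoning steps" as one per complete block of five steps. That interpretation is an assumption:

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionLog:
    reasoning_steps: int = 0
    micro_reflections: int = 0
    phases_exited: set[str] = field(default_factory=set)
    meso_reflected: set[str] = field(default_factory=set)
    devils_advocate_done: bool = False
    bias_check_at_triangulation: bool = False
    macro_reflection_done: bool = False


def mandate_violations(log: ExecutionLog) -> list[str]:
    """Return one message per mandate item that the run failed to satisfy."""
    violations = []
    required_micro = log.reasoning_steps // 5  # assumed: one per full block of 5
    if log.micro_reflections < required_micro:
        violations.append(
            f"micro-reflections: {log.micro_reflections} < {required_micro} required")
    missing = sorted(log.phases_exited - log.meso_reflected)
    if missing:
        violations.append("no meso-reflection before exiting: " + ", ".join(missing))
    if not log.devils_advocate_done:
        violations.append("no Devil's Advocate before committing to diagnosis")
    if not log.bias_check_at_triangulation:
        violations.append("no bias check at TRIANGULATION")
    if not log.macro_reflection_done:
        violations.append("no macro-reflection at completion")
    return violations
```

An empty list means the run met every item in the mandate.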

REFLECTION-TRIGGERED ACTIONS

When reflection reveals problems:

REFLECTION ACTION MATRIX:
┌─────────────────────────────────────────────────────────────────────────────┐
│ REFLECTION FINDING          │ REQUIRED ACTION                               │
├─────────────────────────────────────────────────────────────────────────────┤
│ Ungrounded claim detected   │ STOP. Verify claim or retract.                │
│ Confidence < 6 on key claim │ STOP. Get more evidence.                      │
│ Bias detected               │ Apply specific cure from bias checklist.      │
│ Phase incomplete            │ Iterate phase until complete.                 │
│ Contradiction found         │ Resolve before proceeding.                    │
│ Devil's Advocate wins       │ Update position or get more evidence.         │
│ Reasoning quality < 5       │ Consider restarting from earlier checkpoint.  │
└─────────────────────────────────────────────────────────────────────────────┘
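
The action matrix is a direct mapping from finding to required action, with two numeric thresholds. A minimal sketch, assuming hypothetical names (`Finding`, `REQUIRED_ACTION`, `findings_from_scores`, `must_stop`); the action strings and the thresholds 6 and 5 come from the matrix:

```python
from enum import Enum


class Finding(Enum):
    UNGROUNDED_CLAIM = "Ungrounded claim detected"
    LOW_CONFIDENCE = "Confidence < 6 on key claim"
    BIAS = "Bias detected"
    PHASE_INCOMPLETE = "Phase incomplete"
    CONTRADICTION = "Contradiction found"
    DEVILS_ADVOCATE_WINS = "Devil's Advocate wins"
    LOW_QUALITY = "Reasoning quality < 5"


REQUIRED_ACTION = {
    Finding.UNGROUNDED_CLAIM: "STOP. Verify claim or retract.",
    Finding.LOW_CONFIDENCE: "STOP. Get more evidence.",
    Finding.BIAS: "Apply specific cure from bias checklist.",
    Finding.PHASE_INCOMPLETE: "Iterate phase until complete.",
    Finding.CONTRADICTION: "Resolve before proceeding.",
    Finding.DEVILS_ADVOCATE_WINS: "Update position or get more evidence.",
    Finding.LOW_QUALITY: "Consider restarting from earlier checkpoint.",
}


def findings_from_scores(key_claim_confidence: int, reasoning_quality: int) -> list[Finding]:
    """Derive the two score-based findings from 1-10 ratings."""
    findings = []
    if key_claim_confidence < 6:
        findings.append(Finding.LOW_CONFIDENCE)
    if reasoning_quality < 5:
        findings.append(Finding.LOW_QUALITY)
    return findings


def must_stop(findings: list[Finding]) -> bool:
    """True if any finding's required action begins with STOP."""
    return any(REQUIRED_ACTION[f].startswith("STOP") for f in findings)
```

Only the first two rows of the matrix halt the investigation outright; the rest redirect it.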

PART VIII: COGNITIVE PRIMITIVES (Brain Functions)

PHANTOM needs explicit cognitive operations: the mental "verbs" that constitute thinking. These primitives are the atomic operations from which all reasoning is composed.

THE COGNITIVE FUNCTION LIBRARY

COGNITIVE PRIMITIVE CATEGORIES:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  ATTENTION FUNCTIONS     │ Control what gets processed                      │
│  MEMORY FUNCTIONS        │ Store, retrieve, update information              │
│  REASONING FUNCTIONS     │ Transform information logically                  │
│  PATTERN FUNCTIONS       │ Recognize and match structures                   │
│  META-COGNITIVE FUNCTIONS│ Monitor and control other functions              │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

ATTENTION FUNCTIONS

These primitives control the cognitive spotlight: what gets processed and what gets ignored.

ATTENTION PRIMITIVES:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  FOCUS(target)                                                              │
│  ├─ Purpose: Direct full attention to a specific element                    │
│  ├─ Input: Code block, variable, hypothesis, symptom                        │
│  ├─ Effect: Target enters working memory, peripheral items deprioritized    │
│  └─ Invocation: "FOCUS on line 42-58 of auth.py"                            │
│                                                                             │
│  ZOOM_IN(region)                                                            │
│  ├─ Purpose: Increase granularity of analysis                               │
│  ├─ Input: A region of code, a phase of execution                           │
│  ├─ Effect: See more detail, lose broader context temporarily               │
│  └─ Invocation: "ZOOM_IN to the loop body at line 45"                       │
│                                                                             │
│  ZOOM_OUT(scope)                                                            │
│  ├─ Purpose: Decrease granularity, see bigger picture                       │
│  ├─ Input: Current focus point                                              │
│  ├─ Effect: See patterns across larger scope, lose detail                   │
│  └─ Invocation: "ZOOM_OUT to module-level data flow"                        │
│                                                                             │
│  SPLIT_ATTENTION(targets[])                                                 │
│  ├─ Purpose: Monitor multiple items simultaneously                          │
│  ├─ Input: 2-4 targets (beyond 4, attention degrades)                       │
│  ├─ Effect: Partial attention on each, good for comparison                  │
│  └─ Invocation: "SPLIT_ATTENTION between H1 evidence and H2 evidence"       │
│                                                                             │
│  SHIFT_ATTENTION(from, to)                                                  │
│  ├─ Purpose: Move focus from one target to another                          │
│  ├─ Input: Current focus, new focus                                         │
│  ├─ Effect: Old target moves to background, new target foreground           │
│  └─ Invocation: "SHIFT_ATTENTION from symptom analysis to hypothesis gen"   │
│                                                                             │
│  FILTER(criteria)                                                           │
│  ├─ Purpose: Suppress irrelevant information                                │
│  ├─ Input: Criteria for relevance                                           │
│  ├─ Effect: Only information matching criteria processed                    │
│  └─ Invocation: "FILTER for only async-related code paths"                  │
│                                                                             │
│  SUSTAIN(target, duration)                                                  │
│  ├─ Purpose: Maintain focus despite distractions                            │
│  ├─ Input: Target and expected duration                                     │
│  ├─ Effect: Resist urge to shift, complete current analysis                 │
│  └─ Invocation: "SUSTAIN focus on trace until divergence found"             │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
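
The effects of FOCUS, SPLIT_ATTENTION, and SHIFT_ATTENTION can be pictured as moves between a foreground and a background list. A toy model in Python, assuming a hypothetical `AttentionState` class. It only illustrates the bookkeeping described above and is not how PHANTOM is implemented:

```python
class AttentionState:
    """Toy model: `foreground` holds attended targets, `background` the rest."""

    MAX_SPLIT = 4  # SPLIT_ATTENTION accepts 2-4 targets

    def __init__(self) -> None:
        self.foreground: list[str] = []
        self.background: list[str] = []

    def _demote_all(self) -> None:
        # Move everything currently attended into the background.
        for target in self.foreground:
            if target not in self.background:
                self.background.append(target)
        self.foreground = []

    def _promote(self, target: str) -> None:
        # Bring a target into the foreground, removing any background copy.
        if target in self.background:
            self.background.remove(target)
        self.foreground.append(target)

    def focus(self, target: str) -> None:
        """FOCUS: full attention on one target, everything else deprioritized."""
        self._demote_all()
        self._promote(target)

    def split_attention(self, targets: list[str]) -> None:
        """SPLIT_ATTENTION: partial attention on 2-4 targets."""
        if not 2 <= len(targets) <= self.MAX_SPLIT:
            raise ValueError("SPLIT_ATTENTION takes 2-4 targets")
        self._demote_all()
        for target in targets:
            self._promote(target)

    def shift_attention(self, source: str, destination: str) -> None:
        """SHIFT_ATTENTION: old target to background, new target to foreground."""
        if source not in self.foreground:
            raise ValueError(f"{source!r} is not in the foreground")
        self.foreground.remove(source)
        if source not in self.background:
            self.background.append(source)
        self._promote(destination)
```

Rejecting more than four targets in `split_attention` mirrors the input limit stated for SPLIT_ATTENTION.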

MEMORY FUNCTIONS

These primitives manage the storage and retrieval of information during investigation.

MEMORY PRIMITIVES:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  STORE(item, location, priority)                                            │
│  ├─ Purpose: Place information in memory                                    │
│  ├─ Locations: working_memory, evidence_log, hypothesis_bank, assumption_   │
│  │             registry, investigation_context                              │
│  ├─ Priority: critical (always keep) / important / normal / low             │
│  └─ Invocation: "STORE(divergence at line 42, evidence_log, critical)"      │
│                                                                             │
│  RETRIEVE(query, location)                                                  │
│  ├─ Purpose: Pull information from memory                                   │
│  ├─ Input: Search query, memory location (or 'all')                         │
│  ├─ Effect: Returns matching items ranked by relevance                      │
│  └─ Invocation: "RETRIEVE(null pointer hypotheses, hypothesis_bank)"        │
│                                                                             │
│  UPDATE(item_id, new_value)                                                 │
│  ├─ Purpose: Modify existing memory item                                    │
│  ├─ Input: Item identifier, new content                                     │
│  ├─ Effect: Item updated, old version optionally archived                   │
│  └─ Invocation: "UPDATE(H2, confidence: 4 → 7, evidence: +trace_result)"    │
│                                                                             │
│  DELETE(item_id, reason)                                                    │
│  ├─ Purpose: Remove item from memory                                        │
│  ├─ Input: Item identifier, reason for deletion                             │
│  ├─ Effect: Item removed, deletion logged                                   │
