deep-interview

star 31.0k

Socratic deep interview with mathematical ambiguity gating before execution

Yeachan-Heo By Yeachan-Heo schedule Updated 6/4/2026

name: deep-interview description: Socratic deep interview with mathematical ambiguity gating before execution argument-hint: "[--quick|--standard|--deep] [--autoresearch] "

Deep Interview is an intent-first Socratic clarification loop before planning or implementation. It turns vague ideas into execution-ready specifications by asking targeted questions about why the user wants a change, how far it should go, what should stay out of scope, and what OMX may decide without confirmation. - The request is broad, ambiguous, or missing concrete acceptance criteria - The user says "deep interview", "interview me", "ask me everything", "don't assume", or "ouroboros" - The user wants to avoid misaligned implementation from underspecified requirements - You need a requirements artifact before handing off to `ralplan`, `autopilot`, `ralph`, or `team` - The request already has concrete file/symbol targets and clear acceptance criteria - The user explicitly asks to skip planning/interview and execute immediately - The user asks for lightweight brainstorming only (use `plan` instead) - A complete PRD/plan already exists and execution should start Execution quality is usually bottlenecked by intent clarity, not just missing implementation detail. A single expansion pass often misses why the user wants a change, where the scope should stop, which tradeoffs are unacceptable, and which decisions still require user approval. This workflow applies Socratic pressure + quantitative ambiguity scoring so orchestration modes begin with an explicit, testable, intent-aligned spec. - **Quick (`--quick`)**: fast pre-PRD pass; target threshold `<= 0.30`; max rounds 5 - **Standard (`--standard`, default)**: full requirement interview; target threshold `<= 0.20`; max rounds 12 - **Deep (`--deep`)**: high-rigor exploration; target threshold `<= 0.15`; max rounds 20 - **Autoresearch (`--autoresearch`)**: same interview rigor as Standard, but specialized for `$autoresearch` mission readiness and `.omx/specs/` artifact handoff

Profile max rounds is a hard cap, not a target. Do not continue only to reach a numbered round count. Extra Socratic rigor does not override the active threshold unless the profile/config changes.

If no flag is provided, use Standard.

- **`--autoresearch`**: switch the interview into autoresearch-intake mode for `$autoresearch` handoff. In this mode, the interview should converge on a validator-ready research mission, write canonical artifacts under `.omx/specs/`, and preserve the explicit `refine further` vs `launch` boundary for downstream skill intake.
- Ask ONE question per round (never batch multiple interview rounds into one `questions[]` form) - Ask about intent and boundaries before implementation detail - Target the weakest clarity dimension each round after applying the stage-priority rules below - Treat every answer as a claim to pressure-test before moving on: the next question should usually demand evidence or examples, expose a hidden assumption, force a tradeoff or boundary, or reframe root cause vs symptom - Do not rotate to a new clarity dimension just for coverage when the current answer is still vague; stay on the same thread until one layer deeper, one assumption clearer, or one boundary tighter - Before crystallizing, complete at least one explicit pressure pass that revisits an earlier answer with a deeper, assumption-focused, or tradeoff-focused follow-up - Gather codebase facts via `explore` before asking user about internals - `omx explore` is deprecated. Use normal repository inspection tools/subagents for simple read-only brownfield fact gathering; use `omx sparkshell` only for explicit shell-native read-only evidence, and keep ambiguous or non-shell-only investigation on the richer normal path. - Always run a preflight context intake before the first interview question - For brownfield work, preflight must include doc/context grounding before user-facing questions: inspect applicable `AGENTS.md` files, README/getting-started docs, relevant `docs/` contracts/plans/ADRs, existing `.omx/context/` snapshots, and any project-local glossary/context files such as `CONTEXT.md` or `CONTEXT-MAP.md` when present. - Treat existing repo language as evidence, not authority: if the user uses a fuzzy, overloaded, or conflicting term, surface the specific doc/code wording and ask which meaning should govern before implementation. - Cross-check user claims about current behavior against code or documented contracts when discoverable. If docs and code disagree, ask a confirmation question that names both sources instead of silently choosing one. - Use scenario-based edge-case grilling when relationships, boundaries, or handoff behavior are unclear: invent one concrete scenario that stresses the ambiguous boundary, then ask one focused question about the expected outcome. - Durable docs, glossary, ADR, or memory updates are opt-in and public-safe only. Deep-interview may recommend such updates in the handoff summary, but must not automatically create or dump public docs from interview transcripts unless the user explicitly chooses that as in-scope. - If initial context is oversized or would exceed the prompt budget, do not paste or forward the raw payload into interview prompts; request and record a prompt-safe initial-context summary first - The oversized initial-context summary gate is blocking: wait for the concise summary before ambiguity scoring, crystallizing artifacts, or any downstream execution handoff - The summary must preserve goals, constraints, success criteria, non-goals, decision boundaries, and references to any full source documents so downstream consumers receive a prompt-safe but faithful context - Keep total prompt payloads within a safe budget by summarizing or trimming retained history; preserve newest/highest-signal answers and never let raw oversized context crowd out the current question - Reduce user effort: ask only the highest-leverage unresolved question, and never ask the user for codebase facts that can be discovered directly - For brownfield work, prefer evidence-backed confirmation questions such as "I found X in Y. Should this change follow that pattern?" - Route facts before judgment in the Ouroboros style: before presenting a user-facing interview round, classify whether the needed information is a discoverable fact, a fact needing confirmation, or a human decision. The interview is with the human for judgment, not for facts the agent can inspect. - When unresolved ambiguity depends on current external best practices, official/upstream guidance, standards, or version-aware behavior, use `$best-practice-research` as the bounded evidence wrapper before crystallizing requirements or handing off to planning/execution. - Use these transcript/spec labels only; never use them as `omx question` `source` values, and never replace the runtime `source: "deep-interview"` contract for user-facing deep-interview questions: - `[from-code][auto-confirmed]` — exact, high-confidence codebase facts from manifests/configs or direct source evidence, with no prescription attached. - `[from-code]` — codebase findings that are useful but inferred, pattern-based, or low/medium confidence and therefore need a confirmation-style user-facing round before being treated as settled. - `[from-research]` — externally sourced facts such as API limits, compatibility, or public documentation; facts only, not decisions. - `[from-user]` — goals, preferences, business logic, scope, non-goals, acceptance criteria, tradeoffs, and any decision-bearing interpretation. - Treat `[from-code][auto-confirmed]` and other non-user fact discoveries as context/transcript updates, not interview rounds: do not call `omx question`, do not create a pending deep-interview question obligation, and do not increment the user-facing round number for facts the agent can safely establish. - Auto-confirm only descriptive facts. If a finding implies what the new feature should do, which pattern it should follow, which tradeoff to accept, or what should stay in/out of scope, route the entire decision-bearing question to the user as `[from-user]` even when code or research facts are available. - In attached-tmux Codex CLI, deep-interview uses `omx question` as the required OMX-owned structured questioning path for every interview round - When invoking `omx question` through attached-tmux Bash/tool paths, preserve the leader-pane return target by prefixing the command with `OMX_QUESTION_RETURN_PANE=$TMUX_PANE` (or a concrete `%pane` value) - If you launch `omx question` in a background terminal, immediately wait for that background terminal to finish and read its JSON answer before scoring ambiguity, asking another round, or handing off - Treat `answers[]` as the primary `omx question` success contract. For a single interview round, read `answers[0].answer`; use legacy top-level `answer` only as a compatibility fallback when needed. - If the current runtime is outside tmux and cannot render `omx question`, use the native structured question tool when available; otherwise ask exactly one concise plain-text question and wait for the answer - Re-score ambiguity after each answer and show progress transparently - Once ambiguity is at or below the active profile threshold, stop ordinary questioning. Run the practical closure audit: crystallize/handoff when readiness gates pass; otherwise ask only the final closure question needed to satisfy a named gate. - Treat `max_rounds` as a stop cap, not evidence that more rounds are needed. - Do not hand off to execution while ambiguity remains above threshold unless user explicitly opts to proceed with warning - Do not crystallize or hand off while `Non-goals` or `Decision Boundaries` remain unresolved, even if the weighted ambiguity threshold is met - Treat early exit as a safety valve, not the default success path - Persist mode state for resume safety with CLI-first state commands (`omx state write/read --input '' --json`); use `state_write` / `state_read` only when explicit MCP compatibility is enabled

Phase 0: Preflight Context Intake

  1. Parse {{ARGUMENTS}} and derive a short task slug.
  2. Attempt to load the latest relevant context snapshot from .omx/context/{slug}-*.md.
  3. Check whether the provided initial context or loaded snapshot is too large for safe prompt use. If it is oversized, the first interview round must ask for a concise prompt-safe summary instead of scoring ambiguity or continuing to downstream handoff.
  4. If no snapshot exists, create a minimum context snapshot with:
    • Task statement
    • Desired outcome
    • Stated solution (what the user asked for)
    • Probable intent hypothesis (why they likely want it)
    • Known facts/evidence
    • Constraints
    • Unknowns/open questions
    • Decision-boundary unknowns
    • Likely codebase touchpoints
    • Relevant repo docs/rules/context inspected
    • Terminology or doc/code conflicts found
    • Prompt-safe initial-context summary status (not_needed, needed, or recorded)
  5. For brownfield tasks, inspect the applicable documentation/rule surface before the first user-facing round. Prefer exact, nearby sources over broad scans:
    • governing AGENTS.md files and template/runtime instruction surfaces that apply to the touched paths
    • README/getting-started docs and relevant docs under docs/, especially contracts, plans, ADR-like records, and workflow docs
    • existing .omx/context/ snapshots, .omx/specs/, and planning artifacts relevant to the slug
    • project-local glossary/context files such as CONTEXT.md, CONTEXT-MAP.md, or context-specific docs when they exist
  6. Save snapshot to .omx/context/{slug}-{timestamp}.md (UTC YYYYMMDDTHHMMSSZ) and reference it in mode state.

Phase 1: Initialize

  1. Parse {{ARGUMENTS}} and depth profile (--quick|--standard|--deep).
  2. Detect project context:
    • Run explore to classify brownfield (existing codebase target) vs greenfield.
    • For brownfield, collect relevant codebase context before questioning.
  3. Initialize state via omx state write --input '{"mode":"deep-interview","active":true}' --json:
{
  "active": true,
  "current_phase": "deep-interview",
  "state": {
    "interview_id": "<uuid>",
    "profile": "quick|standard|deep",
    "type": "greenfield|brownfield",
    "initial_idea": "<user input>",
    "rounds": [],
    "current_ambiguity": 1.0,
    "threshold": 0.3,
    "max_rounds": 5,
    "challenge_modes_used": [],
    "codebase_context": null,
    "current_stage": "intent-first",
    "current_focus": "intent",
    "context_snapshot_path": ".omx/context/<slug>-<timestamp>.md"
  }
}
  1. Announce kickoff with profile, threshold, and current ambiguity.

Phase 2: Socratic Interview Loop

Repeat until ambiguity <= threshold, the pressure pass is complete, the readiness gates are explicit, the user exits with warning, or max rounds are reached. This is a stop condition: below threshold, do not open a new ordinary interview branch.

2a) Generate next question

If the initial context is oversized and no prompt-safe summary has been recorded yet, the next question must be only a summary request. Do not score ambiguity, do not run readiness gates, and do not hand off to $ultragoal, $ralplan, $autopilot, $ralph, or $team until that summary answer is captured.

Use:

  • Original idea
  • Prior Q&A rounds
  • Current dimension scores
  • Brownfield context (if any)
  • Doc/context grounding notes, including existing terminology, governing rules, and any doc/code mismatch
  • Activated challenge mode injection (Phase 3)

Target the lowest-scoring dimension, but respect stage priority:

  • Stage 1 — Intent-first: Intent, Outcome, Scope, Non-goals, Decision Boundaries
  • Stage 2 — Feasibility: Constraints, Success Criteria
  • Stage 3 — Brownfield grounding: Context Clarity (brownfield only)

Follow-up pressure ladder after each answer:

  1. Ask for a concrete example, counterexample, or evidence signal behind the latest claim
  2. Probe the hidden assumption, dependency, or belief that makes the claim true
  3. Force a boundary or tradeoff: what would you explicitly not do, defer, or reject?
  4. Challenge fuzzy or conflicting terms against the repo's documented language and current code behavior
  5. Stress-test the boundary with one concrete scenario or edge case when a relationship or handoff remains ambiguous
  6. If the answer still describes symptoms, reframe toward essence / root cause before moving on

Prefer staying on the same thread for multiple rounds when it has the highest leverage. Breadth without pressure is not progress.

Maintain a Breadth Ledger across independent ambiguity tracks: scope, constraints, outputs, verification, brownfield integration, and any user-mentioned deliverable tracks. The ledger is a guard, not a mandatory rotation rule: stay deep on the current thread until it has been pressure-tested, then zoom out only when another material track remains unresolved and would change execution.

Maintain a Docs/Terminology Ledger for brownfield interviews:

  • repo docs/rules/context sources inspected, with path references
  • canonical terms already used by the repo and terms to avoid or disambiguate
  • user terms that conflict with docs or current code behavior
  • doc/code mismatches that require a human decision before implementation
  • optional durable-doc follow-ups that are safe to propose but not auto-apply

Detailed dimensions:

  • Intent Clarity — why the user wants this
  • Outcome Clarity — what end state they want
  • Scope Clarity — how far the change should go
  • Constraint Clarity — technical or business limits that must hold
  • Success Criteria Clarity — how completion will be judged
  • Context Clarity — existing codebase understanding (brownfield only)

Non-goals and Decision Boundaries are mandatory readiness gates. Ask about them early and keep revisiting them until they are explicit.

2b) Ask the question

Use the surface-appropriate structured questioning path for every interview round. In attached-tmux sessions, use OMX-owned structured questioning via omx question (this is the required structured-question equivalent and required AskUserQuestion equivalent for deep-interview). Outside tmux, use native structured input when available; otherwise ask exactly one concise plain-text question and wait for the answer. Present:

Round {n} | Target: {weakest_dimension} | Ambiguity: {score}%

{question}

omx question payload guidance for interview rounds:

  • Deep-interview is Socratic: ask one focused round at a time. Do not use batch questions[] to combine multiple interview rounds, even though omx question supports batch forms for other workflows.
  • Use canonical type values instead of authoring raw multi_select flags by hand. type: "single-answerable" is the default for one-path decisions; type: "multi-answerable" is the canonical shape for bounded multi-select rounds. The runtime will keep multi_select aligned with type.
  • Use single-answerable when exactly one answer should drive the next branch, the options are mutually exclusive, or selecting more than one answer would blur the decision boundary. Typical cases: handoff lane selection, choosing the primary failure mode, or confirming which of several competing interpretations is correct.
  • Use multi-answerable when multiple options may all be true at once and you need to capture a bounded set of coexisting constraints, non-goals, risks, or acceptance checks in one round. Typical cases: selecting all out-of-scope items, all success metrics that must hold, or all deployment constraints that apply together.
  • If one selected option would immediately require a follow-up question to disambiguate the others, prefer a single-answerable round now and ask the follow-up next. Do not hide a branching interview tree inside one overloaded multi-select prompt.
  • Keep interview options bounded and concrete. If the valid answers are already known, set allow_other: false; only leave allow_other: true when the interview genuinely needs one user-supplied option that cannot be enumerated in advance.
  • Read answers structurally from the primary answers[] array. For a normal single-round interview response, use answers[0].answer as the source of truth; the top-level answer field is a legacy single-question projection/fallback only.
  • For single-answerable, expect one decisive selection in the value field of answers[0].answer plus its selected-values metadata. For multi-answerable, treat the selected-values field inside answers[0].answer as the source of truth for all chosen constraints/non-goals and preserve the full set in the transcript/spec. In legacy single-question projections, this is equivalent to: For multi-answerable, treat answer.selected_values as the source of truth.

Canonical bounded single-choice payload:

{
  "question": "Which execution lane should own this once the interview is complete?",
  "type": "single-answerable",
  "options": [
    {
      "label": "Plan first",
      "value": "ralplan",
      "description": "Need architecture and test-shape review before execution"
    },
    {
      "label": "Execute directly",
      "value": "autopilot",
      "description": "Requirements are already explicit enough for planning plus execution"
    },
    {
      "label": "Refine further",
      "value": "refine",
      "description": "Clarification is still needed before any handoff"
    }
  ],
  "allow_other": false,
  "other_label": "Other",
  "source": "deep-interview"
}

Canonical bounded multi-select payload:

{
  "question": "Which non-goals must stay out of scope for the first pass?",
  "type": "multi-answerable",
  "options": [
    {
      "label": "No UI redesign",
      "value": "no-ui-redesign",
      "description": "Keep layout and styling unchanged"
    },
    {
      "label": "No new dependencies",
      "value": "no-new-dependencies",
      "description": "Work within the existing toolchain"
    },
    {
      "label": "No API contract changes",
      "value": "no-api-contract-changes",
      "description": "Preserve external request and response shapes"
    }
  ],
  "allow_other": false,
  "other_label": "Other",
  "source": "deep-interview"
}

Canonical answer-shape reminders:

{
  "answer": {
    "kind": "option",
    "value": "ralplan",
    "selected_labels": ["Plan first"],
    "selected_values": ["ralplan"]
  }
}
{
  "answer": {
    "kind": "multi",
    "value": ["no-new-dependencies", "no-api-contract-changes"],
    "selected_labels": ["No new dependencies", "No API contract changes"],
    "selected_values": ["no-new-dependencies", "no-api-contract-changes"]
  }
}

2c) Score ambiguity

Score each weighted dimension in [0.0, 1.0] with justification + gap.

Greenfield: ambiguity = 1 - (intent × 0.30 + outcome × 0.25 + scope × 0.20 + constraints × 0.15 + success × 0.10)

Brownfield: ambiguity = 1 - (intent × 0.25 + outcome × 0.20 + scope × 0.20 + constraints × 0.15 + success × 0.10 + context × 0.10)

Readiness gate:

  • Non-goals must be explicit
  • Decision Boundaries must be explicit
  • A pressure pass must be complete: at least one earlier answer has been revisited with an evidence, assumption, or tradeoff follow-up
  • A practical closure audit must pass: another question would change execution materially, not merely polish wording or chase a narrow edge case
  • If either gate is unresolved, or the pressure pass is incomplete, continue below threshold only with a final closure question that names the unresolved gate and would materially change execution.
  • Treat a low ambiguity score as permission to audit closure, not permission to keep drilling indefinitely. If remaining uncertainty would not change implementation, crystallize the spec instead of opening a new branch.
  • If ambiguity is <= 0.10, another user-facing question is allowed only as that final closure question; otherwise crystallize immediately.

2d) Report progress

Show weighted breakdown table, readiness-gate status (Non-goals, Decision Boundaries), and the next focus dimension.

2e) Persist state

Append round result and updated scores via omx state write --input '<json>' --json; use state_write only when explicit MCP compatibility is enabled.

2f) Round controls

  • Do not offer early exit before the first explicit assumption probe and one persistent follow-up have happened
  • Apply a Dialectic Rhythm Guard: track consecutive non-user fact discoveries and confirmation-style answers ([from-code][auto-confirmed], [from-code], or [from-research]). After 3 consecutive non-user or confirmation answers, the next material user-facing round must solicit direct human judgment ([from-user]) unless the closure audit says the interview is ready to crystallize.
  • Round 4+: allow explicit early exit with risk warning
  • Soft warning at profile midpoint (e.g., round 3/6/10 depending on profile)
  • Hard cap at profile max_rounds; never treat this cap as a desired interview length or quota

Phase 3: Challenge Modes (assumption stress tests)

Use each mode once when applicable. These are normal escalation tools, not rare rescue moves:

  • Contrarian (round 2+ or immediately when an answer rests on an untested assumption): challenge core assumptions
  • Terminologist (brownfield, whenever a key term is fuzzy, overloaded, or conflicts with repo docs/code): force a canonical meaning against existing project language before implementation
  • Simplifier (round 4+ or when scope expands faster than outcome clarity): probe minimal viable scope
  • Ontologist (round 5+ and ambiguity > 0.25, or when the user keeps describing symptoms): ask for essence-level reframing

Track used modes in state to prevent repetition.

Phase 4: Crystallize Artifacts

When threshold is met (or user exits with warning / hard cap):

  1. Write interview transcript summary to:
    • .omx/interviews/{slug}-{timestamp}.md
      (kept for ralph PRD compatibility)
  2. Write execution-ready spec to:
    • .omx/specs/deep-interview-{slug}.md

Spec should include:

  • Metadata (profile, rounds, final ambiguity, threshold, context type)
  • Context snapshot reference/path (for ralplan/team reuse)
  • Prompt-safe initial-context summary when oversized context was provided, plus references to any full source documents
  • Clarity breakdown table
  • Intent (why the user wants this)
  • Desired Outcome
  • In-Scope
  • Out-of-Scope / Non-goals
  • Decision Boundaries (what OMX may decide without confirmation)
  • Constraints
  • Testable acceptance criteria
  • Assumptions exposed + resolutions
  • Pressure-pass findings (which answer was revisited, and what changed)
  • Brownfield evidence vs inference notes for any repository-grounded confirmation questions
  • Docs/Terminology Ledger with inspected repo docs/rules/context, term conflicts, and any doc/code mismatch decisions
  • Scenario/edge-case pressure findings that materially shaped scope or acceptance criteria
  • Optional durable documentation recommendations, explicitly marked opt-in and public-safe; do not include raw private transcript dumps
  • Technical context findings
  • Full or condensed transcript

Autoresearch specialization

When the clarified task is specifically about $autoresearch, or the skill is invoked with --autoresearch, keep the interview domain-specific and emit skill-consumable artifacts without skipping clarification.

  • Accepted seed inputs: topic, evaluator, keep-policy, slug, existing mission draft text, and prior evaluator examples/templates
  • Required interview focus: mission clarity, evaluator readiness, keep policy, slug/session naming, and whether the draft is ready to launch now or should refine further
  • Canonical artifact path: .omx/specs/deep-interview-autoresearch-{slug}.md
  • Launch artifact bundle: .omx/specs/autoresearch-{slug}/mission.md, .omx/specs/autoresearch-{slug}/sandbox.md, and .omx/specs/autoresearch-{slug}/result.json
  • Launch artifact directory: .omx/specs/autoresearch-{slug}/
  • Required artifact sections:
    • Mission Draft
    • Evaluator Draft
    • Launch Readiness
    • Seed Inputs
    • Confirmation Bridge
  • Required launch artifacts under .omx/specs/autoresearch-{slug}/:
    • mission.md
    • sandbox.md
    • result.json
  • Launch-readiness rule: mark the draft as not launch-ready while the evaluator command still contains placeholder markers such as <...>, TODO, TBD, REPLACE_ME, CHANGEME, or your-command-here
  • Structured result contract: result.json should point to the draft + mission/sandbox artifacts and carry the finalized topic, evaluatorCommand, keepPolicy, slug, launchReady, and blockedReasons fields so $autoresearch can consume it directly
  • Confirmation bridge: after artifact generation, offer at least refine further and launch; do not run direct CLI launch or detached/split tmux launch, and only hand off to $autoresearch after explicit confirmation
  • Handoff rule: downstream execution must preserve the clarified mission intent, evaluator expectations, decision boundaries, and launch-readiness status from this artifact rather than bypassing the draft review step

Phase 5: Execution Bridge

Present execution options after artifact generation using explicit handoff contracts. Treat the deep-interview spec as the current requirements source of truth and preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, and any residual-risk warnings across the handoff.

Optional execution contract foundation

When an Autopilot/deep-interview handoff explicitly requires a stride contract, emit it as structured data rather than prose. This is a validation foundation, not a broadness-inference feature: do not infer stride from task length, phase labels, snapshots, or freeform wording.

Canonical location under Autopilot state:

{
  "handoff_artifacts": {
    "deep_interview": {
      "execution_contract_required": true,
      "execution_contract": {
        "version": 1,
        "execution_stride": "task",
        "source": "deep-interview",
        "selected_by": "user",
        "allow_task_shrink": true,
        "completion_unit": "One focused task",
        "stop_condition": "Stop after that task is implemented and verified",
        "acceptance_coverage_scope": "task",
        "shrink_policy": "allowed"
      }
    }
  }
}

Stride meanings:

  • task: conservative, small-step execution; allow_task_shrink:true, acceptance_coverage_scope:"task", shrink_policy:"allowed".
  • deliverable: finish the named deliverable before stopping; allow_task_shrink:false, acceptance_coverage_scope:"deliverable", shrink_policy:"ask_before_shrink".
  • milestone: finish the larger approved milestone unless blocked; allow_task_shrink:false, acceptance_coverage_scope:"milestone", shrink_policy:"deny_unless_blocked".

Only set execution_contract_required:true when the selected downstream workflow needs this explicit stride/stop-condition guard. New artifacts must write the canonical snake_case schema shown above under handoff_artifacts.deep_interview; runtime readers may accept legacy camelCase field/marker aliases and direct/nested execution_contract locations only as compatibility input. If execution_contract_required is absent or false, downstream Autopilot compatibility behavior is unchanged.

Goal-mode follow-ups

Include these product-facing suggestions when they fit the clarified spec, without removing the existing $ultragoal, $ralplan, $autopilot, $ralph, and $team handoff options:

  • $ultragoal — default goal-mode follow-up for implementation or general goal-oriented follow-up specs that should be converted into durable Codex/OMX goals with sequential completion tracking.
  • $autoresearch-goal — use when the clarified context is a research project: a research question, reference/literature gathering, evaluator-backed analysis, or professor/critic-style deliverable.
  • $performance-goal — use when the clarified context is an optimization or performance project with measurable speed, latency, throughput, memory, benchmark, or evaluator criteria.

Recommend $ultragoal as the default durable goal-mode follow-up because it supersedes Ralph for goal tracking. Preserve $team for coordinated parallel implementation and keep $ralph only as an explicit fallback for persistent single-owner execution/verification when the user specifically selects it.

1. $ultragoal (Default durable execution follow-up)

  • Input Artifact: .omx/specs/deep-interview-{slug}.md (optionally accompanied by the transcript/context snapshot for traceability)
  • Invocation: $ultragoal create-goals --brief-file <spec-path> followed by $ultragoal complete-goals in the active execution lane
  • Consumer Behavior: Convert the clarified spec into durable goal-mode work. Preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, scenario-pressure findings, and residual-risk warnings as binding story constraints.
  • Skipped / Already-Satisfied Stages: Requirement interview, ambiguity clarification, doc/context preflight, and early intent-boundary elicitation
  • Expected Output: .omx/ultragoal/brief.md, .omx/ultragoal/goals.json, .omx/ultragoal/ledger.jsonl, implementation evidence, verification evidence, and final cleanup/review-gate evidence
  • Best When: The clarified spec is execution-ready or the user explicitly wants durable goal tracking as the next step
  • Next Recommended Step: Run the Ultragoal completion loop; launch $team only inside an active Ultragoal story when parallel lanes are warranted, and use $ralph only as an explicit fallback when the user asks for that legacy persistence mode

2. $ralplan (Recommended when architecture/test-shape review is still needed)

  • Input Artifact: .omx/specs/deep-interview-{slug}.md (optionally accompanied by the transcript/context snapshot for traceability)
  • Invocation: $plan --consensus --direct <spec-path>
  • Consumer Behavior: Treat the deep-interview spec as the requirements source of truth. Do not repeat the interview by default; refine architecture/feasibility around the clarified intent and boundaries instead.
  • Skipped / Already-Satisfied Stages: Requirements discovery, ambiguity clarification, and early intent-boundary elicitation
  • Expected Output: Canonical planning artifacts under .omx/plans/, especially prd-*.md and test-spec-*.md
  • Best When: Requirements are clear enough to stop interviewing, but architectural validation / consensus planning is still desirable
  • Next Recommended Step: Use the approved planning artifacts with $ultragoal as the default durable goal-mode follow-up (optionally with $team for parallel lanes); choose $autoresearch-goal for research validation or $performance-goal for measurable optimization, and use $ralph only as an explicit fallback when a narrow single-owner persistence loop is requested

3. $autopilot

  • Input Artifact: .omx/specs/deep-interview-{slug}.md
  • Invocation: $autopilot <spec-path>
  • Consumer Behavior: Use the deep-interview spec as the clarified execution brief. Preserve intent, non-goals, decision boundaries, and acceptance criteria as binding context for planning/execution.
  • Skipped / Already-Satisfied Stages: Initial requirement discovery and ambiguity reduction
  • Expected Output: Planning/execution progress, QA evidence, and validation artifacts produced by autopilot
  • Best When: The clarified spec is already strong enough for direct planning + execution without an additional consensus gate
  • Next Recommended Step: Continue through autopilot's execution/QA/validation flow; if coordination-heavy execution emerges, prefer $team under a leader-owned $ultragoal ledger, using $ralph only as an explicit fallback when a narrow single-owner persistence loop is requested

4. $ralph (Explicit fallback only)

  • Input Artifact: .omx/specs/deep-interview-{slug}.md
  • Invocation: $ralph <spec-path>
  • Consumer Behavior: Use the spec's acceptance criteria and boundary constraints as the persistence target. Do not reopen requirements discovery unless the user explicitly asks to refine further.
  • Skipped / Already-Satisfied Stages: Requirement interview, ambiguity clarification, and initial scope-definition work
  • Expected Output: Iterative execution progress and verification evidence tracked against the clarified criteria
  • Best When: The user explicitly asks for Ralph's persistent sequential completion pressure; otherwise use $ultragoal for durable goal tracking and completion checkpoints
  • Next Recommended Step: If this explicit fallback is selected, continue Ralph's persistence loop; if work expands into coordination-heavy lanes, hand off to $team under $ultragoal checkpointing rather than promoting Ralph as the next default

5. $team

  • Input Artifact: .omx/specs/deep-interview-{slug}.md
  • Invocation: $team <spec-path>
  • Consumer Behavior: Treat the spec as shared execution context for coordinated parallel work. Preserve the clarified intent, non-goals, decision boundaries, and acceptance criteria as common lane constraints.
  • Skipped / Already-Satisfied Stages: Requirement clarification and early ambiguity reduction
  • Expected Output: Coordinated multi-agent execution against the shared spec, with evidence that can later feed Ultragoal checkpoints by default, or an explicit Ralph verification pass only when requested
  • Best When: The task is large, multi-lane, or blocker-sensitive enough to justify coordinated parallel execution instead of a single persistent loop
  • Next Recommended Step: Follow the team verification path when the coordinated execution phase finishes; checkpoint completion through $ultragoal by default, escalating to a separate Ralph loop only when the user explicitly asks for that persistent verification/fix owner

6. Refine further

  • Input Artifact: Existing transcript, context snapshot, and current spec draft
  • Invocation: Continue the interview loop
  • Consumer Behavior: Re-enter questioning to resolve the highest-leverage remaining uncertainty
  • Skipped / Already-Satisfied Stages: None beyond already-captured context
  • Expected Output: A lower-ambiguity spec with tighter boundaries and fewer unresolved assumptions
  • Best When: Residual ambiguity is still too high, the user wants stronger clarity, or the above-threshold / early-exit warning indicates too much risk to proceed cleanly
  • Next Recommended Step: Return to one of the execution handoff contracts above once the spec is sufficiently clarified

Residual-Risk Rule: If the interview ended via early exit, hard-cap completion, or above-threshold proceed-with-warning, explicitly preserve that residual-risk state in the handoff so the downstream skill knows it inherited a partially clarified brief.

IMPORTANT: Deep-interview is a requirements mode. On handoff, invoke the selected skill using the contract above. Do NOT implement directly inside deep-interview.

- Use `explore` for codebase fact gathering - Use `omx question` as the OMX-native structured user-input tool for each interview round when an attached tmux renderer is available - From attached-tmux Bash/tool paths, call it as `OMX_QUESTION_RETURN_PANE=$TMUX_PANE omx question ...` unless an explicit `%pane` return target is already known - If the current runtime is outside tmux and cannot render `omx question`, use native structured input when available; otherwise ask exactly one concise plain-text question and wait for the answer - After `omx question` returns JSON, prefer `answers[0].answer` / `answers[]`; use legacy `answer` only as a fallback for older records - Use `omx state write/read --input '' --json` for resumable mode state; `state_write` / `state_read` are explicit MCP compatibility fallbacks only - If the interview cannot ask a required `omx question` round, persist the blocker as terminal state with `active: false` and `current_phase: "blocked"`; do not write a terminal blocked phase with `active: true` - Read/write context snapshots under `.omx/context/` - Read applicable repo docs/rules/context during preflight; write durable docs, glossary, ADR, or memory updates only when the user explicitly opts in and the content is public-safe - Record whether the oversized-context summary gate is not needed, pending, or satisfied before any scoring or handoff step - Save transcript/spec artifacts under `.omx/interviews/` and `.omx/specs/` - User says stop/cancel/abort -> persist state and stop - Ambiguity stalls for 3 rounds (+/- 0.05) -> force Ontologist mode once - Max rounds reached -> proceed with explicit residual-risk warning - All dimensions >= 0.9 -> allow early crystallization even before max rounds - [ ] Preflight context snapshot exists under `.omx/context/{slug}-{timestamp}.md` - [ ] Oversized initial context, if present, has a prompt-safe summary recorded before ambiguity scoring or downstream handoff - [ ] Ambiguity score shown each round - [ ] Intent-first stage priority used before implementation detail - [ ] Weakest-dimension targeting used within the active stage - [ ] At least one explicit assumption probe happened before crystallization - [ ] At least one persistent follow-up / pressure pass deepened a prior answer - [ ] Challenge modes triggered at thresholds (when applicable) - [ ] Transcript written to `.omx/interviews/{slug}-{timestamp}.md` - [ ] Spec written to `.omx/specs/deep-interview-{slug}.md` - [ ] Brownfield questions use evidence-backed confirmation when applicable - [ ] Brownfield preflight inspected applicable repo docs/rules/context before user-facing questions - [ ] Fuzzy or conflicting terminology was challenged against repo language/current code behavior when applicable - [ ] Scenario-based edge-case grilling was used when boundary ambiguity would materially affect implementation - [ ] Durable docs/ADR/memory updates, if any, were explicitly opted into and public-safe - [ ] Handoff options provided (`$ultragoal`, `$ralplan`, `$autopilot`, `$ralph`, `$team`) plus context-sensitive goal-mode suggestions (`$autoresearch-goal`, `$performance-goal`) when applicable - [ ] No direct implementation performed in this mode ## Suggested Config (optional)

Deep-interview reads runtime defaults from the first existing config source in this order:

  1. Repository-local .omx/config.toml
  2. Repository-root omx.toml
  3. User-global ~/.omx/config.toml

This section is currently a deep-interview-specific runtime override surface, not a general replacement for Codex config.toml or .omx-config.json model/env routing. Malformed config files are ignored fail-soft so $deep-interview activation can continue with built-in defaults. Explicit --quick, --standard, or --deep invocation flags override defaultProfile.

[omx.deepInterview]
defaultProfile = "standard"
quickThreshold = 0.30
standardThreshold = 0.20
deepThreshold = 0.15
quickMaxRounds = 5
standardMaxRounds = 12
deepMaxRounds = 20
enableChallengeModes = true

Resume

If interrupted, rerun $deep-interview. Resume from persisted mode state via omx state read --input '{"mode":"deep-interview"}' --json.

Recommended 3-Stage Pipeline

deep-interview -> ralplan -> autopilot
  • Stage 1 (deep-interview): clarity gate
  • Stage 2 (ralplan): feasibility + architecture gate
  • Stage 3 (autopilot): execution + QA + validation gate
Install via CLI
npx skills add https://github.com/Yeachan-Heo/oh-my-codex --skill deep-interview
Repository Details
star Stars 31,009
call_split Forks 2,436
navigation Branch main
article Path SKILL.md
More from Creator