name: deep-interview
description: Socratic deep interview with mathematical ambiguity gating before execution
argument-hint: "[--quick|--standard|--deep] [--autoresearch] "
Profile max rounds is a hard cap, not a target. Do not continue only to reach a numbered round count. Extra Socratic rigor does not override the active threshold unless the profile/config changes.
If no flag is provided, use Standard.
Phase 0: Preflight Context Intake
- Parse {{ARGUMENTS}} and derive a short task slug.
- Attempt to load the latest relevant context snapshot from .omx/context/{slug}-*.md.
- Check whether the provided initial context or loaded snapshot is too large for safe prompt use. If it is oversized, the first interview round must ask for a concise prompt-safe summary instead of scoring ambiguity or continuing to downstream handoff.
- If no snapshot exists, create a minimum context snapshot with:
- Task statement
- Desired outcome
- Stated solution (what the user asked for)
- Probable intent hypothesis (why they likely want it)
- Known facts/evidence
- Constraints
- Unknowns/open questions
- Decision-boundary unknowns
- Likely codebase touchpoints
- Relevant repo docs/rules/context inspected
- Terminology or doc/code conflicts found
- Prompt-safe initial-context summary status (not_needed, needed, or recorded)
- For brownfield tasks, inspect the applicable documentation/rule surface before the first user-facing round. Prefer exact, nearby sources over broad scans:
- governing AGENTS.md files and template/runtime instruction surfaces that apply to the touched paths
- README/getting-started docs and relevant docs under docs/, especially contracts, plans, ADR-like records, and workflow docs
- existing .omx/context/ snapshots, .omx/specs/, and planning artifacts relevant to the slug
- project-local glossary/context files such as CONTEXT.md, CONTEXT-MAP.md, or context-specific docs when they exist
- Save snapshot to .omx/context/{slug}-{timestamp}.md (UTC YYYYMMDDTHHMMSSZ) and reference it in mode state.
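The snapshot naming rule above can be sketched as follows. This is a minimal illustration, not the OMX implementation: `make_slug` and its word limit are hypothetical, and only the path template and the UTC `YYYYMMDDTHHMMSSZ` stamp come from the text.

```python
import re
from datetime import datetime, timezone


def make_slug(task: str, max_words: int = 6) -> str:
    """Hypothetical slug rule: lowercase alphanumeric words joined by hyphens."""
    words = re.findall(r"[a-z0-9]+", task.lower())
    return "-".join(words[:max_words]) or "task"


def snapshot_path(slug: str, now: datetime) -> str:
    """Build .omx/context/{slug}-{timestamp}.md with a UTC YYYYMMDDTHHMMSSZ stamp."""
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f".omx/context/{slug}-{stamp}.md"


if __name__ == "__main__":
    print(snapshot_path(make_slug("Add OAuth login!"), datetime.now(timezone.utc)))
```

Passing `now` explicitly keeps the function deterministic, which makes the naming rule easy to check in isolation.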
Phase 1: Initialize
- Parse {{ARGUMENTS}} and depth profile (--quick|--standard|--deep).
- Detect project context:
- Run explore to classify brownfield (existing codebase target) vs greenfield.
- For brownfield, collect relevant codebase context before questioning.
- Initialize state via omx state write --input '{"mode":"deep-interview","active":true}' --json:
{
"active": true,
"current_phase": "deep-interview",
"state": {
"interview_id": "<uuid>",
"profile": "quick|standard|deep",
"type": "greenfield|brownfield",
"initial_idea": "<user input>",
"rounds": [],
"current_ambiguity": 1.0,
"threshold": 0.3,
"max_rounds": 5,
"challenge_modes_used": [],
"codebase_context": null,
"current_stage": "intent-first",
"current_focus": "intent",
"context_snapshot_path": ".omx/context/<slug>-<timestamp>.md"
}
}
- Announce kickoff with profile, threshold, and current ambiguity.
Phase 2: Socratic Interview Loop
Repeat until ambiguity <= threshold, the pressure pass is complete, the readiness gates are explicit, the user exits with warning, or max rounds are reached. This is a stop condition: below threshold, do not open a new ordinary interview branch.
2a) Generate next question
If the initial context is oversized and no prompt-safe summary has been recorded yet, the next question must be only a summary request. Do not score ambiguity, do not run readiness gates, and do not hand off to $ultragoal, $ralplan, $autopilot, $ralph, or $team until that summary answer is captured.
Use:
- Original idea
- Prior Q&A rounds
- Current dimension scores
- Brownfield context (if any)
- Doc/context grounding notes, including existing terminology, governing rules, and any doc/code mismatch
- Activated challenge mode injection (Phase 3)
Target the lowest-scoring dimension, but respect stage priority:
- Stage 1 — Intent-first: Intent, Outcome, Scope, Non-goals, Decision Boundaries
- Stage 2 — Feasibility: Constraints, Success Criteria
- Stage 3 — Brownfield grounding: Context Clarity (brownfield only)
Follow-up pressure ladder after each answer:
- Ask for a concrete example, counterexample, or evidence signal behind the latest claim
- Probe the hidden assumption, dependency, or belief that makes the claim true
- Force a boundary or tradeoff: what would you explicitly not do, defer, or reject?
- Challenge fuzzy or conflicting terms against the repo's documented language and current code behavior
- Stress-test the boundary with one concrete scenario or edge case when a relationship or handoff remains ambiguous
- If the answer still describes symptoms, reframe toward essence / root cause before moving on
Prefer staying on the same thread for multiple rounds when it has the highest leverage. Breadth without pressure is not progress.
Maintain a Breadth Ledger across independent ambiguity tracks: scope, constraints, outputs, verification, brownfield integration, and any user-mentioned deliverable tracks. The ledger is a guard, not a mandatory rotation rule: stay deep on the current thread until it has been pressure-tested, then zoom out only when another material track remains unresolved and would change execution.
Maintain a Docs/Terminology Ledger for brownfield interviews:
- repo docs/rules/context sources inspected, with path references
- canonical terms already used by the repo and terms to avoid or disambiguate
- user terms that conflict with docs or current code behavior
- doc/code mismatches that require a human decision before implementation
- optional durable-doc follow-ups that are safe to propose but not auto-apply
Detailed dimensions:
- Intent Clarity — why the user wants this
- Outcome Clarity — what end state they want
- Scope Clarity — how far the change should go
- Constraint Clarity — technical or business limits that must hold
- Success Criteria Clarity — how completion will be judged
- Context Clarity — existing codebase understanding (brownfield only)
Non-goals and Decision Boundaries are mandatory readiness gates. Ask about them early and keep revisiting them until they are explicit.
2b) Ask the question
Use the surface-appropriate structured questioning path for every interview round. In attached-tmux sessions, use OMX-owned structured questioning via omx question (this is the required structured-question equivalent and required AskUserQuestion equivalent for deep-interview). Outside tmux, use native structured input when available; otherwise ask exactly one concise plain-text question and wait for the answer. Present:
Round {n} | Target: {weakest_dimension} | Ambiguity: {score}%
{question}
omx question payload guidance for interview rounds:
- Deep-interview is Socratic: ask one focused round at a time. Do not use batch questions[] to combine multiple interview rounds, even though omx question supports batch forms for other workflows.
- Use canonical type values instead of authoring raw multi_select flags by hand. type: "single-answerable" is the default for one-path decisions; type: "multi-answerable" is the canonical shape for bounded multi-select rounds. The runtime will keep multi_select aligned with type.
- Use single-answerable when exactly one answer should drive the next branch, the options are mutually exclusive, or selecting more than one answer would blur the decision boundary. Typical cases: handoff lane selection, choosing the primary failure mode, or confirming which of several competing interpretations is correct.
- Use multi-answerable when multiple options may all be true at once and you need to capture a bounded set of coexisting constraints, non-goals, risks, or acceptance checks in one round. Typical cases: selecting all out-of-scope items, all success metrics that must hold, or all deployment constraints that apply together.
- If one selected option would immediately require a follow-up question to disambiguate the others, prefer a single-answerable round now and ask the follow-up next. Do not hide a branching interview tree inside one overloaded multi-select prompt.
- Keep interview options bounded and concrete. If the valid answers are already known, set allow_other: false; only leave allow_other: true when the interview genuinely needs one user-supplied option that cannot be enumerated in advance.
- Read answers structurally from the primary answers[] array. For a normal single-round interview response, use answers[0].answer as the source of truth; the top-level answer field is a legacy single-question projection/fallback only.
- For single-answerable, expect one decisive selection in the value field of answers[0].answer plus its selected-values metadata. For multi-answerable, treat the selected-values field inside answers[0].answer as the source of truth for all chosen constraints/non-goals and preserve the full set in the transcript/spec. In legacy single-question projections, this is equivalent to: For multi-answerable, treat answer.selected_values as the source of truth.
Canonical bounded single-choice payload:
{
"question": "Which execution lane should own this once the interview is complete?",
"type": "single-answerable",
"options": [
{
"label": "Plan first",
"value": "ralplan",
"description": "Need architecture and test-shape review before execution"
},
{
"label": "Execute directly",
"value": "autopilot",
"description": "Requirements are already explicit enough for planning plus execution"
},
{
"label": "Refine further",
"value": "refine",
"description": "Clarification is still needed before any handoff"
}
],
"allow_other": false,
"other_label": "Other",
"source": "deep-interview"
}
Canonical bounded multi-select payload:
{
"question": "Which non-goals must stay out of scope for the first pass?",
"type": "multi-answerable",
"options": [
{
"label": "No UI redesign",
"value": "no-ui-redesign",
"description": "Keep layout and styling unchanged"
},
{
"label": "No new dependencies",
"value": "no-new-dependencies",
"description": "Work within the existing toolchain"
},
{
"label": "No API contract changes",
"value": "no-api-contract-changes",
"description": "Preserve external request and response shapes"
}
],
"allow_other": false,
"other_label": "Other",
"source": "deep-interview"
}
Canonical answer-shape reminders:
{
"answer": {
"kind": "option",
"value": "ralplan",
"selected_labels": ["Plan first"],
"selected_values": ["ralplan"]
}
}
{
"answer": {
"kind": "multi",
"value": ["no-new-dependencies", "no-api-contract-changes"],
"selected_labels": ["No new dependencies", "No API contract changes"],
"selected_values": ["no-new-dependencies", "no-api-contract-changes"]
}
}
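Reading an answer structurally, as described in the payload guidance, might look like the sketch below. It assumes each `answers[]` entry carries an `answer` object shaped like the reminders above; the function name and error handling are illustrative, not part of `omx question`.

```python
def selected_values(response: dict) -> list:
    """Return the chosen option values from an interview-round response.

    Prefers answers[0].answer and falls back to the legacy top-level
    "answer" projection only when the answers[] array is absent or empty.
    """
    answers = response.get("answers") or []
    answer = answers[0]["answer"] if answers else response.get("answer")
    if answer is None:
        raise ValueError("response carries no answer")
    values = answer.get("selected_values")
    if values is None:
        # Assumed fallback: normalize "value" (string or list) to a list.
        value = answer.get("value")
        values = value if isinstance(value, list) else [value]
    return list(values)


if __name__ == "__main__":
    legacy = {"answer": {"kind": "option", "value": "ralplan",
                         "selected_labels": ["Plan first"],
                         "selected_values": ["ralplan"]}}
    print(selected_values(legacy))
```

Normalizing both answer kinds to a list keeps the transcript/spec writer from branching on `kind`.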
2c) Score ambiguity
Score each weighted dimension in [0.0, 1.0] with justification + gap.
Greenfield: ambiguity = 1 - (intent × 0.30 + outcome × 0.25 + scope × 0.20 + constraints × 0.15 + success × 0.10)
Brownfield: ambiguity = 1 - (intent × 0.25 + outcome × 0.20 + scope × 0.20 + constraints × 0.15 + success × 0.10 + context × 0.10)
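The two formulas can be checked with a small sketch. The weights are taken directly from the lines above; the function name, the dictionary keys, and the range validation are illustrative assumptions.

```python
GREENFIELD_WEIGHTS = {
    "intent": 0.30, "outcome": 0.25, "scope": 0.20,
    "constraints": 0.15, "success": 0.10,
}
BROWNFIELD_WEIGHTS = {
    "intent": 0.25, "outcome": 0.20, "scope": 0.20,
    "constraints": 0.15, "success": 0.10, "context": 0.10,
}


def ambiguity(scores: dict, brownfield: bool = False) -> float:
    """ambiguity = 1 - weighted clarity, with every score in [0.0, 1.0]."""
    weights = BROWNFIELD_WEIGHTS if brownfield else GREENFIELD_WEIGHTS
    for name in weights:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} score must be in [0.0, 1.0]")
    clarity = sum(weights[name] * scores[name] for name in weights)
    return 1.0 - clarity


if __name__ == "__main__":
    round_scores = {"intent": 0.8, "outcome": 0.6, "scope": 0.5,
                    "constraints": 0.4, "success": 0.5}
    # 1 - (0.24 + 0.15 + 0.10 + 0.06 + 0.05) = 0.40
    print(f"{ambiguity(round_scores):.2f}")
```

Both weight sets sum to 1.0, so an interview with every dimension at 0.0 starts at ambiguity 1.0 and full clarity yields 0.0.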
Readiness gate:
- Non-goals must be explicit
- Decision Boundaries must be explicit
- A pressure pass must be complete: at least one earlier answer has been revisited with an evidence, assumption, or tradeoff follow-up
- A practical closure audit must pass: another question would change execution materially, not merely polish wording or chase a narrow edge case
- If either gate is unresolved, or the pressure pass is incomplete, continue below threshold only with a final closure question that names the unresolved gate and would materially change execution.
- Treat a low ambiguity score as permission to audit closure, not permission to keep drilling indefinitely. If remaining uncertainty would not change implementation, crystallize the spec instead of opening a new branch.
- If ambiguity is <= 0.10, another user-facing question is allowed only as that final closure question; otherwise crystallize immediately.
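One possible reading of the readiness gate, sketched as a decision function. This is an interpretation under stated assumptions: the return labels and the `question_changes_execution` flag (standing in for the closure audit) are hypothetical, and round caps and early exit are handled elsewhere.

```python
def next_step(ambiguity: float, threshold: float,
              non_goals_explicit: bool, boundaries_explicit: bool,
              pressure_pass_done: bool,
              question_changes_execution: bool) -> tuple:
    """Return (action, unresolved_gates) for the current round."""
    unresolved = []
    if not non_goals_explicit:
        unresolved.append("Non-goals")
    if not boundaries_explicit:
        unresolved.append("Decision Boundaries")
    if not pressure_pass_done:
        unresolved.append("Pressure pass")

    if ambiguity > threshold:
        return ("ask", unresolved)  # ordinary interview round
    if unresolved and question_changes_execution:
        # Below threshold: only a final closure question naming the gate.
        return ("final_closure_question", unresolved)
    return ("crystallize", unresolved)


if __name__ == "__main__":
    print(next_step(0.15, 0.20, False, True, True, True))
```

Under this reading a low score never opens a new branch: the only question allowed below threshold is the one that names an unresolved gate.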
2d) Report progress
Show weighted breakdown table, readiness-gate status (Non-goals, Decision Boundaries), and the next focus dimension.
2e) Persist state
Append round result and updated scores via omx state write --input '<json>' --json; use state_write only when explicit MCP compatibility is enabled.
2f) Round controls
- Do not offer early exit before the first explicit assumption probe and one persistent follow-up have happened
- Apply a Dialectic Rhythm Guard: track consecutive non-user fact discoveries and confirmation-style answers ([from-code][auto-confirmed], [from-code], or [from-research]). After 3 consecutive non-user or confirmation answers, the next material user-facing round must solicit direct human judgment ([from-user]) unless the closure audit says the interview is ready to crystallize.
- Round 4+: allow explicit early exit with risk warning
- Soft warning at profile midpoint (e.g., round 3/6/10 depending on profile)
- Hard cap at profile max_rounds; never treat this cap as a desired interview length or quota
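The round controls can be sketched as two small helpers. The midpoint formula is an assumption chosen so that caps of 5, 12, and 20 give rounds 3, 6, and 10; the source-tag strings and function names are likewise illustrative.

```python
def round_controls(round_number: int, max_rounds: int,
                   assumption_probed: bool, follow_up_done: bool) -> dict:
    """Evaluate the per-round controls for the active profile."""
    midpoint = (max_rounds + 1) // 2  # assumed rule: 5 -> 3, 12 -> 6, 20 -> 10
    return {
        "soft_warning": round_number == midpoint,
        # Early exit needs round 4+, one assumption probe, one follow-up.
        "early_exit_allowed": (round_number >= 4 and assumption_probed
                               and follow_up_done),
        "hard_cap_reached": round_number >= max_rounds,
    }


def needs_user_judgment(sources: list, limit: int = 3) -> bool:
    """Dialectic Rhythm Guard: True after `limit` trailing non-user answers."""
    streak = 0
    for tag in reversed(sources):
        if tag == "from-user":
            break
        streak += 1
    return streak >= limit


if __name__ == "__main__":
    print(round_controls(6, 12, True, True))
    print(needs_user_judgment(["from-user", "from-code", "from-code", "from-research"]))
```

The closure-audit exception to the rhythm guard is left to the caller, since it depends on the readiness gate rather than on the answer history.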
Phase 3: Challenge Modes (assumption stress tests)
Use each mode once when applicable. These are normal escalation tools, not rare rescue moves:
- Contrarian (round 2+ or immediately when an answer rests on an untested assumption): challenge core assumptions
- Terminologist (brownfield, whenever a key term is fuzzy, overloaded, or conflicts with repo docs/code): force a canonical meaning against existing project language before implementation
- Simplifier (round 4+ or when scope expands faster than outcome clarity): probe minimal viable scope
- Ontologist (round 5+ and ambiguity > 0.25, or when the user keeps describing symptoms): ask for essence-level reframing
Track used modes in state to prevent repetition.
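A minimal sketch of mode selection under the triggers listed above. The signal names (`untested_assumption`, `fuzzy_term`, `scope_creep`, `symptoms_only`) and the first-match ordering are assumptions made for illustration; only the round and ambiguity conditions come from the text.

```python
def pick_challenge_mode(round_number: int, ambiguity: float, brownfield: bool,
                        used: list, signals: set):
    """Return the first applicable challenge mode not yet used, else None."""
    candidates = [
        ("contrarian", round_number >= 2 or "untested_assumption" in signals),
        ("terminologist", brownfield and "fuzzy_term" in signals),
        ("simplifier", round_number >= 4 or "scope_creep" in signals),
        ("ontologist", (round_number >= 5 and ambiguity > 0.25)
                       or "symptoms_only" in signals),
    ]
    for mode, applicable in candidates:
        if applicable and mode not in used:
            return mode
    return None


if __name__ == "__main__":
    used_modes = []
    mode = pick_challenge_mode(2, 0.8, False, used_modes, set())
    if mode:
        used_modes.append(mode)  # mirrors challenge_modes_used in state
    print(mode, used_modes)
```

Appending the returned mode to `challenge_modes_used` is what enforces the use-once rule across rounds.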
Phase 4: Crystallize Artifacts
When threshold is met (or user exits with warning / hard cap):
- Write interview transcript summary to .omx/interviews/{slug}-{timestamp}.md (kept for ralph PRD compatibility)
- Write execution-ready spec to .omx/specs/deep-interview-{slug}.md
Spec should include:
- Metadata (profile, rounds, final ambiguity, threshold, context type)
- Context snapshot reference/path (for ralplan/team reuse)
- Prompt-safe initial-context summary when oversized context was provided, plus references to any full source documents
- Clarity breakdown table
- Intent (why the user wants this)
- Desired Outcome
- In-Scope
- Out-of-Scope / Non-goals
- Decision Boundaries (what OMX may decide without confirmation)
- Constraints
- Testable acceptance criteria
- Assumptions exposed + resolutions
- Pressure-pass findings (which answer was revisited, and what changed)
- Brownfield evidence vs inference notes for any repository-grounded confirmation questions
- Docs/Terminology Ledger with inspected repo docs/rules/context, term conflicts, and any doc/code mismatch decisions
- Scenario/edge-case pressure findings that materially shaped scope or acceptance criteria
- Optional durable documentation recommendations, explicitly marked opt-in and public-safe; do not include raw private transcript dumps
- Technical context findings
- Full or condensed transcript
Autoresearch specialization
When the clarified task is specifically about $autoresearch, or the skill is invoked with --autoresearch, keep the interview domain-specific and emit skill-consumable artifacts without skipping clarification.
- Accepted seed inputs: topic, evaluator, keep-policy, slug, existing mission draft text, and prior evaluator examples/templates
- Required interview focus: mission clarity, evaluator readiness, keep policy, slug/session naming, and whether the draft is ready to launch now or should refine further
- Canonical artifact path: .omx/specs/deep-interview-autoresearch-{slug}.md
- Launch artifact bundle: .omx/specs/autoresearch-{slug}/mission.md, .omx/specs/autoresearch-{slug}/sandbox.md, and .omx/specs/autoresearch-{slug}/result.json
- Launch artifact directory: .omx/specs/autoresearch-{slug}/
- Required artifact sections: Mission Draft, Evaluator Draft, Launch Readiness, Seed Inputs, Confirmation Bridge
- Required launch artifacts under .omx/specs/autoresearch-{slug}/: mission.md, sandbox.md, result.json
- Launch-readiness rule: mark the draft as not launch-ready while the evaluator command still contains placeholder markers such as <...>, TODO, TBD, REPLACE_ME, CHANGEME, or your-command-here
- Structured result contract: result.json should point to the draft + mission/sandbox artifacts and carry the finalized topic, evaluatorCommand, keepPolicy, slug, launchReady, and blockedReasons fields so $autoresearch can consume it directly
- Confirmation bridge: after artifact generation, offer at least refine further and launch; do not run direct CLI launch or detached/split tmux launch, and only hand off to $autoresearch after explicit confirmation
- Handoff rule: downstream execution must preserve the clarified mission intent, evaluator expectations, decision boundaries, and launch-readiness status from this artifact rather than bypassing the draft review step
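The launch-readiness rule could be approximated with a placeholder scan like the one below. The regular expressions are assumptions: in particular, `<...>` is read as any angle-bracket token without whitespace, and treating an empty command as blocked is an added guess rather than something the rule states.

```python
import re

# Assumed patterns for the placeholder markers named in the rule.
PLACEHOLDER_PATTERNS = {
    "an angle-bracket placeholder": re.compile(r"<[^<>\s]*>"),
    "a TODO/TBD marker": re.compile(r"\b(?:TODO|TBD)\b"),
    "a REPLACE_ME/CHANGEME marker": re.compile(r"\b(?:REPLACE_ME|CHANGEME)\b"),
    "your-command-here": re.compile(r"your-command-here"),
}


def blocked_reasons(evaluator_command: str) -> list:
    """List the reasons an evaluator command is not launch-ready."""
    reasons = []
    if not evaluator_command.strip():
        reasons.append("evaluator command is empty")  # assumption, see lead-in
    for label, pattern in PLACEHOLDER_PATTERNS.items():
        if pattern.search(evaluator_command):
            reasons.append(f"evaluator command contains {label}")
    return reasons


def launch_ready(evaluator_command: str) -> bool:
    return not blocked_reasons(evaluator_command)


if __name__ == "__main__":
    print(blocked_reasons("python eval.py --data TBD"))
```

The reasons list maps naturally onto the `blockedReasons` field of result.json, with `launchReady` derived from whether it is empty.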
Phase 5: Execution Bridge
Present execution options after artifact generation using explicit handoff contracts. Treat the deep-interview spec as the current requirements source of truth and preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, and any residual-risk warnings across the handoff.
Optional execution contract foundation
When an Autopilot/deep-interview handoff explicitly requires a stride contract, emit it as structured data rather than prose. This is a validation foundation, not a broadness-inference feature: do not infer stride from task length, phase labels, snapshots, or freeform wording.
Canonical location under Autopilot state:
{
"handoff_artifacts": {
"deep_interview": {
"execution_contract_required": true,
"execution_contract": {
"version": 1,
"execution_stride": "task",
"source": "deep-interview",
"selected_by": "user",
"allow_task_shrink": true,
"completion_unit": "One focused task",
"stop_condition": "Stop after that task is implemented and verified",
"acceptance_coverage_scope": "task",
"shrink_policy": "allowed"
}
}
}
}
Stride meanings:
- task: conservative, small-step execution; allow_task_shrink: true, acceptance_coverage_scope: "task", shrink_policy: "allowed".
- deliverable: finish the named deliverable before stopping; allow_task_shrink: false, acceptance_coverage_scope: "deliverable", shrink_policy: "ask_before_shrink".
- milestone: finish the larger approved milestone unless blocked; allow_task_shrink: false, acceptance_coverage_scope: "milestone", shrink_policy: "deny_unless_blocked".
Only set execution_contract_required:true when the selected downstream workflow needs this explicit stride/stop-condition guard. New artifacts must write the canonical snake_case schema shown above under handoff_artifacts.deep_interview; runtime readers may accept legacy camelCase field/marker aliases and direct/nested execution_contract locations only as compatibility input. If execution_contract_required is absent or false, downstream Autopilot compatibility behavior is unchanged.
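A sketch of emitting the canonical snake_case contract from a chosen stride. The per-stride field values mirror the stride meanings above; `build_contract` and its parameters are hypothetical helpers, not an OMX API.

```python
import json

# Field values per stride, copied from the stride meanings.
STRIDES = {
    "task": {"allow_task_shrink": True,
             "acceptance_coverage_scope": "task",
             "shrink_policy": "allowed"},
    "deliverable": {"allow_task_shrink": False,
                    "acceptance_coverage_scope": "deliverable",
                    "shrink_policy": "ask_before_shrink"},
    "milestone": {"allow_task_shrink": False,
                  "acceptance_coverage_scope": "milestone",
                  "shrink_policy": "deny_unless_blocked"},
}


def build_contract(stride: str, completion_unit: str, stop_condition: str,
                   selected_by: str = "user") -> dict:
    """Build the handoff_artifacts.deep_interview execution contract."""
    if stride not in STRIDES:
        raise ValueError(f"unknown execution_stride: {stride!r}")
    contract = {
        "version": 1,
        "execution_stride": stride,
        "source": "deep-interview",
        "selected_by": selected_by,
        "completion_unit": completion_unit,
        "stop_condition": stop_condition,
        **STRIDES[stride],
    }
    return {"handoff_artifacts": {"deep_interview": {
        "execution_contract_required": True,
        "execution_contract": contract,
    }}}


if __name__ == "__main__":
    print(json.dumps(build_contract(
        "task", "One focused task",
        "Stop after that task is implemented and verified"), indent=2))
```

Rejecting unknown strides up front keeps the stride an explicit choice rather than something inferred from freeform wording.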
Goal-mode follow-ups
Include these product-facing suggestions when they fit the clarified spec, without removing the existing $ultragoal, $ralplan, $autopilot, $ralph, and $team handoff options:
- $ultragoal: default goal-mode follow-up for implementation or general goal-oriented follow-up specs that should be converted into durable Codex/OMX goals with sequential completion tracking.
- $autoresearch-goal: use when the clarified context is a research project: a research question, reference/literature gathering, evaluator-backed analysis, or professor/critic-style deliverable.
- $performance-goal: use when the clarified context is an optimization or performance project with measurable speed, latency, throughput, memory, benchmark, or evaluator criteria.
Recommend $ultragoal as the default durable goal-mode follow-up because it supersedes Ralph for goal tracking. Preserve $team for coordinated parallel implementation and keep $ralph only as an explicit fallback for persistent single-owner execution/verification when the user specifically selects it.
1. $ultragoal (Default durable execution follow-up)
- Input Artifact: .omx/specs/deep-interview-{slug}.md (optionally accompanied by the transcript/context snapshot for traceability)
- Invocation: $ultragoal create-goals --brief-file <spec-path> followed by $ultragoal complete-goals in the active execution lane
- Consumer Behavior: Convert the clarified spec into durable goal-mode work. Preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, scenario-pressure findings, and residual-risk warnings as binding story constraints.
- Skipped / Already-Satisfied Stages: Requirement interview, ambiguity clarification, doc/context preflight, and early intent-boundary elicitation
- Expected Output: .omx/ultragoal/brief.md, .omx/ultragoal/goals.json, .omx/ultragoal/ledger.jsonl, implementation evidence, verification evidence, and final cleanup/review-gate evidence
- Best When: The clarified spec is execution-ready or the user explicitly wants durable goal tracking as the next step
- Next Recommended Step: Run the Ultragoal completion loop; launch $team only inside an active Ultragoal story when parallel lanes are warranted, and use $ralph only as an explicit fallback when the user asks for that legacy persistence mode
2. $ralplan (Recommended when architecture/test-shape review is still needed)
- Input Artifact: .omx/specs/deep-interview-{slug}.md (optionally accompanied by the transcript/context snapshot for traceability)
- Invocation: $plan --consensus --direct <spec-path>
- Consumer Behavior: Treat the deep-interview spec as the requirements source of truth. Do not repeat the interview by default; refine architecture/feasibility around the clarified intent and boundaries instead.
- Skipped / Already-Satisfied Stages: Requirements discovery, ambiguity clarification, and early intent-boundary elicitation
- Expected Output: Canonical planning artifacts under .omx/plans/, especially prd-*.md and test-spec-*.md
- Best When: Requirements are clear enough to stop interviewing, but architectural validation / consensus planning is still desirable
- Next Recommended Step: Use the approved planning artifacts with $ultragoal as the default durable goal-mode follow-up (optionally with $team for parallel lanes); choose $autoresearch-goal for research validation or $performance-goal for measurable optimization, and use $ralph only as an explicit fallback when a narrow single-owner persistence loop is requested
3. $autopilot
- Input Artifact: .omx/specs/deep-interview-{slug}.md
- Invocation: $autopilot <spec-path>
- Consumer Behavior: Use the deep-interview spec as the clarified execution brief. Preserve intent, non-goals, decision boundaries, and acceptance criteria as binding context for planning/execution.
- Skipped / Already-Satisfied Stages: Initial requirement discovery and ambiguity reduction
- Expected Output: Planning/execution progress, QA evidence, and validation artifacts produced by autopilot
- Best When: The clarified spec is already strong enough for direct planning + execution without an additional consensus gate
- Next Recommended Step: Continue through autopilot's execution/QA/validation flow; if coordination-heavy execution emerges, prefer $team under a leader-owned $ultragoal ledger, using $ralph only as an explicit fallback when a narrow single-owner persistence loop is requested
4. $ralph (Explicit fallback only)
- Input Artifact: .omx/specs/deep-interview-{slug}.md
- Invocation: $ralph <spec-path>
- Consumer Behavior: Use the spec's acceptance criteria and boundary constraints as the persistence target. Do not reopen requirements discovery unless the user explicitly asks to refine further.
- Skipped / Already-Satisfied Stages: Requirement interview, ambiguity clarification, and initial scope-definition work
- Expected Output: Iterative execution progress and verification evidence tracked against the clarified criteria
- Best When: The user explicitly asks for Ralph's persistent sequential completion pressure; otherwise use $ultragoal for durable goal tracking and completion checkpoints
- Next Recommended Step: If this explicit fallback is selected, continue Ralph's persistence loop; if work expands into coordination-heavy lanes, hand off to $team under $ultragoal checkpointing rather than promoting Ralph as the next default
5. $team
- Input Artifact: .omx/specs/deep-interview-{slug}.md
- Invocation: $team <spec-path>
- Consumer Behavior: Treat the spec as shared execution context for coordinated parallel work. Preserve the clarified intent, non-goals, decision boundaries, and acceptance criteria as common lane constraints.
- Skipped / Already-Satisfied Stages: Requirement clarification and early ambiguity reduction
- Expected Output: Coordinated multi-agent execution against the shared spec, with evidence that can later feed Ultragoal checkpoints by default, or an explicit Ralph verification pass only when requested
- Best When: The task is large, multi-lane, or blocker-sensitive enough to justify coordinated parallel execution instead of a single persistent loop
- Next Recommended Step: Follow the team verification path when the coordinated execution phase finishes; checkpoint completion through $ultragoal by default, escalating to a separate Ralph loop only when the user explicitly asks for that persistent verification/fix owner
6. Refine further
- Input Artifact: Existing transcript, context snapshot, and current spec draft
- Invocation: Continue the interview loop
- Consumer Behavior: Re-enter questioning to resolve the highest-leverage remaining uncertainty
- Skipped / Already-Satisfied Stages: None beyond already-captured context
- Expected Output: A lower-ambiguity spec with tighter boundaries and fewer unresolved assumptions
- Best When: Residual ambiguity is still too high, the user wants stronger clarity, or the above-threshold / early-exit warning indicates too much risk to proceed cleanly
- Next Recommended Step: Return to one of the execution handoff contracts above once the spec is sufficiently clarified
Residual-Risk Rule: If the interview ended via early exit, hard-cap completion, or above-threshold proceed-with-warning, explicitly preserve that residual-risk state in the handoff so the downstream skill knows it inherited a partially clarified brief.
IMPORTANT: Deep-interview is a requirements mode. On handoff, invoke the selected skill using the contract above. Do NOT implement directly inside deep-interview.
Deep-interview reads runtime defaults from the first existing config source in this order:
- Repository-local .omx/config.toml
- Repository-root omx.toml
- User-global ~/.omx/config.toml
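The lookup order can be sketched with `pathlib`. This assumes both repository paths resolve against the repository root and that "first existing" means the first regular file found; the function names are illustrative.

```python
from pathlib import Path
from typing import Optional


def config_candidates(repo_root: Path, home: Path) -> list:
    """Config sources in precedence order (assumed relative to repo_root/home)."""
    return [
        repo_root / ".omx" / "config.toml",  # repository-local
        repo_root / "omx.toml",              # repository-root
        home / ".omx" / "config.toml",       # user-global
    ]


def first_existing_config(repo_root: Path, home: Path) -> Optional[Path]:
    """Return the first candidate that exists as a file, else None."""
    for candidate in config_candidates(repo_root, home):
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = first_existing_config(Path.cwd(), Path.home())
    print(found if found else "no config found; built-in defaults apply")
```

Only one source is selected here, so values from lower-precedence files are not merged in.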
This section is currently a deep-interview-specific runtime override surface, not a general replacement for Codex config.toml or .omx-config.json model/env routing.
Malformed config files are ignored fail-soft so $deep-interview activation can continue with built-in defaults.
Explicit --quick, --standard, or --deep invocation flags override defaultProfile.
[omx.deepInterview]
defaultProfile = "standard"
quickThreshold = 0.30
standardThreshold = 0.20
deepThreshold = 0.15
quickMaxRounds = 5
standardMaxRounds = 12
deepMaxRounds = 20
enableChallengeModes = true
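How an invocation flag and the `[omx.deepInterview]` table might combine into an active threshold and round cap is sketched below. It assumes the values in the sample above are also the built-in defaults, which the text does not confirm, and it takes the already-parsed table as a plain dict.

```python
# Assumption: built-in defaults equal the sample config values shown above.
BUILT_IN_DEFAULTS = {
    "defaultProfile": "standard",
    "quickThreshold": 0.30, "standardThreshold": 0.20, "deepThreshold": 0.15,
    "quickMaxRounds": 5, "standardMaxRounds": 12, "deepMaxRounds": 20,
    "enableChallengeModes": True,
}


def resolve_profile(flag=None, config=None) -> dict:
    """Resolve profile settings; an explicit flag overrides defaultProfile.

    `flag` is "quick", "standard", "deep", or None; `config` is the parsed
    [omx.deepInterview] table, or None when missing or malformed.
    """
    settings = {**BUILT_IN_DEFAULTS, **(config or {})}
    profile = flag or settings["defaultProfile"]
    if profile not in ("quick", "standard", "deep"):
        raise ValueError(f"unknown profile: {profile!r}")
    return {
        "profile": profile,
        "threshold": settings[f"{profile}Threshold"],
        "max_rounds": settings[f"{profile}MaxRounds"],
    }


if __name__ == "__main__":
    print(resolve_profile())                        # Standard when no flag
    print(resolve_profile("deep", {"deepThreshold": 0.10}))
```

Passing `None` for a malformed file reproduces the fail-soft behavior: activation continues on the defaults.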
Resume
If interrupted, rerun $deep-interview. Resume from persisted mode state via omx state read --input '{"mode":"deep-interview"}' --json.
Recommended 3-Stage Pipeline
deep-interview -> ralplan -> autopilot
- Stage 1 (deep-interview): clarity gate
- Stage 2 (ralplan): feasibility + architecture gate
- Stage 3 (autopilot): execution + QA + validation gate