name: kws-codex-plan-executor
description: Use when executing an implementation plan in Codex from a plan path and optional spec/design docs, or when exporting a fresh-session/handoff prompt from the same plan.
metadata:
  version: "2.21.0"
  updated_at: "2026-05-31"
KWS Codex Plan Executor
Overview
Execute implementation plans in Codex or export a paste-ready prompt from the same inputs.
Default behavior is interactive execution in the current Codex session, with
implementation isolated in a dedicated non-conflicting git worktree under
~/.codex/worktrees/. Runtime state, hooks, learning event payloads, and other
orchestration-only artifacts live under ~/.codex/orchestrator/.
Invocation
Supported arguments:
- plan=<abs-or-repo-relative-path> required except resume-only flows.
- spec=<path> optional.
- docs=<path1,path2> optional.
- workspace=<path> optional.
- resume=latest|<state-path>|<run_id> optional; if multiple candidate active runs exist, stop and ask which run/state to resume.
- mode=interactive|headless|prompt|handoff optional, default interactive.
- subagents=auto|on|off optional, default on; subagents=on is the subagent-first default for eligible executable tasks, subagents=off forces a local-only run, and subagents=auto uses conservative spawning only when the user explicitly requested subagents, delegation, or parallel work.
- headless_sandbox=workspace-write|read-only optional, default workspace-write; read-only is for preflight/prompt verification and blocks edit execution.
- context_mode=auto|sliced|full optional, default auto; auto uses task packets when a spec exists.
- context_budget=<positive-int> optional, default 60000 per task packet.
- context_threshold=<float> optional, default 0.70; values must be in [0.05,0.95].
- manifest_fallback=full_spec_on_blocker|halt_on_blocker optional, default full_spec_on_blocker.
- Natural-language hints are accepted only after deterministic parser resolution; print the parsed echo line before preflight.
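The defaults and value constraints above can be sketched as a small key=value parser. This is a minimal illustration, not the skill's actual parser: `parse_invocation`, `DEFAULTS`, `CHOICES`, and `KNOWN` are hypothetical names, and rejecting unknown keys outright is an assumption.

```python
# Hypothetical sketch of the argument rules listed above; not the real parser.
DEFAULTS = {
    "mode": "interactive",
    "subagents": "on",
    "headless_sandbox": "workspace-write",
    "context_mode": "auto",
    "context_budget": "60000",
    "context_threshold": "0.70",
    "manifest_fallback": "full_spec_on_blocker",
}
CHOICES = {
    "mode": {"interactive", "headless", "prompt", "handoff"},
    "subagents": {"auto", "on", "off"},
    "headless_sandbox": {"workspace-write", "read-only"},
    "context_mode": {"auto", "sliced", "full"},
    "manifest_fallback": {"full_spec_on_blocker", "halt_on_blocker"},
}
KNOWN = set(DEFAULTS) | {"plan", "spec", "docs", "workspace", "resume"}


def parse_invocation(tokens):
    """Turn key=value tokens into a validated dict with defaults applied."""
    args = dict(DEFAULTS)
    for token in tokens:
        key, sep, value = token.partition("=")
        if not sep or key not in KNOWN:
            # Assumption: unknown keys are rejected rather than ignored.
            raise ValueError(f"unrecognized argument: {token!r}")
        args[key] = value

    for key, allowed in CHOICES.items():
        if args[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")

    budget = int(args["context_budget"])
    if budget <= 0:
        raise ValueError("context_budget must be a positive integer")
    threshold = float(args["context_threshold"])
    if not 0.05 <= threshold <= 0.95:
        raise ValueError("context_threshold must be in [0.05, 0.95]")
    args["context_budget"] = budget
    args["context_threshold"] = threshold

    if "docs" in args:
        args["docs"] = [p for p in args["docs"].split(",") if p]
    if "plan" not in args and "resume" not in args:
        raise ValueError("plan= is required except in resume-only flows")
    return args
```

For example, `parse_invocation(["plan=docs/plan.md"])` yields interactive mode with subagents on, while `context_threshold=0.99` is rejected before preflight.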
Hard Boundary
Do not use --dangerously-bypass-approvals-and-sandbox unless the user
explicitly requests it and the target is an isolated throwaway repo or CI
sandbox.
Execution modes must not implement from main or the caller's original
checkout. If a dedicated non-conflicting worktree under ~/.codex/worktrees/
cannot be created or selected before task contracts and edits, stop with a
blocker.
Use spawn_agent by default for eligible executable tasks when the resolved
invocation has subagents=on. Use it for subagents=auto only when the user
explicitly requested subagents, delegation, or parallel agent work. Do not spawn
subagents when subagents=auto without an explicit user request, or when
subagents=off.
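The spawn rule above reduces to a small decision function. A minimal sketch, assuming a hypothetical `should_spawn` helper and a boolean `task_eligible` flag that stands in for whatever eligibility check the executor actually performs:

```python
def should_spawn(subagents: str, user_requested_delegation: bool,
                 task_eligible: bool = True) -> bool:
    """Return True when spawn_agent should be used for this task.

    Hypothetical helper: the parameter names are illustrative only.
    """
    if subagents not in {"on", "auto", "off"}:
        raise ValueError(f"unknown subagents value: {subagents!r}")
    if not task_eligible:
        return False
    if subagents == "on":
        # Subagent-first default for eligible executable tasks.
        return True
    if subagents == "auto":
        # Conservative: spawn only on an explicit user request.
        return user_requested_delegation
    return False  # subagents=off forces a local-only run
```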
When dispatching subagents, use task packets, not raw full-plan context. Do not
ask a subagent to infer its write scope from the entire plan.
If an otherwise executable task falls back to local implementation under
subagents=on, record the failed pre-dispatch prerequisite or concrete reason
in the task subagent_strategy. The main agent remains responsible for
post-diff and state review before accepting subagent output.
Core Invariants
- No edits before a 5-line TASK EXECUTION CONTRACT is stated and recorded: scope, files_to_inspect, allowed_edits, forbidden_edits, and acceptance_command_or_honest_substitute.
- Executable tasks may record unit_manifest with context, skill, tool, and write policy; finished runs require every completed task to have a valid manifest, including allowed_write_globs and forbidden_write_globs.
- For every new interactive or headless execution run, create a run id using <plan-slug>-<YYYYMMDD-HHMMSS>. Create code worktrees at ~/.codex/worktrees/<run_id> and orchestration directories at ~/.codex/orchestrator/<run_id>. If the worktree path already exists, append a short random suffix before creating it.
- The worktree contains only normal repository files and git metadata. Store state.json, context.json, hooks/, learning_events/, headless logs, and other executor artifacts under ~/.codex/orchestrator/<run_id>/.
- Before execution, classify dirty worktree changes as related or unrelated. Continue past unrelated dirty files only when they are outside the declared task files; stop before touching related dirty files.
- Execution plans may use Files, Affected files, Modified files, Changed files, 수정 파일, 변경 파일, 대상 파일, or 파일 headings for task file blocks. Execution mode still stops if no file block is present.
- Execution plans may also use fenced yaml waygent-task or yaml agentrunway-task blocks with id, title, dependencies, and file_claims; these blocks are executable task contracts and satisfy the file-scope requirement when their paths stay inside the repo.
- Resume mode uses an explicit state path/run id, or the only active run found under ~/.codex/orchestrator/. Do not infer between multiple ambiguous active runs. resume=latest scans ~/.codex/orchestrator/*/state.json.
- In interactive and headless execution, record execution-only redacted notable-boundary learning events directly to AgentLens under the kws-cpe.learning.<event> namespace per references/learning-log.md. Include run_id, run_dir_ref, and state_path_ref in payload metadata. These refs are redacted/home-relative, not absolute home paths. prompt and handoff are not logging modes.
- Execution runs maintain replay evidence through AgentLens events under kws-cpe.<event> per references/event-journal.md. State remains authoritative; finished state records the AgentLens orchestration run id and, for resume, the last AgentLens event timestamp.
- At run init the orchestrator opens an AgentLens run with agentlens run-open --agent kws-cpe-orchestrator --workspace "$WORKTREE_ABS" --meta plan=... and persists the returned id as agentlens_orchestration_run in ~/.codex/orchestrator/<run_id>/state.json. Every AgentLens call is guarded by [ -n "${ORCH_RUN_ID:-}" ] and suffixed with 2>/dev/null || true; AgentLens failures must never block plan execution.
- Execution runs record ~/.codex/orchestrator/<run_id>/context.json before edits and store context_snapshot_path plus context_basis_hash in state.
- Execution runs maintain context_health in state at semantic boundaries: after context snapshot creation, after each task, after blocker/error events, before handoff/resume, and before final completion. It must include status=green|yellow|red, next_action, and handoff_ready.
- Successful terminal runs set lifecycle_outcome=finished and include a passing completion_audit with prompt_to_artifact_checklist and verification_evidence.
- Before terminal lifecycle_outcome=finished, run drift reconciliation with scripts/reconcile_state.py --check; use --repair-safe only when a safe repair should be persisted. Unresolved blocking drift prevents a finished outcome.
- Blocked or failed terminal runs set a non-success lifecycle_outcome and a concrete handoff_reason.
- New execution state records subagents_requested=true by default because subagents=on is the default. Record subagents_requested=false only when the run is explicitly local-only (subagents=off) or conservative auto mode without an explicit subagent/delegation/parallel request. Finished runs cannot retain running or unreviewed subagent records.
- For v2.20+ finished runs with subagents_requested=true, every completed write-capable task records subagent_strategy. mode=delegated must point to reviewed completed subagent_runs; mode=local_fallback must include a concrete reason and no delegated run ids.
- Command observations classify bounded command evidence before root cause is assigned. Finished runs with category=unknown observations must mention the command in completion_audit.residual_risk.
- Prompt-generating artifacts follow references/cache-strategy.md. The stable prefix role, safety, required-skill, and output-schema content stays before the stable-prefix boundary; run-specific paths, task packets, timestamps, git status, diffs, decisions, and verification evidence stay in the hot tail. Run scripts/audit_prompt_cache.py, and finished runs cannot retain non-empty prompt_audit.dynamic_marker_violations.
- Graphify-aware repositories record graphify_audit evidence using scripts/check_graphify_freshness.py. If graphify update . is required after code or meaningful documentation-structure changes, the completion audit records whether the command ran and whether tracked or ignored outputs changed.
- Subagent pre-dispatch decisions use scripts/preflight_dispatch.py before spawning for eligible write-capable tasks. The decision is one of delegate, local_fallback, or block; local_fallback reasons flow into task subagent_strategy.reason, and dispatch_decisions with block cannot be carried into a finished lifecycle outcome.
- In interactive and headless execution, feature, bugfix, refactor, or behavior-change implementation must invoke using-superpowers as the skill gate and test-driven-development before implementation code. This is not a headless-only rule; headless only needs extra prompt bootstrap because it is a fresh codex exec process. Record RED evidence before implementing, then GREEN evidence after the fix.
- Resolve skill paths from the active skill registry/root mapping before reading local SKILL.md files manually. Do not hard-code .system or any other skill root. If a skill path read fails, re-check the active registry entry and root table before diagnosing the cause; classify it as an operator path-resolution error unless the registry entry itself is proven stale.
- When repository instructions mention graphify, read graphify-out/GRAPH_REPORT.md, compare its Built from commit value with git rev-parse HEAD, run graphify update . after code changes, and record the outcome in completion_audit.verification_evidence. If graphify-out/ is ignored, record that the update ran but generated outputs were not tracked.
- Headless codex exec prompts must bootstrap applicable skills because parent session skill state is not assumed to carry over. Explicitly include using-superpowers and test-driven-development for implementation work.
- Headless final output follows the structured result shape documented in templates/headless-output-schema.json when schema output is available.
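The run id and directory layout from the invariants above can be sketched as follows. This is an illustration under stated assumptions: `slugify` and `allocate_run` are hypothetical helpers, the slug rule (lowercased file stem with non-alphanumerics collapsed to hyphens) is a guess, and the random suffix is applied only to the worktree path because the invariant mentions only that path.

```python
import re
import secrets
from datetime import datetime
from pathlib import Path


def slugify(plan_path: str) -> str:
    """Assumed slug rule: lowercase file stem, runs of other chars become '-'."""
    stem = Path(plan_path).stem.lower()
    slug = re.sub(r"[^a-z0-9]+", "-", stem).strip("-")
    return slug or "plan"


def allocate_run(plan_path: str, now: datetime, home: Path):
    """Compute run id, worktree path, and orchestrator dir without creating them."""
    run_id = f"{slugify(plan_path)}-{now:%Y%m%d-%H%M%S}"
    worktree = home / ".codex" / "worktrees" / run_id
    if worktree.exists():
        # Collision: append a short random suffix before creating the worktree.
        worktree = worktree.with_name(f"{run_id}-{secrets.token_hex(2)}")
    # Executor artifacts (state.json, context.json, hooks/, ...) live here,
    # never inside the worktree itself.
    orchestrator = home / ".codex" / "orchestrator" / run_id
    return run_id, worktree, orchestrator
```

Keeping the function free of side effects makes it easy to check the naming rule before any `git worktree` command or directory creation runs.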
Workflow
- Resolve and verify paths. Prefer explicit paths; infer only when one workspace and one plan are unambiguous.
- Select mode. Read references/mode-contracts.md if behavior is not obvious.
- For prompt or handoff, use templates/fresh-session-prompt.txt and references/prompt-export-checklist.md.
- For interactive, follow references/execution-cycle.md.
- For headless, follow references/headless-runner.md.
- Maintain ~/.codex/orchestrator/<run_id>/state.json using references/state-schema.md; keep repository worktrees free of executor runtime artifacts.
- Build context.json for execution modes before edits, maintain context_health, and record completion proof before reporting a finished lifecycle outcome.
- For execution modes, record notable-boundary learning events using references/learning-log.md.
- Validate using scripts before claiming completion.
Stop Rules
- Missing or unreadable plan: ask one short question or report blocker.
- Dirty worktree with related ambiguity: stop and report.
- Missing or unusable dedicated execution worktree: stop and report.
- Ambiguous resume=latest with multiple state files: stop and ask.
- Missing Files: blocks in execution mode: stop before edits.
- Unclear acceptance criteria on mid/high risk tasks: stop for clarification unless the plan gives an honest substitute.
- Verification failure without root cause after 3 same-root retries: stop with checkpoint.
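The resume stop rule can be sketched as a resolver that refuses to guess between runs. This is a minimal sketch: `resolve_resume` and `ResumeBlocked` are hypothetical names, `is_active` stands in for whatever test the executor uses to decide that a run is still active, and telling a state path from a run id by a path separator or `.json` suffix is an assumption.

```python
from pathlib import Path
from typing import Callable


class ResumeBlocked(Exception):
    """Raised when the executor must stop and ask which run to resume."""


def resolve_resume(resume: str, orchestrator_root: Path,
                   is_active: Callable[[Path], bool]) -> Path:
    """Map a resume= value to a single state.json path, or stop."""
    if resume != "latest":
        if "/" in resume or resume.endswith(".json"):
            return Path(resume).expanduser()  # explicit state path
        return orchestrator_root / resume / "state.json"  # explicit run id

    # resume=latest scans <orchestrator_root>/*/state.json.
    candidates = sorted(
        p for p in orchestrator_root.glob("*/state.json") if is_active(p)
    )
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        raise ResumeBlocked("no active run found to resume")
    names = ", ".join(p.parent.name for p in candidates)
    raise ResumeBlocked(f"multiple active runs, ask the user: {names}")
```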
Prompt Export
For prompt/handoff mode:
- Verify workspace, plan, spec, and docs paths before inserting them.
- Fill every {{...}} token in templates/fresh-session-prompt.txt or remove the optional section.
- Keep conservative Spark evidence packing unless the user requests no Spark, no model optimization, or gpt-5.5 only.
- Include templates/spark-scout-bullets.ko.txt only when the user explicitly asks for broader Spark/model scout routing.
- Run the checklist in references/prompt-export-checklist.md.
Prompt and handoff modes are export-only. Do not create ~/.codex/orchestrator
artifacts, create worktrees, execute tasks, or report completion artifacts in
these modes. Return exactly one fenced text block containing the generated
prompt. Handoff export must include the literal HANDOFF CHECKPOINT; no-Spark
or gpt-5.5 only exports must still include the literal gpt-5.5 high while
omitting Spark routes.
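The rule that no `{{...}}` token may survive export can be checked mechanically. A minimal sketch, assuming tokens are simple names inside double braces; `fill_template` and the token names in the usage line are hypothetical, and removing an optional section is left to the caller:

```python
import re

# Assumed token shape: {{name}} with letters, digits, '_', '.', or '-'.
TOKEN = re.compile(r"\{\{\s*([A-Za-z0-9_.-]+)\s*\}\}")


def fill_template(template: str, values: dict) -> str:
    """Substitute every token; refuse to return a prompt with leftovers."""
    missing = []

    def substitute(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        missing.append(name)
        return match.group(0)

    filled = TOKEN.sub(substitute, template)
    if missing:
        raise ValueError(f"unfilled tokens: {sorted(set(missing))}")
    return filled
```

For instance, `fill_template("Plan: {{plan_path}}", {"plan_path": "docs/plan.md"})` returns the filled line, while an empty `values` dict raises instead of exporting a prompt with a dangling token.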
Validation Matrix
| Mode | Required checks before completion |
|---|---|
| interactive | scripts/parse_plan.py, context.json, context_health, changed-project tests or honest substitute, prompt cache audit, Graphify audit when applicable, dispatch decision evidence for write-capable subagent tasks, passing completion_audit for lifecycle_outcome=finished, scripts/validate_state.py |
| headless | scripts/parse_plan.py, context.json, context_health, acceptance command or honest substitute, prompt cache audit, Graphify audit when applicable, dispatch decision evidence for write-capable subagent tasks, passing completion_audit for lifecycle_outcome=finished, scripts/validate_state.py, headless JSONL/final artifact review |
| prompt | evals/check_prompt.py or the prompt export checklist when no fixture exists |
| handoff | evals/check_prompt.py or the prompt export checklist, plus source state/path readability |
Maintenance
Use references/change-protocol.md before editing this skill. Update
HISTORY.md, ARCHITECTURE.md, package metadata, and eval baselines for
behavior changes.
For eval harness runs, the outer harness runs evals/check_execution.py. The
target executor must not inspect fixture YAML, baseline files, .harness
metadata, or expected values.