name: codd-evolve
description: |
  Conversationally evolve an existing CoDD project. Use when the user describes a functional change in natural language ("add logout button", "change course model to master + delivery target", "remove daily log step") and you need to update requirements, design docs, lexicon, source code, and tests together while maintaining CoDD coherence. Brownfield modification, NOT greenfield generation, NOT pure bug fix.
CoDD Evolve — Conversational Brownfield Evolution
Take a Lord-style natural-language change request and automatically determine which design docs, lexicon entries, source files, and tests must move together to preserve CoDD coherence. The user expresses intent; this skill figures out scope.
When to Use
- The user describes a functional change to an existing system, not a bug
- Examples of trigger phrases:
- "Add a logout button to the admin nav"
- "Change course management — courses should be a shared master with separate delivery targets"
- "Restructure the learner list to drill down: facility → learner → course → progress"
- "Remove the daily log append step from the karo completion flow"
- "受講者管理を施設フィルター起点に変更" ("Change learner management to start from the facility filter")
- The user does NOT want to think about which design docs, lexicon entries, or source files to touch
- The project is already CoDD-initialized (has `codd/codd.yaml` and at least the design doc layout)
Do NOT use this for:
- Greenfield generation from scratch: use `/codd-init` then `/codd-generate`
- Pure bug fix where requirements and design are correct: use `codd fix` or `codd fix [PHENOMENON]`
- Reverse-engineering an undocumented codebase: use `codd extract` (or `/codd-restore`)
- Single-doc impact analysis only: use `/codd-impact`
- Code-only refactoring with no behavioral change: use `codd propagate`
What This Does — Role Separation
This skill makes a single contract explicit:
| Layer | Who decides | What |
|---|---|---|
| Intent | User | "I want X" in natural language |
| Strategic constraints | User | North star, hard prohibitions, breaking-change tolerance |
| Impact scoping | This skill | Which design docs, lexicon, source files, tests are affected |
| Doc updates | This skill + CoDD | Update requirements + every affected design doc in coherent order |
| Lexicon updates | This skill | Detect new terms, ask user before adding, then update lexicon |
| Implementation | CoDD CLI | `codd implement` from updated design |
| Verification | CoDD CLI | `codd verify` (must reach red 0) |
| Coherence finalization | CoDD CLI | `codd propagate` for cross-doc consistency |
| Failure judgment | You (orchestrator) | Decide retry vs ask user vs abort |
| Final approval | User | Review the PR / diff post-hoc |
The user must never have to choose which file to touch.
Workflow
Step 1 — Confirm prerequisites
Before starting, verify:
- Current directory is a CoDD-initialized project (`codd/codd.yaml` exists)
- Working tree is clean OR uncommitted changes are intentional (warn the user otherwise)
- `codd verify` currently passes (red 0); if not, the user should fix existing red first, per the `codd fix [PHENOMENON]` prerequisite
  - If `codd verify` returns exit 0 with silent stdout (a known mode where build/test phases run but emit no text), fall back to `codd dag verify` for an explicit red-count readout. Use both signals when the baseline is uncertain.
  - If runtime verification_test nodes are unsuitable for the local environment, prefer an explicit config or skip over silent omission: set `verify.verification_timeout.total_seconds` / `per_node_seconds`, or run `codd verify --runtime --runtime-skip verification-test` and preserve the reported SKIP evidence.
If red exists, STOP and surface it. Do not attempt to layer new changes on a red baseline.
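The baseline decision above can be sketched as a small pure function. This is only an illustration: `VerifyResult` and `baseline_action` are hypothetical names, and treating a non-zero exit code as "red" is an assumption, since the text only describes the exit-0, silent-stdout case explicitly.

```python
from dataclasses import dataclass


@dataclass
class VerifyResult:
    """Captured outcome of one `codd verify` run (hypothetical container)."""
    exit_code: int
    stdout: str


def baseline_action(result: VerifyResult) -> str:
    """Map a `codd verify` result to the next Step 1 action."""
    if result.exit_code != 0:
        # Assumed to mean a red baseline: stop and surface it.
        return "stop"
    if not result.stdout.strip():
        # Exit 0 but nothing printed: cross-check with `codd dag verify`.
        return "fallback_dag_verify"
    return "proceed"
```

Keeping the decision separate from the process invocation makes the silent-stdout branch easy to exercise without a real project.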
Step 2 — Parse intent and classify
Classify the user's request into one of:
| Type | Marker | Likely affected docs |
|---|---|---|
| `add_feature` | "add X", "新規追加" | requirements + at least one design doc + lexicon (new terms) + new source + new tests |
| `change_behavior` | "change X to Y", "〜に変更" | requirements + affected design docs + lexicon (term-meaning shift) + modified source + updated tests |
| `change_data_model` | "data model", "schema", "table", "entity" | database_design + api_design + lexicon + migrations + source + tests |
| `change_ux` | "UI", "screen", "navigation", "画面" | ux_design + frontend source + frontend tests |
| `remove_feature` | "remove X", "削除" | requirements (mark removed) + design docs (remove sections) + source (remove) + tests (remove or update) + lexicon (deprecate term) |
| `cross_cutting` | Touches auth/permissions/i18n/tenancy | auth_design + every callsite + tests for every role |
If classification is ambiguous, ask one clarifying question (see Step 3).
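A naive first pass over the marker column might look like the sketch below. The marker strings come from the table; the regex matching, the list order, and the function names `classify` and `needs_clarification` are assumptions made for illustration. Real classification would also need the judgment the rest of this skill describes.

```python
import re

# Marker phrases taken from the classification table. Matching them with
# regular expressions, and this particular ordering, are assumptions.
MARKERS = [
    ("cross_cutting", r"\b(auth|permissions?|i18n|tenancy)\b"),
    ("change_data_model", r"\b(data model|schema|table|entity)\b"),
    ("change_ux", r"\b(ui|screen|navigation)\b|画面"),
    ("remove_feature", r"\bremove\b|削除"),
    ("add_feature", r"\badd\b|新規追加"),
    ("change_behavior", r"\bchange\b.*\bto\b|に変更"),
]


def classify(request: str) -> list[str]:
    """Return every change type whose marker appears in the request."""
    text = request.lower()
    return [name for name, pattern in MARKERS if re.search(pattern, text)]


def needs_clarification(candidates: list[str]) -> bool:
    """Zero or several candidates means one clarifying question is due."""
    return len(candidates) != 1
```

Returning all candidates rather than the first hit keeps ambiguity visible, so the caller can decide whether to ask the single clarifying question.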
Drift detection against existing design
After classification, scan the affected design docs for pre-existing references to the proposed change. Examples:
- The intent says "add logout button to admin nav," but `ux_design.md` already documents a logout entry in the tenant_admin / learner sidebars (impl never caught up).
- The intent says "add delivery_target table," but `database_design.md` has an Open Question (OQ-DB-NN) that proposes the same structure under a different name.
When such drift is found, classify it:
- Drift A — broader-than-intent: design covers more roles/cases than the intent. Either (a) flag to the user and ask whether to keep the narrower intent or align to the design, or (b) treat as a Step 3 gate-5 "ambiguous scope" trigger.
- Drift B — design proposed, impl absent: an Open Question or TODO matches the new intent. Reuse the existing terminology and mark the OQ as resolved.
- Drift C — contradiction: design says X, intent says not-X. Halt as a Step 3 gate-3 "structural impossibility."
Recording drift in the report prevents silent vocabulary divergence on subsequent evolutions.
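The three drift classes can be expressed as a small decision function. This is a sketch under assumptions: the inputs (role sets, a matching Open Question ID, a contradiction flag) and the precedence among the checks are hypothetical simplifications of what a real scan of the design docs would produce.

```python
from enum import Enum
from typing import Optional


class Drift(Enum):
    BROADER_THAN_INTENT = "A"
    DESIGN_PROPOSED_IMPL_ABSENT = "B"
    CONTRADICTION = "C"


def classify_drift(
    intent_roles: set[str],
    design_roles: set[str],
    matching_open_question: Optional[str] = None,
    design_contradicts_intent: bool = False,
) -> Optional[Drift]:
    """Pick a drift class; checking contradiction first is an assumption."""
    if design_contradicts_intent:
        return Drift.CONTRADICTION  # halts at gate 3
    if matching_open_question is not None:
        return Drift.DESIGN_PROPOSED_IMPL_ABSENT  # reuse terms, resolve the OQ
    if design_roles > intent_roles:  # design covers strictly more roles
        return Drift.BROADER_THAN_INTENT
    return None
```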
Step 3 — Stop-and-ask gates
Stop and ask the user only when one of these triggers fires:
1. New lexicon term required. The change introduces vocabulary not in `project_lexicon.yaml`. Ask: "I'll add `<term>` to the lexicon meaning `<definition>`. OK?"
2. Breaking change to existing behavior. Existing users / callers will see different output. Ask: "This changes how `<X>` behaves for existing users. Is breaking change acceptable?"
3. Coherence is structurally impossible. Requirements would contradict an existing invariant. Surface the contradiction; do not proceed silently.
4. Cross-cutting scope explosion. The change touches more design docs than the user likely realized (rule of thumb: >4 docs). Confirm scope before charging ahead.
5. Ambiguous role/scope. "Add logout": for which role? Or all roles? Ask once.
6. 1:N/N:N data model change -> UI page topology. When the Step 2 classification is `change_data_model` and the change introduces a 1:N or N:N relation that does not yet have an `operation_flow.ui_pattern` declared for the parent/child pair, ask which UI topology should be used: (a) single screen with everything inline, (b) master-detail on the parent's detail page, (c) drilldown to a dedicated child page, or (d) defer to LLM auto-decision, which is discouraged and may trigger a `ui_coherence` warning. Record the answer in the `requirements/*.md` operation_flow as a new Operation entry.
Pre-approved branch
If the orchestrator (multi-agent system, task YAML, prior conversation, etc.) already records explicit approval for one or more of these gates, treat them as immediately confirmed without re-asking. Examples:
- A task YAML that states "lexicon `delivery_target` is pre-approved" → skip gate 1 prompt for that term.
- A task YAML that states "breaking change accepted by stakeholder" → skip gate 2 prompt.
- A handoff that names the role explicitly ("update only the central_admin nav") → skip gate 5 prompt; if drift detection (Step 2) finds the design covers more roles, surface the drift in the report instead of blocking.
- A task YAML that states "ui_pattern for `<child>` is master_detail" → skip gate 6 prompt and record the pre-approved topology source in the report.
Always record in the report which gates were short-circuited and the source of the prior approval. Pre-approval never applies to gate 3 (structural impossibility) — that always halts.
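The interaction between fired gates and pre-approvals can be sketched as follows. The gate numbers follow the order in which the gates are listed in this step; `resolve_gates` and its return shape are hypothetical, intended only to show that gate 3 is never short-circuited and that every short-circuit keeps its approval source for the report.

```python
HALT_GATE = 3  # structural impossibility: pre-approval never applies


def resolve_gates(
    fired: set[int], pre_approved: dict[int, str]
) -> tuple[bool, list[int], dict[int, str]]:
    """Split fired gates into: halt?, gates to ask about, short-circuits.

    `pre_approved` maps a gate number to the source of the prior approval
    (for example "task YAML"), so the source can be recorded in the report.
    """
    halt = HALT_GATE in fired
    to_ask: list[int] = []
    short_circuited: dict[int, str] = {}
    for gate in sorted(fired - {HALT_GATE}):
        if gate in pre_approved:
            short_circuited[gate] = pre_approved[gate]
        else:
            to_ask.append(gate)
    return halt, to_ask, short_circuited
```

For example, `resolve_gates({1, 2, 5}, {1: "task YAML"})` leaves gates 2 and 5 to ask about and records gate 1 as short-circuited by the task YAML.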
Do not ask the user:
- Which file to touch
- Which doc to update
- What order to do things in
- What to name a commit
- Which version to bump
Step 4 — Execute the coherence chain
Once intent is confirmed, execute in this order (each step's output feeds the next):
1. Update requirements/*.md
- Append new requirement / modify existing / mark deprecated
- Preserve frontmatter, traceability IDs, and Bloom levels
2. Update affected design/*.md docs (in dependency order)
- Determine order via codd's existing CEG (depends_on graph)
- Update each doc body to reflect the new requirement
- Preserve frontmatter exactly
3. Update project_lexicon.yaml if needed
- Only after user confirmed in Step 3
- Keep alphabetical / grouped order if existing convention
4. Run codd implement to (re)generate source from updated design
- For incremental change, codd implement updates only affected modules
- For pure data model changes, also generate migration files if applicable
- **Generated-code impact check**: if the project keeps codd-generated output under `src/generated/**` (or equivalent), classify whether the change requires regenerating those modules or whether it stays in the hand-edited area (handlers, UI, tests). Record the decision in the report so future evolutions know whether `src/generated/**` was intentionally untouched.
- **Prerequisite (cmd_345 K-3)**: if any design doc declares `operation_flow`, ensure `codd.yaml` has `ai_commands.impl_step_derive` set. Without it, `operation_flow_hint()` is silently skipped and declared UI patterns will not influence generation. Verify with `grep impl_step_derive codd/codd.yaml`; CoDD also emits a `WARNING` on stderr when this gap is detected.
5. Update tests
- Tests for new requirements MUST be added (no silent skip)
- Tests for removed requirements MUST be removed
- Tests for changed behavior MUST be updated
6. Run codd verify
- MUST reach red 0
- If red persists, see Step 5 (failure handling)
7. Run codd propagate
- Final cross-doc consistency pass
- Catches any drift between source-as-implemented and design-as-written
8. Runtime smoke verification (MANDATORY — not optional)
- Run `codd verify --runtime` from the project root and paste or link the generated runtime smoke report.
- `codd verify --runtime` automatically checks:
a. Local DB up via `codd.yaml runtime_smoke.db_check.command`
b. Dev server up via `runtime_smoke.dev_server.url`
c. Smoke connectivity via `runtime_smoke.smoke_connectivity[]`
d. Real-browser E2E via `runtime_smoke.e2e.command`
e. Opt-in CRUD flow reflection via `runtime.crud_flow_targets[]`
- For every visible or `operation_flow` command/control/action that mutates
state or emits a business result, add or reuse a CRUD flow target, an
`action-outcome` target, or an equivalent E2E proving: trigger → server
acceptance/mutation → re-fetch or observable outcome → visible reflection,
persistence, emitted event, expected output, or absence as appropriate. A
green GET smoke alone is not enough for mutating actions.
- All results are written with raw logs to `reports/runtime_smoke_{{timestamp}}.md` unless the project config overrides the path.
- Self-reported runtime smoke is not acceptable evidence. If `--runtime-skip <category>` is used, including `--runtime-skip verification-test` or `--runtime-skip crud-flow`, the report must show the skipped category explicitly and it must never be described as passed.
- If `codd verify --runtime` fails: the change is NOT done. Either fix forward or revert. Reporting done with the server down is a critical violation of CoDD coherence.
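Never reorder these steps. Doc updates always precede source updates; that is the CoDD coherence invariant. "Step 8" here and in the rest of this document means item 8 of this chain (runtime smoke verification), not a top-level workflow step. It is the actual completion gate: items 1-7 produce coherent artifacts, and item 8 proves the user can actually use them.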
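The fixed ordering can be made mechanical with a runner that walks the chain and stops at the first failure. This is a minimal sketch: the step labels summarize the eight items above, and `run_chain` with its injectable `execute` callback is a hypothetical structure, not part of the CoDD CLI.

```python
from typing import Callable, Optional

# Labels for the eight chain items, in their mandatory order.
CHAIN = [
    "update requirements",
    "update design docs",
    "update lexicon",
    "codd implement",
    "update tests",
    "codd verify",
    "codd propagate",
    "codd verify --runtime",
]


def run_chain(execute: Callable[[str], bool]) -> tuple[list[str], Optional[str]]:
    """Run every step in order; stop at the first step that reports failure.

    Returns the completed steps and the failing step (None if all passed).
    """
    completed: list[str] = []
    for step in CHAIN:
        if not execute(step):
            return completed, step
        completed.append(step)
    return completed, None
```

Because the order lives in one list, no caller can run source generation before the doc updates, and later steps are never attempted after a failure.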
Step 5 — Failure handling
If `codd verify` red persists after Step 4:
- First retry: run `codd fix` once to let CoDD self-repair common issues
- Second attempt: surface the failing test output, classify the cause
  - Test outdated → update test
  - Design contradicts requirement → ask user
  - Implementation cannot match design → ask user whether design is wrong or impl approach is wrong
- Do not loop more than 3 times. After 3 failed attempts, STOP and report to user with concrete diagnostics
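The bounded retry policy above could be sketched like this. `retry_verify`, `verify`, and `repair` are hypothetical callables; in practice `repair(1)` would run `codd fix` and later attempts would act on the classified cause.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # hard cap stated in the failure-handling rules


def retry_verify(
    verify: Callable[[], bool],
    repair: Callable[[int], None],
    max_attempts: int = MAX_ATTEMPTS,
) -> tuple[bool, int]:
    """Called after an initial red verify: repair, re-verify, give up at the cap.

    Returns (green?, attempts used). A False result means: stop and report
    concrete diagnostics to the user instead of looping further.
    """
    for attempt in range(1, max_attempts + 1):
        repair(attempt)  # attempt 1: `codd fix`; later: act on the classified cause
        if verify():
            return True, attempt
    return False, max_attempts
```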
Local database unavailable
`change_data_model` work often needs a migration command (e.g. `prisma migrate dev`) that requires a running local database. If the DB cannot be reached:
- Do not apply the migration to any non-local target (e.g. staging, production VPS).
- Author the migration by hand under the project's migrations directory (Prisma example: `prisma/migrations/<timestamp>_<slug>/migration.sql`). Mirror the conventions of existing files in that directory (column order, index naming, foreign-key style).
- Validate the schema declaration alone with `prisma validate` (or the framework's equivalent) to confirm the model file parses and matches expectations.
- Note in the report that the migration is generated but unapplied, and call out what the user must run locally once the DB is back (`prisma migrate deploy` for hand-authored migrations).
- Treat this as an acceptable verification path only when `codd dag verify` and `codd propagate` also pass; it is not a substitute for full `codd verify` when build/test phases are reachable.
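When authoring a migration by hand, the target path can be derived rather than typed. The sketch below assumes the 14-digit `YYYYMMDDHHMMSS` prefix that Prisma-generated migration directories use; `migration_path` is a hypothetical helper, and the naming should still be checked against the existing directories in the project.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath


def migration_path(
    slug: str, now: datetime, root: str = "prisma/migrations"
) -> PurePosixPath:
    """Build `<root>/<timestamp>_<slug>/migration.sql` for a hand-authored migration."""
    # Normalize the slug to lowercase words joined by underscores.
    safe_slug = re.sub(r"[^a-z0-9]+", "_", slug.lower()).strip("_")
    stamp = now.strftime("%Y%m%d%H%M%S")
    return PurePosixPath(root) / f"{stamp}_{safe_slug}" / "migration.sql"
```

Passing the timestamp in explicitly, instead of reading the clock inside the function, keeps the result reproducible.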
Step 6 — Report
Generate a concise summary for the user:
Updated:
- requirements/foo.md (added: logout feature)
- design/auth_design.md (added: NextAuth signOut handler)
- design/ux_design.md (added: central admin nav Logout button)
- src/components/AdminNav.tsx (added)
- src/app/api/auth/signout/route.ts (new)
- tests/e2e/logout.spec.ts (new)
Lexicon: no changes
Verify: red 0 ✅
Propagate: 0 drift ✅
Runtime smoke (Step 8):
- `codd verify --runtime`: ✅
- report: reports/runtime_smoke_20260517_210000.md
Done: ✅ (user can open the app and use the new feature)
If any Step 8 check is ❌, the change is NOT done. Either fix forward or revert; never report done with the runtime broken.
Suggest a commit message and offer to commit. Do not auto-commit unless the user confirms.
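The summary format above can be rendered from structured results, so that "Done" is computed rather than asserted. This is a sketch: `EvolveReport` and `render` are hypothetical names, and requiring zero propagate drift for "Done" is an assumption on top of the stated red 0 and runtime smoke conditions.

```python
from dataclasses import dataclass


@dataclass
class EvolveReport:
    updated: list[str]
    lexicon: str = "no changes"
    red_count: int = 0
    drift_count: int = 0
    runtime_ok: bool = False
    runtime_report: str = ""

    @property
    def done(self) -> bool:
        # Done only when verify is green, drift is zero, and runtime smoke passed.
        return self.red_count == 0 and self.drift_count == 0 and self.runtime_ok


def _mark(ok: bool) -> str:
    return "✅" if ok else "❌"


def render(report: EvolveReport) -> str:
    """Render the Step 6 summary as plain text."""
    lines = ["Updated:"]
    lines += [f"- {item}" for item in report.updated]
    lines += [
        f"Lexicon: {report.lexicon}",
        f"Verify: red {report.red_count} {_mark(report.red_count == 0)}",
        f"Propagate: {report.drift_count} drift {_mark(report.drift_count == 0)}",
        f"Runtime smoke (Step 8): {_mark(report.runtime_ok)} {report.runtime_report}".rstrip(),
        f"Done: {_mark(report.done)}",
    ]
    return "\n".join(lines)
```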
CoDD Commands This Skill Orchestrates
| Command | When invoked | Why |
|---|---|---|
| `codd verify` (entry guard) | Step 1 | Confirm clean baseline |
| `codd impact` | Step 2 | Determine which design docs are downstream of the proposed change |
| `codd implement` | Step 4 (step 4) | Generate source from updated design |
| `codd verify` | Step 4 (step 6) | Confirm coherence after change |
| `codd fix` | Step 5 (retry) | Self-heal common verify failures |
| `codd propagate` | Step 4 (step 7) | Catch final source-design drift |
Examples
Example 1 — Feature addition (clear scope)
User: "Add a logout button to the admin nav."
Skill:
- Classify → `add_feature`, scope = auth + ux (single role: admin)
- No new lexicon, no breaking change → no Step 3 questions
- Execute chain → requirements + auth_design + ux_design + impl + tests + verify + propagate
- Report → "Added logout for central_admin role. 6 files touched, verify ✅"
Example 2 — Data model change (asks once)
User: "Course management should be a shared master with separate delivery targets."
Skill:
- Classify → `change_data_model`, scope = database + api + lexicon + migrations + ux
- Step 3 trigger: new term `delivery_target` (配信先) not in lexicon → ASK
  - "I'll add `delivery_target` to the lexicon as 'a tenant/facility to which a course is distributed; many-to-one with course'. OK? Also, this changes the existing 1-course-1-tenant structure, so a migration is required. Is a breaking change for existing course records acceptable?"
- After user confirms → execute chain including migration generation
- Report
Example 3 — Reject as bug-fix territory
User: "The login page sometimes times out on slow networks."
Skill:
- This is a bug, not a functional change. Decline politely:
  - "This sounds like a bug. Use `codd fix "login times out on slow networks"` instead; `codd-evolve` is for intentional design changes."
Example 4 — Reject as greenfield
User: "Start a new SaaS project for restaurant reservations."
Skill:
- This is greenfield. Decline:
  - "For new projects, use `codd init` then `codd plan` followed by `codd generate`. `codd-evolve` is for evolving existing CoDD projects."
Absolute Constraints
These are non-negotiable. Violating any of them defeats the purpose of CoDD:
- Never edit source without a corresponding design doc update. If the change requires source modification, requirements and design must already reflect it.
- Never silently introduce a new lexicon term. Always ask the user first.
- Never proceed past a red `codd verify`. Either retry (max 3) or stop and ask.
- Never reorder the chain. Requirements → design → lexicon → source → tests → verify → propagate → runtime smoke. No shortcuts.
- Never bypass user approval for breaking changes. "Breaking" means: existing API contract changes, existing data semantics change, existing user-visible behavior changes.
- Never skip tests for new requirements. A new functional requirement without a corresponding new test is incoherent.
- Never commit without user approval. Stage and propose, but do not commit autonomously.
- Never declare done without runtime smoke verification (Step 8). `codd verify` green is necessary but not sufficient. Run `codd verify --runtime` and keep the generated raw-log report. Reporting done while the DB or dev server is down, or while a regression such as a migration conflict blocks startup, is a critical violation. Either bring the runtime up and prove it, or do not declare done.
Guardrails
- Use the `codd` command, not `python -m codd.cli`
- Run from the project root (where `codd/codd.yaml` lives)
- Each invocation should handle one logical change. If the user bundles multiple unrelated changes ("add logout and also restructure the course list"), split into separate runs
- Preserve all frontmatter exactly — only modify doc bodies and append/remove sections as needed
- When updating docs, do not gratuitously reformat unchanged sections — minimal diff is a feature
- If the user is on a project where `codd verify` has never passed, do not start by attempting to evolve; recommend `codd extract` + `codd fix` to establish a green baseline first
Troubleshooting
- "I don't know which design doc is affected"
  - Read every doc under `docs/design/` and `docs/requirements/` and classify by frontmatter `modules` / topic
  - Use `codd impact` to compute downstream effects from any candidate doc
  - If still uncertain, ask the user one targeted question (not a list of 5)
- "Lexicon term is borderline new vs existing"
- Treat as new. Always ask. The cost of asking is low; the cost of silent vocabulary drift is high
- "Verify keeps failing after retries"
- Stop. Report which test, which file, which line. Let the user decide whether the design or the impl is wrong
- "User keeps adding requirements mid-execution"
- Politely defer: "I'll finish this change first (estimated N minutes), then handle the next one"
Output Format
When reporting back to the user, always include:
- Intent classification — what kind of change you understood
- Files touched — grouped by docs / source / tests
- Lexicon delta — new / changed / deprecated terms (or "no changes")
- Verify status — red 0 ✅ or red >0 with concrete failures
- Suggested commit message — single line, conventional commits format
- Next action — what you recommend the user do (review, commit, request more changes)
- Scope decisions: sub-scopes you intentionally excluded and the reason (e.g. "did not touch `Module.tenant_id` because it would require redesigning RLS isolation policies, out of scope for this evolution"). Required for `change_data_model` and `cross_cutting` types; optional but recommended for others. Pre-approval short-circuits from Step 3 should be listed here as well.
Why This Skill Exists
CoDD's value is coherence: requirements, design, lexicon, source, and tests move together so no document lies about the system. The CLI form (`codd plan`, `codd implement`, `codd verify`) makes this explicit and reproducible, which is ideal for greenfield projects and CI automation.
But the CLI form has a cost in Brownfield modification: the user must remember which command to run, in which order, with what arguments. Each `codd fix "PHENOMENON"` invocation is a context switch.
`codd-evolve` removes that cost by accepting natural language ("add logout button") and orchestrating the CLI chain underneath. The user expresses intent; coherence is preserved automatically. The CLI remains the engine; this skill is the conversational front.
This is not a replacement for the CLI. Both are first-class:
- CLI for Greenfield, CI, automation, education, and third-party orchestrators
- Skill for Brownfield, conversational modification, daily evolution within Claude Code
Use the right tool for the right phase.