name: cli-spec-to-goal
description: Convert a Node.js CLI tool specification (even a vague one) into a Codex /goal-ready bundle — GOAL.md, VERIFY.md, PROGRESS.md inside goals// — with AI-agent-friendly patterns (--json, stdout/stderr separation, exit codes, structured errors, TTY detection, --dry-run) baked into every goal. Asks clarifying questions in focused batches (each with options + recommendation + reasoning), reaches 95% confidence, optionally scaffolds a missing CLI project (bin/, src/, package.json, AGENTS.md, CLAUDE.md behavioral snippet), and emits a tailored /goal command. Handles both new CLIs and adding commands to existing ones. Supports JavaScript (default) and TypeScript, commander.js, Vitest, conditional SQLite and cosmiconfig. For Node.js CLI tools, not web apps, servers, or non-Node projects.
cli-spec-to-goal — Vague CLI Idea → Codex /goal Bundle
Turn a rough Node.js CLI tool idea into the three files Codex /goal needs to drive autonomous implementation:
goals/<slug>/
GOAL.md what done looks like (full CLI tool template with AI-agent patterns)
VERIFY.md how /goal proves it's done (host-native verification)
PROGRESS.md the audit trail /goal will populate
The skill optionally scaffolds a missing CLI project (bin/, src/, package.json, AGENTS.md, CLAUDE.md) and prints a tailored /goal command at the end.
Why this skill exists
A full multi-prompt workflow kit is overkill for CLI tools that fit in one or two goal slices. This skill collapses the "spec → goal trio" hop into a single guided conversation that:
- batches clarifying questions (so the user makes 2-3 decisions per round, not 20)
- recommends a default for every choice (so they can move fast when the recommendation looks right)
- bakes in six AI-agent patterns that make every CLI tool machine-friendly from the start
- writes the same template shapes already validated by the Codex /goal operator pattern
It does not replace the multi-prompt kit for complex CLI suites with 10+ commands. Use the kit for big builds; use this skill for the small-to-medium ones.
The six AI-agent patterns
Every GOAL.md, VERIFY.md, and scaffold generated by this skill includes these patterns. They make CLI tools safe and useful for AI agents to invoke — not just humans.
- `--json` flag on every command: stdout receives only valid JSON; no spinners, colors, or progress bars. Gold standard: the `gh` CLI.
- stdout / stderr separation: data to stdout, diagnostics to stderr, always.
- Non-interactive TTY detection: never block on stdin when `!process.stdin.isTTY`.
- Meaningful exit codes: 0 = success, 1 = user/input error, 2 = usage error.
- Structured error format: when `--json` is active, `{ "error": "CODE", "message": "...", "suggestion": "..." }`.
- `--dry-run` on mutating commands: any create/update/delete command previews without executing.
These patterns reinforce each other: --json + structured errors = machine-parseable success and failure; TTY detection + --dry-run = safe automated execution.
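As a rough illustration of how the exit-code, structured-error, and TTY patterns can fit together, here is a minimal sketch. The names `AppError`, `reportError`, and `canPrompt` are hypothetical, and writing the structured error to stderr (so stdout stays reserved for success data) is an assumption rather than something this skill prescribes. The stream is passed in so the helper can be exercised without touching the real process streams.

```javascript
// Hypothetical error type carrying the fields of the structured error format.
class AppError extends Error {
  constructor(code, message, suggestion, exitCode = 1) {
    super(message);
    this.code = code;
    this.suggestion = suggestion;
    this.exitCode = exitCode; // 1 = user/input error, 2 = usage error
  }
}

// Writes the error to the given stream and returns the exit code to use.
// Assumption: errors always go to stderr; with --json they are one JSON line.
function reportError(err, { json, stderr }) {
  if (json) {
    const payload = {
      error: err.code ?? "UNKNOWN",
      message: err.message,
      suggestion: err.suggestion ?? "",
    };
    stderr.write(JSON.stringify(payload) + "\n");
  } else {
    stderr.write(`Error: ${err.message}\n`);
  }
  return err.exitCode ?? 1;
}

// Prompting is only safe when a terminal is attached to stdin.
function canPrompt(stdin) {
  return Boolean(stdin.isTTY);
}
```

A command handler might then set `process.exitCode = reportError(err, { json: opts.json, stderr: process.stderr })` and consult `canPrompt(process.stdin)` before asking for any interactive input.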
Core flow
- Probe the project: inspect the cwd to learn what's already in place.
- Decide single goal vs. multi-goal: judge complexity from the spec; if multi-shaped, offer to split via `goals-plan.md`.
- Ask in focused batches: 2-4 questions per round, each with options + recommendation + reasoning, until 95% confident.
- Optionally scaffold: if the project is missing pieces and the user agrees, create only what they confirm.
- Write the goal trio: GOAL.md + VERIFY.md + PROGRESS.md at `goals/<slug>/`.
- Hand off: print the file paths + a tailored `/goal` command + a short "how to run" note.
Never write any files until confidence on what's about to be produced is at 95%+.
Step 1 — Probe the project
Before asking the user anything, inspect the cwd. The point is to ground recommendations in reality so questions don't waste the user's time on things already settled by the project state.
Look for:
| Signal | What it tells us |
|---|---|
| `package.json` → `bin` field | Existing CLI entry point; binary name already chosen |
| `package.json` → `type` field | ESM vs CJS; skip module system question |
| `package.json` → `engines` | Node version constraint |
| `package.json` → `scripts` | Existing test/lint/build commands |
| `package.json` → `dependencies` | Arg parsing lib already chosen (commander/yargs/meow/citty) |
| `package.json` → `devDependencies` | Test framework (vitest/jest), TS, linter (eslint/biome) |
| `tsconfig.json` | TypeScript project; skip language question |
| `bin/` or `src/cli.*` | Existing entry points |
| `.env` / `.env.example` | External service config pattern in use |
| `goals/` folder | Previous goal runs; pick non-conflicting slug |
| `AGENTS.md` | Existing conventions to respect |
| `CLAUDE.md` | Existing behavioral guidance to reference |
| `vitest.config.*` / `jest.config.*` | Test infra in place; skip framework question |
| `.gitignore` | Check coverage for node_modules, .env, dist/ |
| `eslint.config.*` / `biome.json` | Lint setup exists |
| `*.db` / `*.sqlite` / `drizzle/` / `migrations/` | Data storage in use; triggers data storage questions |
| `.toolrc` / `*.config.js` / cosmiconfig in deps | Config file pattern already wired |
| `src/auth.*` / `src/oauth.*` | OAuth flow already implemented |
| `open` or `opener` in deps | Browser-opening for OAuth consent flow |
| `.env` containing `CLIENT_ID`, `CLIENT_SECRET`, `REDIRECT_URI`, or `REFRESH_TOKEN` | OAuth credentials pattern in use |
Use this to inform Step 3 recommendations. Skip questions that the probe already answers.
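One probe outcome is a slug that does not clash with earlier goal runs. The sketch below shows one way to derive such a candidate; `slugify` and `pickSlug` are hypothetical helper names, and the numeric-suffix scheme is an assumption. The result is a suggestion to confirm with the user, not a decision to make silently.

```javascript
// Turns a free-form goal title into a kebab-case slug.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}

// Returns the slug unchanged if it is free, otherwise appends -2, -3, ...
// `existing` is the list of directory names already present under goals/.
function pickSlug(title, existing) {
  const taken = new Set(existing);
  const base = slugify(title);
  if (!taken.has(base)) return base;
  let n = 2;
  while (taken.has(`${base}-${n}`)) n += 1;
  return `${base}-${n}`;
}
```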
New vs. existing CLI detection:
- `bin` field in package.json or `src/cli.*` exists → existing CLI, focus on adding commands
- Neither present → new CLI from scratch
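The detection rule above can be sketched as a small pure function. `detectCliState` is a hypothetical name, and the inputs (a parsed package.json plus a list of repo-relative paths gathered by the probe) are assumptions about how the probe results are represented.

```javascript
// `pkg` is the parsed package.json, or null when the file is missing.
// `files` is a list of repo-relative paths found during the probe.
function detectCliState(pkg, files) {
  const hasBin = Boolean(pkg && pkg.bin);
  // Matches src/cli.js, src/cli.ts, src/cli.mjs, and similar.
  const hasCliEntry = files.some((f) => /^src\/cli\.[^/]+$/.test(f));
  return hasBin || hasCliEntry ? "existing" : "new";
}
```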
Step 2 — Judge complexity
Read the user's spec and decide whether it fits one goal or wants to be split.
Multi-goal signals:
- More than ~3 distinct commands/subcommands being built
- More than ~5 AC spread across unrelated concerns (parsing + API + file I/O + formatting)
- Clear "core library first, then CLI wrapper" or "MVP then enhancements" reading
- Walking-skeleton step needed (e.g., basic `--help` before vertical features)
- Multiple unrelated subsystems (HTTP + filesystem + database + streams)
- Multiple output format requirements (JSON, table, plain text)
If multi-goal, surface it before other clarifying questions:
This spec looks like it could be 2-3 separate goals. I can either generate one combined goal or write a `goals-plan.md` and scaffold the first slice. Which do you prefer?
If the user picks split, write goals-plan.md at the project root with a numbered list of proposed goals (each with a 1-2 line description), then ask which slice to generate first. Re-run the skill later for the next slice.
Step 3 — Ask clarifying questions
Use the AskUserQuestion tool when available. Fall back to natural-language Q&A with the same shape: 2-4 questions per round, each with 2-4 options, recommended option first and labeled (Recommended), plus a one-line reason for the recommendation.
Aim for ≤3 rounds total. Skip any question already answered by the spec or the probe.
Round 1 — Identity + Project Context
Only ask questions not settled by the probe.
- Package name: Suggest kebab-case of the tool name. Only for new CLIs.
- Binary name: Usually same as package name. Only for new CLIs.
- New vs. existing: Only if the probe is ambiguous (e.g., package.json exists but no bin field).
- Language: JavaScript (recommended) or TypeScript. Skip if tsconfig detected. JS is the default because it's simpler and good enough for most CLIs; recommend TS if the user mentions types or the project is already TypeScript.
- Module system: ESM (recommended) or CJS. Skip if `type` field found in package.json. ESM is recommended because it is the standard JavaScript module format and every maintained Node.js release supports it natively.
- Arg parsing library: commander (recommended for ≤10 commands) or yargs (for complex subcommand trees) or citty or none. Skip if an arg parser is already in dependencies.
If existing CLI: Replace identity questions with:
- "Which command are we adding?" (extract from spec)
- "Any existing patterns to follow?" (probe-informed — show detected conventions)
Round 2 — Scope + Behavior
- Acceptance criteria: Draft 3-6 AC bullets from the spec yourself, then ask the user to confirm, edit, or add. Don't make them write ACs from scratch — give them a starting list.
- Out of scope: Suggest boundaries (GUI/web interface, daemon/watch mode, Windows-specific behavior, breaking public API changes, platform packaging/installers).
- User stories: For non-trivial specs (>2 AC), propose 1-3 user stories. Skip for single-AC features.
- Subcommand structure: If the spec mentions multiple actions, propose a subcommand tree.
- I/O contract: What does the tool read? What does it produce? What formats? Infer from spec; confirm with user. This feeds the CLI Contract table in GOAL.md.
Round 3 — Configuration + Verification
- Configuration surface: Env vars + dotenv (recommended for secrets), cosmiconfig (for persistent non-secret settings when >3 of them), or CLI flags only. See the decision matrix below.
- External dependencies / auth: If the spec mentions an API or service, ask about credentials and auth.
- Verification approach: Integration tests (recommended for CLI-visible behavior), unit tests (for internal logic), or both (recommended when substantial library code exists alongside CLI).
- Test runner: Vitest (recommended — native ESM/TS, fast) or Jest. Skip if test framework already in project.
Configuration decision matrix:
| Scenario | Recommendation |
|---|---|
| API credentials only | Env vars + dotenv only |
| 1-3 non-secret settings | CLI flags only |
| >3 non-secret persistent settings | Env vars + dotenv + cosmiconfig |
| Per-project tool | cosmiconfig |
| User-global tool | Env vars or XDG config |
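For illustration, the first three rows of the matrix can be read as a function of two inputs. This is a sketch under assumptions: `recommendConfig` is a hypothetical name, the per-project and user-global rows are left out, and combining "env vars + dotenv" with "CLI flags" when a tool has both secrets and one to three settings is an extrapolation the matrix does not state.

```javascript
// Returns the list of configuration mechanisms to recommend.
function recommendConfig({ hasSecrets, nonSecretSettings }) {
  // More than three persistent non-secret settings: matrix row 3.
  if (nonSecretSettings > 3) return ["env vars + dotenv", "cosmiconfig"];
  const parts = [];
  if (hasSecrets) parts.push("env vars + dotenv"); // matrix row 1
  if (nonSecretSettings >= 1) parts.push("CLI flags"); // matrix row 2
  return parts;
}
```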
Conditional: Data Storage Round
Only surface when the probe detects storage signals (*.db, *.sqlite, better-sqlite3 in deps, drizzle/, migrations/) OR the spec mentions persistence keywords (store, save, history, cache, log, database, persist, record, track, remember, catalog, index).
- Does this tool need to persist data between runs? Yes (describe what) / No.
- Storage approach: SQLite via better-sqlite3 (recommended for structured data) / JSON file (for simple key-value) / No persistence.
SQLite via better-sqlite3 is recommended because: zero setup (no server, connection string, or Docker), single file, synchronous API, AI-agent-friendly (no daemon to manage).
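The keyword trigger for this round could be approximated as below. `persistenceKeywordsIn` is a hypothetical helper, and the suffix list is an assumption chosen so that "saves" or "tracking" count while "login" does not match "log". It is a heuristic only; irregular forms such as "caching" or "logged" are missed.

```javascript
const PERSISTENCE_KEYWORDS = [
  "store", "save", "history", "cache", "log", "database",
  "persist", "record", "track", "remember", "catalog", "index",
];

// Inflections accepted after a keyword (assumption, not exhaustive).
const SUFFIXES = ["", "s", "d", "ed", "es", "ing"];

// Returns the persistence keywords that appear in the spec text.
function persistenceKeywordsIn(spec) {
  const words = spec.toLowerCase().match(/[a-z]+/g) ?? [];
  return PERSISTENCE_KEYWORDS.filter((keyword) =>
    words.some(
      (word) =>
        word.startsWith(keyword) &&
        SUFFIXES.includes(word.slice(keyword.length)),
    ),
  );
}
```

A non-empty result is a reason to ask the data storage questions, not proof that the tool needs a database.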
Conditional: Config File Round
Only surface when the tool has many non-secret settings that users would want to persist across runs.
- Should the tool support a config file? Yes (cosmiconfig: `.toolrc`, `tool.config.js`, etc.) / No (CLI flags only). Recommend yes if >3 persistent settings.
Conditional: OAuth Round
Only surface when the probe detects OAuth signals (existing src/auth.*, open in deps, CLIENT_ID in .env) OR the spec mentions OAuth authentication with an external API that requires user consent (not just a static API key).
OAuth keywords: oauth, login, authenticate, authorization, consent, token, refresh token, access token, redirect, callback URL.
- Does this tool need OAuth authentication? Yes / No (env var API key is sufficient). If the spec just mentions an API key, OAuth is not needed — env vars handle it.
- OAuth provider? (e.g., Reddit, Google, Spotify, GitHub). Needed to document the authorization and token endpoints.
- Token storage tier? Tier 1: refresh token in .env (recommended; simple, gitignored) / Tier 2: refresh token encrypted in SQLite (for shared machines or compliance) / Let user choose at runtime during `auth login`.
- Cache access tokens? Yes, in SQLite with expiry (recommended for frequent invocations; skips network call when token is valid) / No, in-memory only (re-mint each run from refresh token; more secure but slower).
The OAuth flow uses Authorization Code with a temporary localhost redirect server (like gh auth login). The scaffold adds src/auth.js, src/token-store.js, and src/commands/auth.js with auth login, auth status, and auth logout subcommands.
If OAuth is active, it may independently trigger SQLite as a dependency (for Tier 2 or access token caching). When both OAuth and the data storage layer are active, they share the same SQLite database.
Confidence check after each round
After each round, ask yourself: "If I started writing files now, what could go wrong because of something I don't know?"
- If "not much" → proceed to Step 4.
- If meaningful unknowns → ask one more focused round.
- If 3 rounds done and still under 95% → the spec may need a longer conversation or a different approach. Tell the user rather than guessing.
Recommendation heuristics
| Area | Heuristic |
|---|---|
| Identity | Kebab-case from tool name; honor existing package.json name |
| Scope | Narrowest scope that achieves the spec. Out-of-scope wins ties |
| Security | Env vars for secrets, never in stdout. Strictest reasonable approach |
| Verification | Integration tests for CLI behavior, unit tests for library code, both when crossing the boundary |
| Goal slicing | Single goal unless AC clearly spans unrelated subsystems |
| Language | JS default; TS when user mentions types or when adding to a TS project |
| Framework | commander for ≤10 commands; yargs for complex trees; respect existing choice |
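The Framework row, for example, might be encoded as follows. `recommendArgParser` is a hypothetical name; the list of known parsers comes from the probe table, and treating "more than 10 commands" as a reason to suggest yargs is an assumption layered on the heuristic.

```javascript
const KNOWN_PARSERS = ["commander", "yargs", "meow", "citty"];

// `dependencies` is the dependencies object from package.json (may be empty).
function recommendArgParser({ dependencies = {}, commandCount, complexTree = false }) {
  // Respect an existing choice before recommending anything.
  const existing = KNOWN_PARSERS.find((name) => name in dependencies);
  if (existing) return existing;
  if (complexTree || commandCount > 10) return "yargs";
  return "commander";
}
```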
Step 4 — Optionally scaffold
If the probe found a new CLI (no existing project structure) and the user agrees, create missing project files. Templates are in references/scaffold-templates.md. Read that file when scaffolding.
Default scaffold (JavaScript):
bin/<tool-name>.js shebang + import
src/cli.js Commander setup, --json, --version, TTY detection
src/config.js loadConfig() from env vars via dotenv
src/errors.js AppError with code, message, suggestion, exitCode
src/commands/ one file per subcommand
tests/cli.test.js smoke tests
.env.example documented env var template
.gitignore node_modules, .env, dist/, *.db
package.json bin, type:module, engines, scripts, deps
AGENTS.md stack, conventions, canonical commands
CLAUDE.md behavioral snippet: when/how AI agents use each command
TypeScript additions (when TS is chosen):
- `tsconfig.json` (strict, nodenext, es2022, outDir: dist)
- `bin/` imports from `dist/` instead of `src/`
- `build` and `typecheck` scripts added to package.json
- `typescript` added to devDependencies
- `files` field set to `["bin", "dist"]`
Conditional additions:
- `src/db.js`: when data storage round chose SQLite (better-sqlite3 singleton, WAL mode, foreign keys, `initSchema()`)
- cosmiconfig integration in `src/config.js`: when config file round chose yes
- `src/auth.js`: when OAuth round confirmed (OAuth Authorization Code flow with local redirect server)
- `src/token-store.js`: when OAuth round confirmed (token storage with optional AES-256-GCM encryption)
- `src/commands/auth.js`: when OAuth round confirmed (`auth login`, `auth status`, `auth logout` subcommands)
Collision handling: Never overwrite without explicit user confirmation. Exceptions:
- `.gitignore`: offer to merge missing lines
- `AGENTS.md` and `CLAUDE.md`: offer to append rather than overwrite
- Existing CLI path: skip scaffold entirely, but still generate CLAUDE.md and AGENTS.md updates
Step 5 — Write the goal trio
Generate three files at goals/<slug>/. Templates live in:
- `references/goal-template.md`: full GOAL.md template (16 sections with CLI defaults and AI-agent patterns)
- `references/verify-template.md`: full VERIFY.md template (host-native verification: binary smoke, functional, exit code, integration checks)
- `references/progress-template.md`: initial PROGRESS.md skeleton
Read these reference files when generating. Substitute placeholders with everything confirmed in Step 3 and observed in Step 1.
Filling rules
- Spec placeholders (e.g., `{{Tool Name}}`, `{{slug}}`, `{{AC ID}}`): replace with concrete values from Step 3 answers and Step 1 probe. Never leave one in.
- Audit placeholders in VERIFY.md evidence format section and PROGRESS.md final evidence block: leave as empty cells for /goal to fill at completion time.
- If a whole section doesn't apply (e.g., "Data / Migration Requirements" for a tool with no persistence), write `Not applicable.` plus a one-line reason rather than deleting the section, because `/goal`'s completion audit expects the section to exist.
- For every AC, give it a stable ID like `AC-001.1`. The audit table in GOAL.md and the evidence table in PROGRESS.md reference these IDs.
- Match the user's stated scope literally. If they said "read-only tool," omit `--dry-run` from the AI-agent patterns and note why.
- Before writing, scan the generated text for any remaining `{{` outside of code-fenced template-instruction comments. If you find any, you missed something; fix it.
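The final scan can be sketched as a line-by-line pass that ignores fenced blocks. `findUnfilledPlaceholders` is a hypothetical name, and skipping every fenced block (rather than only template-instruction comments) is a simplifying assumption.

```javascript
// Three backticks, built with repeat() so this sample does not close its own fence.
const FENCE = "`".repeat(3);

// Returns the 1-based line numbers that still contain "{{" outside fenced blocks.
function findUnfilledPlaceholders(text) {
  const hits = [];
  let inFence = false;
  text.split("\n").forEach((line, i) => {
    if (line.trimStart().startsWith(FENCE)) {
      inFence = !inFence; // opening or closing fence line
      return;
    }
    if (!inFence && line.includes("{{")) hits.push(i + 1);
  });
  return hits;
}
```

An empty array means the generated file is ready to write; any other result points at the lines that still need values.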
Host execution rule
Every command in the goal trio runs directly on the host. There is no container layer. The binary is invoked as node bin/<name>.js (or just <name> if the package is linked). Tests run with npm test. No routing decisions needed.
Simplified sandbox rule
If the harness running /goal sandboxes shell access:
- Local-only tools (no network, no external services) → no sandbox bypass needed.
- API-calling tools → network access to target API hosts.
- Mock-tested API tools → localhost binding on the test port.
- OAuth tools → network access to the provider's authorization and token endpoints, plus localhost binding for the redirect server (default port 8910).
No Docker socket, no playwright browser, no container filesystem concerns.
Test artifact convention
Everything lives on the host:
- Test specs → `tests/`
- Fixtures → `tests/fixtures/`
- Goal verification scripts → `goals/<goal-dir>/checks/`
- Test output artifacts → `goals/<goal-dir>/test-artifacts/`
Step 6 — Hand off
After writing the files, print:
Generated:
goals/<slug>/GOAL.md
goals/<slug>/VERIFY.md
goals/<slug>/PROGRESS.md
To run with Codex /goal:
/goal Complete goals/<slug>/GOAL.md. Use goals/<slug>/VERIFY.md as the verification contract. Update goals/<slug>/PROGRESS.md continuously. Treat uncertainty as incomplete.
How to run:
1. Open Codex inside this repo: codex
2. (Optional) /plan Read goals/<slug>/GOAL.md and VERIFY.md and propose an implementation plan.
3. Paste the /goal command above.
4. Review changes via `git diff` before committing.
If scaffold files were created in Step 4, list them above the goal trio under Scaffolded:.
If goals-plan.md was written in Step 2, append a Next slice: note.
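The tailored command is plain string assembly from the slug. A minimal sketch, with `buildGoalCommand` as a hypothetical helper name:

```javascript
// Builds the /goal command line for a given goal slug.
function buildGoalCommand(slug) {
  const dir = `goals/${slug}`;
  return [
    `/goal Complete ${dir}/GOAL.md.`,
    `Use ${dir}/VERIFY.md as the verification contract.`,
    `Update ${dir}/PROGRESS.md continuously.`,
    "Treat uncertainty as incomplete.",
  ].join(" ");
}
```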
New CLI vs. existing CLI
| Scenario | Detection | Behavior difference |
|---|---|---|
| New CLI | No `bin` field in package.json, no `src/cli.*` | Full scaffold offered, all questions asked |
| Existing CLI | `bin` field exists, entry points found | No scaffold, questions focus on the new command/feature, probe informs existing conventions to respect |
For existing CLIs:
- Respect existing arg parsing library (don't recommend commander if yargs is already in use)
- Respect existing module system (don't ask ESM/CJS)
- Respect existing test framework
- Focus questions on the new command's scope, I/O contract, and acceptance criteria
- Still generate the full goal trio, but GOAL.md "Source of Truth" references existing files
Stop conditions
Stop and ask the user instead of guessing if:
- The spec implies an external API with credentials not described.
- The spec implies a database schema but the data model isn't clear.
- The slug or project folder collides with something in the repo.
- More than 3 question rounds and confidence still under 95%.
- The probe state contradicts the user's spec (e.g., "new CLI" but `bin/` already exists).
- The spec describes a daemon, server, or long-running process (not a CLI tool).
- The spec requires platform-specific behavior (Windows services, macOS keychain).
What this skill does not do
- It does not implement the CLI tool. That's `/goal`'s job.
- It does not run tests. That's `/goal`'s job.
- It does not work for non-Node.js projects (Python CLIs, Go CLIs, etc.).
- It does not work for web apps, servers, or daemons.
- It does not fix bugs. For bug fixes, use the bug-fix templates in `codex-goal-templates.md` directly.
- It does not handle publishing readiness (npm publish config, GitHub releases, etc.).