name: create-cli description: > This skill should be used when the user asks to "design a CLI", "help me design command-line flags", "what flags should my tool have", "create a CLI spec", "refactor my CLI interface", "design a CLI my agent can call", or wants to design command-line UX (args/flags/subcommands/help/output/errors/config) before implementation or audit an existing CLI surface for consistency and composability. argument-hint: "[tool-name and one-line description]" allowed-tools: - AskUserQuestion - Read - Glob - Grep - Bash - Write
Create CLI
Design CLI surface area (syntax + behavior), agent-aware, human-friendly.
Phase 1 — Prepare
Read ${CLAUDE_PLUGIN_ROOT}/skills/create-cli/references/cli-guidelines.md. Apply it as the default CLI rubric, including the Agent Ergonomics section.
For new CLI designs, also read ${CLAUDE_PLUGIN_ROOT}/skills/create-cli/references/language-selection.md to inform the language recommendation in Phase 2. Skip it for audits — the language is already chosen.
Proceed when cli-guidelines.md is loaded.
Phase 2 — Clarify
Determine whether this is a new design or an audit from the user's trigger.
New design
Ask, then proceed with best-guess defaults if user is unsure:
- Command name + one-sentence purpose.
- Primary consumer: agent/LLM, human at a terminal, scripted automation, or mixed.
- Input sources: args vs stdin; files vs URLs; secrets (never via flags).
- Output contract: human text by default,
--jsonfor structured output, exit codes. - Backend/data layer: does it wrap an existing API? Source of the surface — OpenAPI/docs, live URL, or HAR capture (undocumented/browser-fed)? (cli-guidelines.md → Wrapping an existing API.)
- Interactivity: prompts allowed? need
--no-input? confirmations for destructive ops? - Config model: flags/env/config-file; precedence; XDG vs repo-local.
- Language & distribution: ask for the user's preferred implementation language, or offer to recommend one. Ask whether a single binary (no runtime needed on target machine) is required, or whether a runtime dependency is acceptable. Apply language-selection.md to recommend if the user is unsure. Platform: macOS/Linux/Windows.
If an existing CLI spec or tool description is provided, read it first — skip questions already answered by it.
Data Layer gate — required when the tool reads from a backend. Run the Data Layer Decision
scorecard (cli-guidelines.md → Stateful CLIs); record adopt cache or stateless + a one-line
rationale. Decide this before drawing the command tree — it changes whether sync/local-read
subcommands exist at all.
Audit
Ask:
- CLI name and source location (repo path, or provide
--helpoutput). - Primary consumer: agent, human, or mixed.
- Known pain points or specific areas to focus on.
Then explore the codebase: use Glob/Grep to find command definitions, flag registrations, output formatting, and error handling. Run <cli> --help via Bash to capture actual behavior.
Proceed when answers are confirmed or user is unsure — use best-guess defaults.
Phase 3 — Conventions
Apply the conventions from cli-guidelines.md (loaded in Phase 1), including the Agent Ergonomics section. The rules below are the key conventions to enforce — cli-guidelines.md provides the full rubric for edge cases.
If primary consumer is human-only, the Errors and Reduce Tool Calls subsections are optional — apply them only if the user wants script-friendliness.
Output
- Default output is human-readable text; an explicit
--json/--plainflag sets the data format and overrides TTY state. Cosmetics (color, spinners) may TTY-detect; the data format must not rely on it (see cli-guidelines.md → Agent Ergonomics for the TTY/PTY rationale). - List commands in
--jsonmode use NDJSON (one JSON object per line) — enables streaming andjqpiping without buffering. For paginated results with metadata, a JSON object with anitemsarray is acceptable. If the CLI extends an existing ecosystem that uses JSON arrays (kubectl, aws, gh), match the ecosystem convention. - Primary data to stdout; diagnostics/errors to stderr.
- Suppress ANSI codes, progress spinners, and decorative output when
--jsonis passed or when stdout is not a TTY.
Errors (agent/mixed consumers only)
- When
--jsonis active, emit error objects on stderr:{"error": "<snake_case_code>", "message": "...", "hint": "<exact CLI invocation or null>"}— so agent callers can route recovery logic without parsing free-text stderr. Thehintfield must be an executable command, not prose. - Exit codes:
0success,1runtime error,2invalid usage; for agent/mixed consumers, extend with the typed table from cli-guidelines.md → Exit codes (typed) (3not-found,4auth,5upstream,7conflict) — apply identically across all subcommands so agents branch on the code.
Flags
-h/--helpalways shows help; ignores other args.--versionprints version to stdout.--jsonpreferred for structured output.--output json/-o jsonacceptable when the CLI needs multiple output formats (yaml, table, csv) under a single flag. Pick one and apply consistently.- For commands an agent calls in a loop, offer
--compact(opt-in): same JSON shape, minimal whitespace, essential fields only —--jsonstays the full-fidelity default. See cli-guidelines.md → Output defaults. - Consistent flag names across all subcommands for the same concept (
--id,--force,--json) — agents learn the naming pattern once and apply it everywhere without guessing. - Prompts only when stdin is a TTY;
--no-inputdisables prompts.--non-interactiveacceptable if the ecosystem already uses it. - Destructive operations: interactive confirmation; non-interactive requires
--force. - Respect
NO_COLOR,TERM=dumb; provide--no-color. - Handle Ctrl-C: exit fast; bounded cleanup; crash-only when possible.
Reduce Tool Calls (agent/mixed consumers only)
- Compound output: operations return enough data to avoid a follow-up call.
createreturns the new resource's ID and key fields.deleteechoes what was removed. - Rich JSON defaults: in
--jsonmode, return full objects not just IDs. - Bounded lists: list commands default to a safe limit (e.g., 50 items) with
--limitto adjust. In JSON mode, includehas_more(bool) and optionallynext_cursorfor keyset pagination. Unbounded output wastes tokens and risks context overflow for agent callers. - Idempotent by default: where possible, commands are safe to repeat; document preconditions explicitly — agents rely on safe retries for error recovery without human intervention.
Apply all applicable conventions, then proceed to Phase 4.
Phase 4 — Deliver
Audits
Evaluate the existing CLI against every Phase 3 subsection. For each convention, state: what the CLI does today, whether it conforms, and what to change. Also check:
- Flag naming consistency across subcommands.
- Help text quality (examples present, common flags first, fits one screen).
- Config precedence (flags > env > project config > user config > defaults).
- Destructive-op safety (confirmations, --force, --dry-run).
- Shell completion availability.
- Data layer fit: if the CLI reads from a backend, run the Data Layer Decision scorecard; flag when it re-fetches live data that a local cache + compound queries would serve (cli-guidelines.md → Stateful CLIs).
Produce a gap report organized by severity: Breaking (requires API change), Major (agent-breaking or convention violation), Minor (cosmetic/polish). Each finding: current behavior, convention violated, recommended fix with migration risk (none/low/breaking).
New designs
Produce a compact spec the user can implement. Include all relevant sections:
- Command tree + USAGE synopsis.
- Args/flags table (types, defaults, required/optional, examples).
- Subcommand semantics (what each does; idempotence; state changes).
- Output rules: stdout vs stderr;
--jsonfor structured output;--quiet/--verbose. - Error + exit code map (top failure modes).
- Safety rules:
--dry-run, confirmations,--force,--no-input. - Config/env rules + precedence (flags > env > project config > user config > system).
- Data layer decision:
adopt cache|statelessverdict + rationale (required when the tool reads from a backend). - API provenance (when wrapping an existing API): source + endpoint→command mapping; for HAR, the secret/auth/coverage/fragility/ToS checks.
- Shell completion story (if relevant): install/discoverability; generation command or bundled scripts.
- 5–10 example invocations (common flows; include piped/stdin examples).
Use this skeleton, dropping irrelevant sections:
- Language & distribution:
Go·cobra· single binary ·goreleaserfor CI (Omit if language was not determined.) - Name:
mycmd - One-liner:
... - USAGE:
mycmd [global flags] <subcommand> [args]
- Subcommands:
mycmd init ...mycmd run ...
- Global flags:
-h, --help--version-q, --quiet/-v, --verbose(define exactly)--json(structured JSON output; NDJSON for list commands)
- I/O contract:
- stdout:
- stderr:
- Exit codes:
0success1generic failure2invalid usage (parse/validation)- (add command-specific codes only when actually useful)
- Env/config:
- env vars:
- config file path + precedence:
- Data Layer Decision (required when the tool reads from a backend; omit if no backend):
adopt cache|stateless+ one-line rationale. Ifadopt:synccommand, local schema sketch, local-vs-live reads, write invalidation note. - API Provenance (only when wrapping an existing API; omit for from-scratch tools): source (OpenAPI/URL/HAR); subcommand ← method+path mapping; for HAR, the five checks.
- Examples:
- …
See ${CLAUDE_PLUGIN_ROOT}/skills/create-cli/examples/example-cli-spec.md for a complete worked example.
If the spec is destined for a skill body or CLAUDE.md, omit unused sections entirely (do not mark them "N/A") and limit examples to ≤5 invocations that each demonstrate multiple patterns.
Phase 5 — Verify
For new specs: confirm the spec covers all applicable sections from the Phase 4 skeleton. Verify the examples section demonstrates at least: --json output, error recovery (if agent/mixed consumer), and one piped/stdin usage.
For backend/API-wrapping specs: confirm a Data Layer Decision verdict is recorded with a rationale (not left implicit), and — when the source is a HAR/undocumented API — that no secret values appear anywhere in the spec (only env-var names) and the coverage/fragility caveats are stated.
For audits: confirm the gap report addresses every Phase 3 subsection and includes at least one example invocation showing the recommended fix for each Major finding.
Skill is complete when verification passes.
Notes
- Once language is selected (Phase 2), include the idiomatic parsing library in the spec (see language-selection.md). If language remains undetermined, omit the library.
- If the request is "design parameters", do not drift into implementation.