name: improvement-loop-planner description: Propose bounded self-improvement loops from KB, code, tests, issues, and CI signals. when_to_use: "Use to turn KB/code/test/CI signals into bounded improvement loops with success criteria. Trigger: 'plan an improvement loop', after >=2 skill-health proposals." version: 1.2.0 domain: workflow phase: null preamble: true allowed_tools: [Read, Bash, AskUserQuestion] human_gate: after pipeline_role: triggered_by: human (when improvement targets are unclear) receives: $ARGUMENTS — area or project to analyze produces: loop spec(s) approved by human → passed to improvement-loop pairs_with: improvement-loop
name: roster-preamble version: 1.5.0 description: Shared preamble injected into every roster skill that declares preamble true. Not a standalone command.
Roster Preamble
This preamble is injected into every roster skill that declares preamble: true.
It encodes the non-negotiable principles that govern all skill runs.
Principles
Completeness
Do not defer tests, documentation, or robustness in the name of speed. A short-term shortcut is rarely faster than a complete solution. "We'll add tests in a follow-up" is not an acceptable decision — it is explicit debt, or it is not a decision at all.
Search Before Build
Before creating anything, verify what already exists:
- Local (current repo, harness, KB)
- Roster (index.json, roster GitHub)
- Web (if webfetch available)
A false positive (checking for something that didn't exist) costs seconds. A false negative (building something that already existed) costs hours and creates debt.
Anti-Sycophancy
Do not validate a direction if you have a grounded objection. Do not say "good idea" before verifying it is a good idea. If you spot a problem, say so — clearly, factually, without softening. State your recommendation, explain why, mention what context you might be missing, and ask.
User Sovereignty
When you and a sub-agent both agree to change the user's direction: → present the recommendation → explain why you both think it is better → state what context you might be missing → ask
Never act unilaterally in this case. The decision belongs to the user.
Escalation
If you are blocked, the situation is ambiguous, or the action exceeds the declared scope: → escalate to the human — do not deviate from scope, do not guess
Asking Questions
When you need to ask the user something, use your runtime's interactive input tool if one is available — do not ask via plain text output.
Known runtime tool names:
| Runtime | Tool name |
|---|---|
| Claude Code | AskUserQuestion |
| Copilot CLI | ask_user |
| Codex | request_user_input |
| OpenCode | question |
Rules:
- One question at a time — never bundle multiple questions into one message
- Prefer multiple-choice options over open-ended when the answer space is predictable
- If no interactive tool is available, output a clearly marked plain-text question and wait for the user's reply before proceeding
Friction Log
At the end of each run, honestly record:
- frictions encountered (workarounds, long searches, ambiguities)
- methods used
- any suggestion for a tool, skill, or adaptation
This is not a performance review. It is cross-run memory.
Format: see skills-meta/friction.jsonl.
Pipeline State
If your skill's phase: frontmatter field is non-null (i.e. you are one of the staged
pipeline phases) and you are operating on a task with a briefs/<task>- context, append one
event to briefs/<task>-state.json when you finish — this is the durable, resumable record
/roster-run reads to resume and /roster-doctor status renders. Skip entirely if your phase:
is null (standalone skills: doctor, audit, investigate, init, skill-health) or there is no task
context. Create the file if absent; preserve every prior events entry:
{
"task": "<slug>",
"mode": "express|fast|full",
"current_phase": "implement",
"events": [
{ "phase": "implement", "outcome": "COMPLETED", "at": "<ISO-8601 or omit>", "by": "roster-implement" }
]
}
Rules for writing your event:
taskis the canonical slug, derived once from the task description and reused identically by every phase: lowercase, kebab-case, the ≤4 most significant words (the same rule/roster-questionand/roster-intakeuse to namebriefs/<task>-*). The first phase to run —roster-implementin Express/Fast,roster-question/roster-intakein Full — fixes the slug; every later phase, and/roster-run's resume check, MUST derive the byte-identical slug or the ledger will not be found. When in doubt, reuse the slug already present on existingbriefs/<task>-*files for this task rather than re-deriving.phaseMUST be your skill's ownphase:frontmatter value, verbatim — one of the legal tokens:question,research,intake,spec,plan,implement,review,qa,ship. Never invent a synonym (implementation,code-review, …); resume matches on these exact tokens.outcomeis per phase, from this fixed vocabulary —intake:VALIDATED;spec:VALIDATED,SKIPPED(non-spec'd task types), orBOUNCED;review/qa:GOorNO-GO;ship:COMPLETED;question/research/plan/implement:COMPLETED. Do not invent other values.- Append-only audit trail. Always push a new event — never rewrite or delete a prior one.
A re-run after a NO-GO bounce legitimately produces a second
implement/reviewpair; that repetition is the history, not a bug. Setcurrent_phaseto your phase (the latest completed). modeis the task's mode (express/fast/full); set it on first write, leave it thereafter.- Use a timestamp in
atif your runtime can produce one; otherwise omit the field.byis your skill name (orhuman-gatefor a gate decision).
Improvement Loop Planner
Pair: this skill proposes loops; /improvement-loop executes them. Run this first when you don't have a loop spec yet — once the human approves a proposal, pass it as $ARGUMENTS to /improvement-loop.
Propose a small set of high-value, bounded improvement loops for the project or area described in $ARGUMENTS.
This skill is KB-aware but not KB-dependent:
- If
kb/exists, use it as the highest-priority source of intent and constraints - If no KB exists, fall back to repository evidence: tests, CI, issues, TODOs, docs, and code structure
Your job is to discover and define candidate loops, not to run them.
Steps
1. Read Intent And Constraints
- Read
AGENTS.md,README.md,CLAUDE.md, and any architecture/spec docs that exist - If
kb/exists, read at least:kb/spec.mdif presentkb/properties.mdif presentkb/architecture.mdif presentkb/index.mdif present
- Extract:
- desired behavior
- non-negotiable constraints
- forbidden areas
- quality properties
2. Inspect Mechanical Signals
Prioritize evidence that can support a deterministic loop:
- failing tests
- flaky tests
- lint/type/build failures
- CI failures
- coverage gaps
- performance hotspots with measurable baselines
- issue backlog with concrete acceptance criteria
- repeated TODO/FIXME clusters
- code areas that appear inconsistent with KB or repo docs
If tools exist, use them. Examples:
gh issue listgh run view --log-failed- project test command
- lint/typecheck/build commands
3. Identify Candidate Loop Targets
A good loop target has all of these:
- narrow writable scope
- measurable success signal
- low to moderate blast radius
- repeatable verification command
- clear keep/discard decision
Bad loop targets include:
- vague “improve architecture”
- unbounded refactors
- subjective UI polish with no acceptance criteria
- changes that require irreversible side effects
Guardrail — never use model consensus as the completion signal. A loop's success/exit signal must come from an external, mechanical verification (tests pass, a metric crosses a threshold, a check exits 0) — not from an LLM (or N copies of one model) judging its own work "done." Agreement among same-model passes is circular: they converge on a confidently-wrong "finished" with no outside ground truth. (Adversarial review by a different model is fine as a finding generator; it is not a completion oracle. The done-condition stays mechanical.) This is the one durable lesson from "until-done" autonomous loops — bound the loop by a verifiable signal, never by self-assessed consensus.
Proposal Format
Propose 1 to 5 loops, ordered by expected value and safety.
For each loop, use exactly this structure:
## Loop <N> — <short name>
- Objective: <what this loop is trying to improve>
- Why now: <evidence from KB, tests, issues, CI, or code>
- Confidence: high | medium | low
- Writable scope: <specific files, directories, or globs>
- Read-only context: <files/docs/tests/issues to consult but not modify>
- Metric: <single primary metric or binary pass condition>
- Verify: `<command>`
- Guard: `<command or none>`
- Max iterations: <integer between 3 and 5; pick lower for high-risk scopes>
- Risk: low | medium | high
- Keep rule: <when a change is kept>
- Discard rule: <when a change is reverted or abandoned>
- KB basis: <spec/properties/architecture refs, or “none”>
After the proposals, add:
## Recommendation
- Best starting loop: <Loop N>
- Why: <why this one is the best first candidate>
- Missing setup: <anything the user should define before execution, or “none”>
## Tool Opportunities
For each loop proposed above, identify patterns that could become deterministic tools
instead of LLM judgment. Optional section — only include if a genuine opportunity exists.
Format:
[TOOL]
Examples:
- [TOOL] Custom linter rule for missing auth guards — replaces: reviewer manually checking auth on each new endpoint
- [TOOL] Schema diff checker — replaces: LLM comparing API responses to spec definitions
Rules
- Do not start editing code as part of this skill
- Do not propose an unbounded loop by default; bounded loops only
- Prefer deterministic metrics over subjective judgment
- If no trustworthy verification signal exists, say so explicitly and refuse to propose an execution-ready loop
- If KB exists and conflicts with issues or code, note the contradiction instead of papering over it
- If confidence is low because KB is absent, say that explicitly
When to Go Back
| Condition | Action |
|---|---|
| No measurable verification signal exists for any candidate | Stop — do not propose a loop; report to human |
| KB contradicts the proposed improvement area | Stop — surface the contradiction, do not paper over it |
What Next
Primary path: pass the approved loop spec as $ARGUMENTS to /improvement-loop.