name: retro description: Review the current session, surface process improvements and tangential issues, and create follow-up tasks
Retrospective Skill
Reviews the current conversation history to capture process learnings, instruction improvements, and tangential issues. Creates structured follow-up tasks so nothing falls through the cracks.
Use
/create-taskfor task creation — handles decomposition, deduplication, criteria, and deps. Usetusk task-insertonly for bulk/automated inserts.
Step 0: Setup
Fetch config, backlog, then determine retro mode. Also capture the most recent Done task's ID so the retro's cost can be attributed to it:
tusk "SELECT id, complexity FROM tasks WHERE status = 'Done' ORDER BY updated_at DESC LIMIT 1"
tusk setup
Parse the JSON from tusk setup: use config for metadata assignment and backlog for duplicate comparison.
Store the id from the first query as RETRO_TASK_ID — it's the single just-closed task this retro is reviewing. Then start cost tracking:
tusk skill-run start retro --task-id $RETRO_TASK_ID
This prints {"run_id": N, "started_at": "...", "task_id": N}. Capture run_id — it's referenced by every exit path below. If the query returned no rows (no Done tasks exist yet), omit --task-id:
tusk skill-run start retro
Early-exit cleanup: If any step below causes the retro to stop before reaching the final report (LR-3 for lightweight, Step 6 for full), first call
tusk skill-run cancel <run_id>to close the open row, then stop. Otherwise the row lingers as(open)intusk skill-run listforever.
Step 0b: Cross-retro theme check
Fetch themes recurring 3+ times across approved findings in the last 30 days so this retro can see patterns that any single retrospective misses:
tusk retro-themes --window-days 30 --min-recurrence 3
The output is pre-aggregated {theme, count} tuples — do not issue separate SQL queries against retro_findings. All cross-retro aggregation belongs in SQL behind tusk retro-themes; /retro consumes only the tuple stream.
If themes is empty, skip — the current session stands alone. If any tuple is returned, store the list as $RECURRING_THEMES and use it in LR-1 (or Step 3 in FULL-RETRO) to flag recurrences: when this session surfaces a finding whose summary text contains a recurring theme, note the recurrence (e.g. theme 'manifest' recurring — seen N times in last 30 days) next to that finding in the report. Themes are normalized topic terms extracted from retro_findings.summary (issue #551), not single-letter category codes.
- XS or S → follow the Lightweight Retro path below
- M, L, XL, or NULL → read the full retro guide:
Then follow Steps 1–6 from that file. Do not continue below.Read file: <base_directory>/FULL-RETRO.md
Lightweight Retro (XS/S tasks)
Streamlined retro for small tasks. Skips subsumption analysis and dependency proposals.
LR-1: Review & Categorize
Read mid-task jots first. If RETRO_TASK_ID is set, fetch any friction notes captured during the task via tusk jot:
tusk jots --task-id $RETRO_TASK_ID
The output is an array of {id, skill_run_id, task_id, category, note, file_hint, skill_hint, created_at} rows. Each jot is a pre-classified finding candidate captured at the moment of friction — treat its category as a strong hint when bucketing into the categories below, and quote the note verbatim in the finding's summary. Jots are the highest-fidelity input to retro because they were not reconstructed from memory at close time. Empty array → no jots were filed; proceed using conversation context alone.
Check for custom focus areas first. Attempt to read <base_directory>/FOCUS.md.
- If the file exists: use the categories defined in it for the analysis below.
- If the file does not exist: use the default categories:
- Category A: Process improvements — friction in skills, AGENTS.md, tooling
- Category B: Tangential issues — bugs, tech debt, architectural concerns discovered out of scope
- Category C: Follow-up work — incomplete items, deferred decisions, edge cases
- Category D: Lint Rules — concrete, grep-detectable anti-patterns observed in this session (max 3). Only include if an actual mistake occurred that a grep rule could prevent — e.g., calling a deprecated command, using a wrong pattern in a specific file type. Do NOT include general advice or style preferences.
- Category E: Debugging Velocity — only if the session involved fixing a bug or diagnosing unexpected behavior. Reflect on: (1) what information was missing that delayed diagnosis; what tool, log, or trace would have surfaced the root cause immediately; whether a test would have caught this before it became a bug. (2) Did fixing this bug change the conditions under which adjacent issues matter? (e.g., removing noise that was masking a separate signal, raising the quality bar in a way that exposes nearby gaps.) If so, those adjacent issues are in scope for this category even if they predate the session — "predated the session" is not sufficient grounds for dismissal when the fix elevated their relevance. If no bug was present, this category is empty. Findings must be concrete (tasks or skill/AGENTS.md patches) — not generic advice like "add more logging."
Analyze the full conversation context using the resolved categories.
If all categories are empty, run tusk skill-run cancel <run_id>, report "Clean session — no findings" and stop. (Config and backlog were already fetched in Step 0 — no additional work needed.)
LR-1b: Classify Each Finding
For each finding, determine whether it is a tusk-issue or a project-issue:
- tusk-issue — a bug, limitation, or improvement in tusk itself: the CLI, a skill, DB schema, or installed tooling (e.g., a skill instruction is confusing, a
tuskcommand misbehaves, a missing feature in the tool) - project-issue — specific to the current project: its code, architecture, conventions, or processes
Label each finding with its classification. This drives the routing in LR-2.
LR-2: Create Tasks / File Issues (only if findings exist)
Compare each finding against the backlog for semantic overlap (use
backlogfrom Step 0). Drop any already covered.Run heuristic dupe check on surviving findings:
tusk dupes check "<proposed summary>"Present findings and proposed actions in a table (include the classification from LR-1b). Wait for explicit user approval before acting.
For each approved finding, route based on its LR-1b classification:
tusk-issues — file a GitHub issue via:
tusk report-issue --title "<finding title>" --cluster triage-needed --context "<finding description>"Choose a more specific cluster (
worktree,merge,review-diff,summary,docs,test-precheck, orsmall-fix) when it is clear from the finding; usetriage-neededonly as the fallback. Do not calltusk task-insertfor tusk-issues. Track the count of issues filed for LR-3.Include a
## Failing Testsection in--contextwhenever a concrete test can be derived from the finding. This matters because/address-issueFactor 0 treats a missing failing test as the highest-priority signal to Defer — issues filed without one will be deprioritized automatically. Format:<finding description> ## Failing Test <shell command that currently fails or demonstrates the bug>If no concrete test exists (e.g. a pure UX or documentation finding), omit the section rather than fabricating one.
If
tusk report-issueexits non-zero (e.g.,$TUSK_GITHUB_REPOis unset orghCLI is unavailable), fall back to inserting a tusk task instead:tusk task-insert "<finding title>" "<finding description> [Note: GitHub issue could not be filed — report-issue failed]" \ --domain skills --task-type chore --priority Low --complexity XS \ --criteria "File a GitHub issue for this finding once $TUSK_GITHUB_REPO is configured"Note in LR-3 that the issue was tracked as a local task rather than filed on GitHub.
project-issues — For Category A and Category E approved findings, follow LR-2a below before inserting tasks. For all other project-issue findings, insert tasks now:
tusk task-insert "<summary>" "<description>" --priority "<priority>" --domain "<domain>" --task-type "<task_type>" --assignee "<assignee>" --complexity "<complexity>" \ --criteria "<criterion 1>" [--criteria "<criterion 2>" ...]Always include at least one
--criteriaflag — derive 1–3 concrete acceptance criteria from the task description. Omit--domainor--assigneeentirely if the value is NULL/empty. Exit code 1 means duplicate — skip. Skip subsumption and dependency proposals.
LR-2a: Skill-Patch for Category A and Category E Findings (only if Category A or Category E findings exist)
Before creating tasks for Category A (process improvement) or Category E (debugging velocity) findings, check if any can be applied as inline patches to an existing skill or AGENTS.md.
Initialize an empty list $AUTO_APPLIED for the LR-3 summary — auto-apply gate hits in step 3 below append one line per applied edit.
For each approved Category A finding:
Classify the finding as rule-like or narrative:
- Rule-like: a single heuristic, invariant, or convention about how code or processes should work — e.g., "always quote file paths in zsh", "always pass
encoding='utf-8'tosubprocess.run(text=True)". These belong in the conventions DB viatusk conventions add. - Narrative/reference: multi-step procedures, workflow descriptions, explanatory context, or anything that requires more than one sentence to express correctly. These belong as a patch to a skill file or as a prose addition to AGENTS.md.
- Rule-like: a single heuristic, invariant, or convention about how code or processes should work — e.g., "always quote file paths in zsh", "always pass
If the finding is rule-like — propose adding a convention via
tusk conventions add: a. Draft the exact convention text (one concise sentence) and a comma-separated list of relevant topic tags. b. Present the proposal with three options:Convention Proposal — [finding title]
tusk conventions add "[concise rule text]" --topics "[tag1,tag2]"approve — run the command now (no task created for this finding) defer — create a task with this command included in the description skip — create a generic task as usual
c. If approved: run the command now using Bash. Do not create a task for this finding. d. If deferred: include the proposed command verbatim in the task description when calling
tusk task-insert. e. If skipped: proceed to normal task creation (step 4 in LR-2).If the finding is narrative/reference — identify a target file:
- A skill name matching a directory in
.agents/skills/(list them withls .agents/skills/) - The string
AGENTS.md
If a target file is identified: a. Read the file (
Read .agents/skills/<name>/SKILL.mdorRead AGENTS.md) b. Produce a concrete proposed edit — the exact text to add, change, or remove. Show the specific diff, not a vague description.c. Auto-apply gate (only for skill-frontmatter edits). Before showing the three-option prompt, read the
retroconfig once:tusk config retroThe command prints the
retroJSON object (or exits non-zero / prints nothing when the key is unset). Ifretro.auto_applyis not exactlytrue— i.e. the key is missing, the value isfalse/null, ortusk config retroprinted nothing — skip this gate entirely and proceed to step 3d (three-option prompt). Behavior must match the pre-auto-apply flow exactly when the feature is disabled. Whenretro.auto_applyistrue, also readretro.auto_apply_max_chars(default200if unset) for the size budget below.If
retro.auto_applyistrue, the proposed edit qualifies for auto-apply only when ALL of the following hold:- Frontmatter-only: every changed line lies inside the YAML frontmatter block (between the opening
---and closing---at the top of.agents/skills/<name>/SKILL.md), and each changed line is either adescription:line or a#-prefixed comment line. Body changes (anything below the closing---),name:/allowed-tools:/ other frontmatter fields, andAGENTS.mdedits never qualify. - Size budget: the total character count of the diff (sum of
old_stringlength +new_stringlength, or for additions just thenew_stringlength) is strictly less thanretro.auto_apply_max_chars(default 200). - Additive or character-level: either (a) the change is a pure addition —
new_stringextendsold_stringwith appended content and contains no deletions — or (b) the change is character-level on a single line — exactly one frontmatter line is modified, and the modification only inserts or replaces characters within that line (no full-line removals, no multi-line removals).
If any condition fails, fall through to step 3d (three-option prompt). Do not partially auto-apply.
If all conditions hold and
retro.auto_applyis enabled, apply the edit immediately using the Edit tool — skip the three-option prompt entirely. Record a one-line entry in$AUTO_APPLIED(file path + brief description, one entry per line) for the LR-3 summary, and do not create a task for this finding. Proceed to the next finding.d. Otherwise, present the patch with three options:
Skill Patch Proposal — [finding title] File:
.agents/skills/<name>/SKILL.md- [existing text to replace] + [replacement text]approve — apply the edit now (no task created for this finding) defer — create a task with this diff included in the description skip — create a generic task as usual
e. If approved: apply the edit in-session using the Edit tool. Do not create a task for this finding. f. If deferred: include the proposed diff verbatim in the task description when calling
tusk task-insert. g. If skipped, or if no target file was identified: proceed to normal task creation (step 4 in LR-2).- A skill name matching a directory in
LR-2b: Apply Lint Rules Inline (only if lint rule findings exist)
Apply this step if there are lint rule findings — Category D when using defaults, or a "Lint Rules" section when using a custom FOCUS.md.
The bar is high — only proceed if you observed an actual mistake that a grep rule would have caught. Do not apply lint rules for general advice.
For each lint rule finding, attempt inline application first:
Present the proposed rule — show the exact command and ask for approval:
Found lint rule candidate: [finding description] Command:
tusk lint-rule add '<pattern>' '<file_glob>' '<message>'Apply this rule now? (Reversible withtusk lint-rule remove <id>.)If the user approves — run the command immediately:
tusk lint-rule add '<pattern>' '<file_glob>' '<message>'- Success: note the rule ID returned. Do not create a task for this finding.
- Error or unavailable: fall back to task creation (step 3).
If the user declines, or if inline application fails, create a task as a fallback:
tusk task-insert "Add lint rule: <short description>" \ "Run: tusk lint-rule add '<pattern>' '<file_glob>' '<message>'" \ --priority "Low" --task-type "<task_type>" --complexity "XS" \ --criteria "tusk lint-rule add has been run with the specified pattern, glob, and message"
For <task_type>: use the project's config task_types array (already fetched via tusk setup in Step 0). Pick the entry that best fits a maintenance/tooling task (e.g., maintenance, chore, tech-debt, infra — whatever is closest in your project's list). If no entry is a clear fit, omit --task-type entirely.
Fill in <pattern> (grep regex), <file_glob> (e.g., *.md or bin/tusk-*.py), and <message> (human-readable warning) with the specific values from your finding.
LR-3: Report
The /tusk skill already printed the task summary block (tusk task-summary <id> --format markdown) immediately before invoking /retro, so the user has already seen the canonical identity/cost/duration/diff/criteria rollup for the just-closed task. Do not re-emit that block here — start directly with the retrospective findings so the two sections read as one continuous report.
## Retrospective Complete (Lightweight)
**Session**: <what was accomplished>
**Findings**: X total (by category — use resolved category names)
**Created**: N tasks (#id, #id)
**GitHub issues filed**: N (tusk-issues routed via tusk report-issue — omit line if zero)
**Lint rules**: K applied inline, M deferred as tasks
**Auto-applied**: P frontmatter edits — <one entry per item from $AUTO_APPLIED, format: `path/to/SKILL.md (brief description)`> (omit line if P == 0)
**Skipped**: M duplicates
Then show the current backlog:
tusk -header -column "SELECT id, summary, priority, domain, task_type, status FROM tasks WHERE status = 'To Do' ORDER BY priority_score DESC, id"
LR-3a: Record approved findings for cross-retro theme detection
Before closing the skill run, write one retro_findings row per approved finding (task created, issue filed, lint rule added, convention added, or skill-patched inline). Skipped/duplicate findings are not recorded — only actioned ones feed the cross-retro signal. For each approved finding, run:
tusk retro-finding add \
--skill-run-id <run_id> \
--category '<category>' \
--summary '<one-line summary>' \
[--task-id <RETRO_TASK_ID>] \
[--action-taken '<action_taken>']
<action_taken> vocabulary (pick whichever fits; omit --action-taken if none do):
task:TASK-<id>— a new task was created viatusk task-insertissue:<url>— a GitHub issue was filed viatusk report-issuelint:<id>— a lint rule was added viatusk lint-rule addconvention:<id>— a convention was added viatusk conventions addskill-patch:<file>— an inline edit was applied to a skill or AGENTS.mddocumented— recorded without a concrete action (e.g. noted for context)
Omit --task-id entirely when no RETRO_TASK_ID was captured in Step 0 — the wrapper stores a real SQL NULL. Do not pass --task-id NULL or --task-id "". Text fields are passed as normal argparse arguments; no $(tusk sql-quote ...) is required. The wrapper validates skill_run_id (and task_id if supplied) as real FKs before the INSERT, so a typo'd id fails fast with exit 1.
Finally, close out the retro skill-run so its cost is captured:
tusk skill-run finish <run_id>
End of lightweight retro. Do not continue to FULL-RETRO.md.
Customization
To override the default analysis categories, create a FOCUS.md file in the skill directory (replace <base_directory> with the actual path shown at the top of the loaded skill — typically .agents/skills/retro):
cp .agents/skills/retro/FOCUS.md.example .agents/skills/retro/FOCUS.md
# Edit FOCUS.md to define your custom categories
A template is available at <base_directory>/FOCUS.md.example showing the default category format. Custom categories replace A–D. Include a "Lint Rules" section to retain lint-rule handling.
FOCUS.md is not part of the distributed skill and will not be overwritten by tusk upgrade.