task - SKILL.md Agent Skill

description: Work a task end-to-end with lean context gathering, implementation, and verification argument-hint: '[task description | issue id/link]' disable-model-invocation: true name: task metadata: skiller: source: .agents/rules/task.mdc

Work Task

Handle $ARGUMENTS. Start from the source of truth, load extra skills only when they earn their keep, and verify before calling the task done.

#$ARGUMENTS

Core Rules

Read the task source first.
Read local repo instructions and nearby implementation patterns before editing.
Search for existing patterns before inventing new ones.
Prefer the best durable ownership fix over the smallest local patch.
Prefer targeted tests and checks during iteration.
Keep the user updated at milestones.
Verify the actual result before claiming done.
Do not default to research swarms, review swarms, browser proof, PRs, tracker comments, or compounding.
Before calling a task blocked on a repo-wide gate, rule out local install corruption once when the failure smells wrong for the diff.

Intake

Classify the input:
- Plain task text: the user prompt is the source of truth.
- File path or spec path: read it first.
- GitHub issue URL: fetch it with gh issue view first.
- GitHub PR URL: fetch it with gh pr view first.
- Bare GitHub issue like #555: resolve it against the current gh repo first, then fetch it with gh issue view.
- Linear issue link/id: fetch it with the Linear integration first.
Read the full source-of-truth context before doing anything else.
For tracker items, also read comments and attachments when available.
If tracker evidence includes video or screen recording, load video-transcripts, use or create the shared transcript cache through that skill, and require normalized <video-transcripts> XML before implementation. If the helper cannot produce it after a real attempt, stop and report the blocker.
Classify task shape:
- Testing or coverage work.
- Program or batch work.
- Ordinary one-shot work: bug, feature, refactor, docs, review, or investigation.
Classify heavyweight work:
- Heavyweight: architecture or public API redesign, breaking changes, major cross-package refactors, benchmarking, profiling strategy, scalability work, framework comparison, migration analysis, RFCs, proposals, or spec-first major changes.
- Non-heavyweight: ordinary bugs, one-package features, docs-only edits, routine test work, small refactors, or normal issue execution.
If heavyweight, load major-task immediately and let it own workflow.
If non-heavyweight, classify complexity:
- Non-trivial: multi-step, research-heavy, phased, or likely more than a few tool calls.
- Trivial: quick question, small edit, or work that does not need persistent working memory.
If non-trivial and measurable/auditable:
- load autogoal before implementation
- create or update one docs/plans goal plan from the dominant-risk primary template plus touched-surface packs:
  - docs-dominant work: --template docs
  - other normal work: --template task
  - supporting docs touched: add --with docs
  - .agents/**, .claude/**, .codex/**, skills, hooks, commands, prompts, or user-action tooling touched: add --with agent-native
  - browser/UI route or interaction touched: add --with browser
  - package exports, public API, release artifacts, or package boundary touched: add --with package-api node .agents/skills/autogoal/scripts/create-goal-scratchpad.mjs --template <task|docs> --with <pack> --title "<short task title>"
- follow local repo overrides for where planning files live
If testing or coverage work, load testing before tdd and choose the smallest honest slice.
If program or batch work, restate the ordered scope and finish one slice at a time unless the user asked for a broader sweep.
For any tracker source, record source type/id/title, task type, acceptance criteria, caveats, likely files/routes/packages, browser surface, and likely root-cause layer in the plan when a plan exists.
If code will change, decide branch handling before edits using repo policy; do not reuse an unrelated branch just because it is checked out.
If anything important is still ambiguous after the source and nearby code pass, ask the smallest useful clarifying question.

Tracker Rules

Apply only when the source is a tracker item.

Treat the tracker item as the source of truth.
Use the native tool for fetch and sync-back: gh for GitHub, Linear integration for Linear.
If useful, rename the thread to <issue-number> <issue-title>.
Prefer PR before tracker comment for verified code-changing work unless blocked or the user said not to.
Comment back only after a meaningful outcome, or when a blocker note helps the tracker owner.
Do not require PR creation, screenshots, or comments for analytical, blocked, or inconclusive work.

Load Skills Only When Justified

Skill Diet

Default to task for normal work and major-task for heavyweight work. Load a niche skill only when it owns a hard domain gate, command, or proof surface that the active task would otherwise miss.

Do not keep repo-local skills for generic lifestyle, app-template, local git ops, stale command stubs, or broad CE ceremony when task, major-task, autogoal, autoreview, or a Plate-specific skill owns the workflow better.

If a generated skill is gone but skills-lock.json still references it, remove it through npx skills remove <skill> -y first. If the CLI removes the agent files but leaves stale lock entries, record that evidence before cleaning the lock.

autogoal: measurable or auditable non-trivial work. Use the dominant-risk primary template and touched-surface packs: docs-heavy work gets --template docs, normal work gets --template task, and supporting docs, browser, agent-native, or package/API surfaces add matching --with <pack> rows. Review expectations stay in the primary template. Do not use root planning files, hooks, .planning/**, or docs/goals/**.
major-task: heavyweight architecture, framework, migration, benchmark, or proposal work.
testing: tasks primarily about tests, coverage, regression gaps, or suite phases.
tdd: bugs and feature work where behavior-level automated coverage is sane.
learnings-researcher: non-trivial repeated domains with documented solutions.
debug: fuzzy failures after the first repro pass or failing test.
video-transcripts: tracker evidence contains a video or screen recording.
If requirements remain ambiguous after source and local context, ask the smallest clarifying question or switch to a planning goal when the user wants planning.
framework-docs-researcher: unfamiliar, version-sensitive, or unstable third-party APIs after local clones/docs are checked.
browser-use: real browser/UI surface needs verification.
agent-browser-issue: browser automation is blocked by a reusable tool-side issue.
changeset: published package work under packages/ needs release notes.
plate-ui: authoring or refactoring Plate registry UI/components, static/live renderers, kits, registry wiring, or ownership/extraction decisions under apps/www/src/registry/**.
docs-creator: non-trivial docs/content work, new or rewritten pages, plugin/API/spec/serialization docs, route moves, example changes, or docs with source-backed ownership/API claims. Use --template docs when docs dominate; use --with docs when docs are a supporting touched surface under a normal or major task.
git-commit-push-pr: verified code should ship as a PR.
Review skills: load only for risky, large, user-facing, or architecture-sensitive changes.
agent-native-reviewer: changes touch .agents/**, .claude/**, AI/tooling surfaces, commands, or user actions an agent should perform.

Review And Risk Gates

Keep this lighter than Slate Plan. A normal task should not grow a scorecard, issue ledger, or pass calendar, but it still needs real closeout pressure when the patch is risky.

Autoreview is a hard closeout gate for non-trivial implementation changes. Load .agents/skills/autoreview/SKILL.md, pick the target from the actual diff state, and keep going until there are no accepted/actionable findings. Use dirty local --mode local, branch/PR --mode branch --base <base>, and committed-slice --mode commit --commit <ref>.
agent-native-reviewer is required when the task changes .agents/**, .claude/**, .codex/**, skills, hooks, commands, prompts, or user-action tooling. Treat accepted findings like normal review findings: verify against source, fix the real issue, and rerun the relevant proof.
Source authority is workspace-local. A check run in the planning repo cannot prove behavior owned by a sibling repo, package, app, browser route, or tracker system. Record the cwd/tool that owns each proof.
For public API, runtime, package-boundary, browser behavior, agent-action, or command-contract changes, add a compact high-risk note before closeout: realistic failure mode, proof plan, and why the chosen boundary is still the right one.
Trivial docs, wording, and no-local-patch tasks may mark these gates N/A with a reason. Do not run review theater for a typo.

Execution Path

Bug

Reproduce first when possible.
Add a behavior-level regression test when sane.
Fix the real ownership boundary, not every caller around it.
If the best fix requires an API change, make it unless task constraints rule it out.
Re-run targeted checks and browser flow only when the bug lives there.

Feature

Reduce the task to the smallest slice that satisfies acceptance criteria.
Add behavior coverage when sane.
Prefer the cleanest long-term design that fits the slice.
Verify the user-facing outcome.

Testing Or Coverage Work

Use the testing policy before choosing files or commands.
Pick the smallest honest hotspot or ordered slice.
Add or deepen focused tests instead of broad smoke coverage.
Verify with targeted commands first.

Program Or Batch Work

Respect explicit order.
Define done for the current slice before implementation.
Complete one slice cleanly unless the user asks for a broader sweep.

Refactor Or Chore

Preserve behavior.
Do not do fake TDD theater.
Improve bad APIs or abstractions when that is the real fix.
Run the narrowest regression checks plus relevant build, typecheck, or lint.

Docs Or Content

Skip engineering ceremony.
For non-trivial docs, load docs-creator and use the docs goal template.
If docs are only a supporting touched surface on another task, add the docs pack instead of switching the primary template.
For tiny copy edits, skip the docs goal and keep verification proportional.
Verify links, examples, formatting, source-backed claims, and rendered output as appropriate.

Review Or Investigation

Read relevant diff, files, and surrounding context first.
For reviews, report findings first, ordered by severity.
For investigations, identify failure mode, probable cause, and next action before changing code.
Only implement changes if the user asked for them.

Verification

Keep verification mandatory and proportional.

Run targeted tests for changed behavior.
Run package/app build and typecheck when relevant.
Run lint when code changed and repo policy expects it.
Run browser verification only for browser or UI tasks.
Run broader repo-wide gates only when repo instructions or change scope justify them.
Run verification in the workspace, package, app, route, or external system that owns the changed behavior; record the cwd when that is not obvious.
Close the high-risk note before final handoff when the task changes public API, runtime, package boundaries, browser behavior, agent actions, or command contracts.
For non-trivial implementation changes, run the autoreview gate selected from the actual diff state and close all accepted/actionable findings before final.
For agent/tooling changes, run the agent-native review gate or record why it does not apply.
If pnpm test, bun test, or pnpm check fails with local-corruption signals unrelated to the diff, run pnpm run reinstall once and rerun the exact failing command before declaring the task blocked.
If work changes published packages, satisfy the changeset gate.
If work is registry-only under apps/www/src/registry/, satisfy the registry changelog gate instead of a package changeset.
If verified work changed code, create or update the PR before tracker sync-back unless the user said not to.
If the task came from a tracker item and reached a meaningful outcome, sync back unless the user said not to.

Final Handoff

Be extremely concise.
Report PR, issue/tracker, confidence, tests, browser proof, outcome, caveats, design choice, and verification only when applicable.
For non-trivial task goals, close every relevant task-template gate before the final response.
If a PR exists, keep the PR description synced to the final handoff.
For tracker comments, write for QA or the issue owner, not for internal implementation history.

Task-Style PR Body

When a task run creates or updates a PR, the PR description must mirror the task final handoff. Do not use a generic Summary / Verification PR body, an adaptive prose body from git-commit-push-pr, or a generated badge footer unless the caller or repo template explicitly asks for it.

Use the accepted task PR format from kitcn PR #270. The shape is not optional:

Preserve any existing  block. If a changeset is part of the diff and repo policy expects auto release, include that block.
Use an emoji-prefixed issue/tracker/fix line, for example 🐛 Fixes #123 or 🐛 Fixes ➖ N/A. Never include a line that links to the current PR itself; the current PR URL belongs in the final response, not in its own description.
Use an emoji confidence line, for example 🟢 95-100% confidence.
Use this exact table header: | Phase | 🧪 Tests | 🌐 Browser |
Use Reproduced and Verified rows. Mark passing proof with 🟢, repro or failing proof with 🔴, and non-applicable browser/test cells with ➖ N/A.
Use bold emoji section headings exactly in this family: **✅ Outcome**, **⚠️ Caveat**, **🏗️ Design**, and **🧪 Verified**.

The body should tell QA/reviewers what was fixed, how it was reproduced, how it was verified, and why the chosen ownership boundary is right. It must not use plain Fix:, plain Confidence:, ## Outcome, ## Verified, or a generic Summary / Verification shape for task-run PRs. After editing, verify it with gh pr view --json body before final handoff.

Success Criteria

Source-of-truth context was read first.
Relevant repo instructions and patterns were read before editing.
Tracker items were fetched and summarized correctly when provided.
Video evidence used video-transcripts before implementation when required.
Bare GitHub issues like #555 were resolved against the current repo.
The chosen fix addressed the highest-leverage ownership boundary available.
Non-trivial measurable work loaded autogoal and used the right goal template.
Non-trivial docs work loaded docs-creator and used --template docs.
Supporting docs, browser, agent-native, package/API, or extra-review surfaces used the matching --with <pack> rows when they were not the dominant risk.
Testing work loaded the testing policy before implementation.
Only necessary skills were loaded.
Batch work did not sprawl without explicit instruction.
Verification matched change scope.
PR descriptions created by task runs used the kitcn PR #270 emoji task-style body and were verified with gh pr view --json body.
Final handoff matched the task type and any task-template gate evidence.