finish - SKILL.md Agent Skill

name: finish description: End-of-task pipeline — simplify review, double review, tests, single analyze+test gate, changelog/docs. Use when you've finished implementing a task and want to harden it before committing. Does NOT commit or push; the user must ask explicitly.

Finish Pipeline

Run this once at the end of a task. The pipeline merges the old simplify / double-review / full-coverage-tests / changelog-docs skills into a single flow with a single analyze+test gate at the end.

Ground rules

Collect the diff once at Phase 0; reuse it across phases.
Do NOT run flutter analyze or flutter test between phases — only Phase 5 runs them.
Do NOT commit or push unless the user explicitly asks.
Do NOT create new .md files unless the user explicitly asks.
Fix findings inline within the phase that surfaced them; don't park lists of "TODO later".

Comment style (enforced project-wide)

English only. No Russian / Cyrillic in ///, //, or /* */ blocks. UI strings stay in .arb; this rule is about code comments.
Only WHY, never WHAT. A comment exists to explain a hidden constraint, a subtle invariant, a workaround, a non-obvious algorithm choice, or a refactor note. Names and types already cover what.
No comment > bad comment. If you can't say something the code can't, delete the comment.
One short line. Avoid multi-paragraph dartdoc. If a public API truly needs prose, keep it ≤2 lines.
No section dividers, no banners. // ===== Foo ===== and // --- bar --- are noise — drop them.
No TODOs without a tracker reference. A bare // TODO: fix this rots; tie it to an issue or remove it.
Translate, don't preserve. When touching a file with Russian comments, translate the salvageable ones to English and delete the rest as part of the same diff.

Phases

Phase 0 — Snapshot

git status --short
git diff HEAD

Enumerate changed files once. If the diff is empty, stop with "nothing to finalise".

Phase 1 — Simplify

Review new/changed code from three independent angles. This phase is self-contained — do not delegate to a built-in /simplify skill, which may not exist in every Claude Code version. Run the review yourself using these three lenses:

Reuse
- Search the codebase for existing utilities, DAO methods, helpers, or constants that could replace newly written code. Common locations: lib/shared/, sibling files of the changed file, existing DAOs.
- Flag new functions that duplicate existing functionality.
- Flag inline logic that could use an existing helper (string manipulation, path handling, type guards, platform checks).
Quality
- Redundant state, cached values that could be derived.
- Parameter sprawl (adding N-th param instead of restructuring).
- Copy-paste with slight variation → unify with shared abstraction.
- Leaky abstractions, stringly-typed code where enums/consts exist.
- Unnecessary wrapper widgets/Boxes that add no layout value.
- Nested conditionals 3+ levels (flatten with early returns, lookup tables, if/else-if cascade).
- Unnecessary comments explaining WHAT (delete; keep only WHY for hidden constraints, subtle invariants, workarounds).
Efficiency
- Redundant computations, duplicate network/DB calls, N+1 patterns.
- Missed concurrency: independent awaits in sequence → Future.wait.
- Hot-path bloat (startup, per-render, per-request).
- No-op store updates in polling loops / event handlers (add change-detection guard).
- TOCTOU anti-pattern (pre-checking existence then operating) → operate directly and handle the error.
- Unbounded data structures, missing cleanup, event listener leaks.
- Reading whole files when a slice is enough.

For large diffs (≥5 changed files or ≥300 lines), dispatch the three angles as parallel Agent sub-tasks (subagent_type: Explore) in a single message — each agent gets the full diff and reports its findings in ≤300 words. Then aggregate and fix. On small diffs, review all three angles yourself inline.

Fix actionable issues. Skip false positives — don't argue, just move on. Out-of-scope issues (pre-existing cruft in untouched files) get noted in the final report, not fixed here.

Phase 2 — Double review

R1 — correctness

Logic, edge cases, strict typing (no dynamic, no var in public API, final / const where possible, nullable handled via ?. / ?? / checks).
Error handling (no silent failures).
SharedPreferences per-profile rule: user-specific keys MUST be suffixed with '${profileId}' read via a getter (not cached in a field — profile can change). Global keys (API creds, theme, language) are the exception.
Cross-platform: no hardcoded Windows paths, package:path for joins, Platform.isX branches fall through for all supported OS (Windows + Linux + Android), Windows-only plugins (webview_windows) gated by kVgMapsEnabled.
New enum value propagation (only if the change adds a MediaType / CanvasItemType value): run flutter analyze to surface exhaustive-switch errors across canvas_view.dart, all_items_screen.dart, collection_screen.dart, export_service.dart, import_service.dart, plus filter chips, CollectionStats counts, collectedXxxIdsProvider, and all localisation keys (unknown*, allItems*, collectionFilter*, mediaType*, searchSource*).
DB migrations (any schema change): create*Table in schema.dart is shared between fresh installs (createAll) and the historical migration that first created the table — so any column or index baked into a create*Table is also re-applied when an old DB jumps many versions in one upgrade (the create runs with today's schema, then the historical ALTER / CREATE INDEX runs on top). A raw statement then throws duplicate column name / index already exists and the whole upgrade rolls back, leaving the app frozen on the splash. Rules:
- Adding a column: update the relevant create*Table in schema.dart (fresh installs need it) and add a new migration that calls Migration.addColumnIfAbsent(db, table, column, columnDef) — never a bare ALTER TABLE ... ADD COLUMN.
- Creating an index inside a migration: always CREATE INDEX IF NOT EXISTS (and CREATE UNIQUE INDEX IF NOT EXISTS).
- Never retro-edit or "freeze" an existing migration to dodge a collision, and never split create*Table into per-version copies. The idempotent ALTER / IF NOT EXISTS index is the sanctioned pattern — it's always valid to apply boldly because it's a no-op when the target already exists.
- Bump version in _initDatabase() and register the migration in MigrationRegistry.all.

R2 — quality and performance

Readability (function size, single responsibility, self-documenting names).
Duplication (shared logic extracted).
Memory / leaks (subscriptions cancelled, controllers disposed).
Flutter specifics (const widgets, ListView.builder for long lists, no object creation in build()).

R3 — localisation

Every UI string uses S.of(context).key or final S l = S.of(context);.
ARB: every key exists in both lib/l10n/app_en.arb and lib/l10n/app_ru.arb; placeholder names match; Russian plurals use ICU =0 / =1 / few / other.
Enum labels via localizedLabel(S l) extensions, not raw .displayLabel.
Status labels adapt to media type (Playing for games, Watching for movies/TV).
Allowed English: debug screens, debugPrint, model field names, enum .name, test assertions.

Phase 3 — Tests for new code

Goal: useful tests, not coverage theatre. 100% line/branch coverage is not the target. Aim for tests that break only when real behaviour breaks — and that would catch a future regression you'd actually care about.

Every test must pull its weight in one of three buckets:

1. UI doesn't silently break

Widget renders without exceptions (expect(tester.takeException(), isNull)).
Critical flows work: tap handlers fire the right callback, navigation happens, conditional widgets show/hide on state change, lists render the right item count.
Do NOT test colours, text labels, icons, font sizes, padding, border radius, widget types used purely for styling (e.g. find.byType(Container)). Design changes must not break tests. Localised string values — same rule: assert that some text appears, not that it equals a specific string.
Data-driven text that flows from model → UI is fair game (expect(find.text(collection.name), findsOneWidget) is OK; expect(find.text('Collections'), findsOneWidget) for a static title is not).

2. Logic is verified reliably

For every public method / function: happy path + each meaningful branch (if/else, switch cases, early returns, error paths).
Edge cases the change actually cares about: empty input, null, boundaries, concurrent-state races where relevant. Skip defensive null-guards that can't trigger in practice.
Model serialisation round-trips (fromJson/toJson, fromDb/toDb) when the change touches models.
copyWith semantics when a new field is added.

3. Method calls at the boundary are verified

When the change orchestrates multiple collaborators (DAO, repository, API client, provider), use verify(() => mock.method(args)).called(N) to pin down that the right method was called with the right args, the right number of times.
Use verifyNever to assert negative-space guarantees (e.g. "no tag remap when sourceTagId is null").
Use captureAny() to inspect complex payloads (e.g. "the cloned row has tag_id: null").

Infrastructure rules (non-negotiable):

import '../../helpers/test_helpers.dart' — reuse mocks from test/helpers/mocks.dart, builders from builders.dart, fallbacks via registerAllFallbacks() in setUpAll.
Mock/builder will be used in ≥2 files → add it to helpers. Unique to one test → declare locally.
Widget tests use tester.pumpApp(), not a hand-rolled ProviderScope + MaterialApp.
Naming: should [expected result] when [condition].

Self-check before finishing a test: "If someone changed the design tomorrow (colours, labels, layout) — would this test fail?" If yes, and the change wasn't a logic change, the test is overfitted. Remove or relax it.

Phase 4 — Changelog + docs

CHANGELOG.md — [Unreleased] section, Keep a Changelog version headers (## [Unreleased] / ## [X.Y.Z] - date) with Added / Changed / Fixed / Removed sub-sections; inside each sub-section, entries follow GNU Change Log style.

Entry structure — three parts separated by blank lines:

Topic line — one bolded sentence summarising the change (past-tense or imperative, like a commit subject).
Body (optional, 1–3 sentences) — what the user sees and why. Plain prose, no file paths. Skip if the topic line is self-explanatory.
File list — bulleted index of affected files in the form * path/to/file.dart (ClassName.methodName, OtherSymbol): what changed. Use full paths from the repo root and full symbol names (never abbreviate, never group with {foo,bar} syntax — every symbol must be greppable on its own). Several files with identical descriptions can be combined on one line separated by commas. A file with no specific symbol worth naming can be listed as * path/to/file.dart: what changed.

Rules:

English entries. No Russian abbreviations or Cyrillic shorthand in English prose (write "right-click", not "ПКМ"; "left-click", not "ЛКМ"). Russian strings quoted as UI labels ("«Желаемое»") are fine — that's data, not prose.
Separate unrelated topics with a blank line (already enforced by Markdown list spacing).
Unreleased consolidation: when enhancing or fixing something already in [Unreleased] that was never released, update the existing entry in place — don't add a separate Fixed / Changed bullet. Users should see the final state, not the development history. This applies to the topic line, body, and the file list alike.
If a single topic spans many unrelated files (≈30+), it probably bundles several changes — split into multiple topic entries rather than letting one file list balloon.

Example:

- **Expand AniList search filters for anime and manga**

  Anime tab grows from 2 filters to 4; manga from 2 to 4. Multi-select
  genre uses OR match. Year filter uses `startDate` bounds so it works
  for older and cancelled titles where `seasonYear` is null.

  * lib/core/api/anilist_api.dart (AniListApi.browseAnime, AniListApi.browseManga):
    Change `$genre: String` → `$genres: [String]`; add `$format`, `$status`,
    `$startDateGreater`, `$startDateLesser` GraphQL vars.
  * lib/features/search/filters/anilist_anime_format_filter.dart
    (AniListAnimeFormatFilter), anilist_manga_status_filter.dart
    (AniListMangaStatusFilter): New.
  * lib/features/search/filters/manga_format_filter.dart (MangaFormatFilter.options):
    Limit to MANGA, NOVEL, ONE_SHOT — MANHWA / MANHUA / LIGHT_NOVEL were
    rejected by AniList's `MediaFormat` enum.

docs/ — update only if the change actually affects them:

File	When to touch
`ARCHITECTURE.md`	New layer, major module, or shift in patterns. Keep it a high-level map — do NOT add per-file tables or SQL schema dumps
`CONTRIBUTING.md`	Changes to development process
`CODESTYLE.md`	New lint rules, typing conventions
`COMMITS.md`	Changes to commit conventions
`RCOLL_FORMAT.md`	Changes to `.xcoll` / `.xcollx` export format
`GAMEPAD.md`	New focusable widgets, navigation rules
`SNACKBAR.md`	Changes to `context.showSnack()` API or types

Language per file: most are Russian. Keep each in its current language. Preserve formatting — make targeted edits, don't rewrite.

Phase 5 — Gate (single run)

powershell.exe -Command "cd '$(wslpath -w "$PWD")'; flutter analyze --fatal-infos --fatal-warnings"
powershell.exe -Command "cd '$(wslpath -w "$PWD")'; flutter test"

Follow the failure-recovery rules below. When green, STOP — report what changed and wait for an explicit commit/push instruction.

Failure recovery

Failure	Response
Analyzer fails	Fix inline. Re-run analyzer only. Do not re-do earlier phases.
Test fails — a test I just wrote	Fix the test (wrong mock stub, missing fallback, wrong assertion). Re-run tests only.
Test fails — existing test	Default: the test is right, the production code is wrong. Do not edit the test yet. First, re-read the test and the code paths it covers. Ask: "Was this specific behaviour something I deliberately changed as part of the task?" Answer this honestly before touching anything. → If NO (surprise failure, behaviour change you didn't plan): back to Phase 1 — the code is wrong, fix the code, then Phase 3 for the affected area, then re-gate. → If YES (the old assertion contradicts the intended new behaviour, and the new behaviour is in the spec/user request): update the test, rerun tests. Document the behaviour change in the final report so the user sees what shifted. If unsure, default to NO.
Review (Phase 1-2) needs a code change	Fix inline. If the fix touches production code (not just comments/docstrings), add/update tests in Phase 3 before re-gating.
R3 reveals missing ARB keys	Add keys to both `app_en.arb` and `app_ru.arb`, run `powershell.exe -Command "cd '$(wslpath -w "$PWD")'; flutter gen-l10n"`, re-run analyzer.
Flaky test	Retry the affected test file once via `flutter test path/to/test.dart`. If it still fails, treat it as real.

Anti-loop rule — if the same error has been attempted twice with different fixes and still fails, STOP and report to the user. Do not keep hacking.

Scope creep — unrelated issues discovered during review (pre-existing bugs, cruft in untouched files) are NOT fixed here. Note them in the final report; user decides.

Report format

One line per phase as it completes, e.g.:

Phase 1 simplify: 2 fixes (removed redundant setItemTag write, inlined unused helper).
Phase 2 R1 critical fix: SQLite LOWER() doesn't handle Cyrillic → switched to Dart toLowerCase. R2/R3 clean.
Phase 3 tests: 20 new tests across 3 files, all green.
Phase 4 changelog: 1 Changed entry; no docs/ touched.
Phase 5 gate: analyze clean, 4776 tests passed.
Risk: Low — re-downloadable cache only, no permanent data loss; affects the Settings cache button.
Status: ready for commit. Awaiting explicit /commit.

Phase 6 — Risk assessment & impact (always, after the gate)

A green gate proves the code compiles and the tests pass — it does not prove the change is safe to put in front of a real user. Close every run with a short, honest risk assessment so the user can decide whether to ship. Never rubber-stamp; if you can't find a risk, say why the change is inherently safe (pure addition, read-only, behind a flag), don't just assert "looks fine".

Cover four points, one or two lines each:

Blast radius — what it affects. The features, flows, screens, files, and stored data this change touches, and who is hit: all users, one platform (Windows / Android), or only users with a precondition (feature X enabled, a custom data folder, a specific profile, a populated cache/DB). Name the concrete surface, not "the app".
End-user safety. Can it lose or corrupt user data, delete files, break a DB migration, or wedge the app on the splash screen? Is the effect reversible (undo, re-download, re-sync from source) or permanent? Anything irreversible is called out explicitly. If the change deletes or overwrites anything on disk or in the DB, this point is mandatory, and you must distinguish destructive-but-recoverable (e.g. re-downloadable cache) from destructive-and-permanent (e.g. the only copy of a user upload).
Worst-case failure. If the change is subtly wrong despite the tests, what does the user actually experience — cosmetic glitch → broken flow → crash → silent data loss — and how likely is that given what the tests in Phase 3 actually pin down. Be specific about which failure modes the tests do not cover.
Bottom line. One rating — Low / Medium / High risk — with a one-sentence justification, plus any "watch this in the wild" notes: edge cases left untested, platform-specific behaviour, interactions with the data folder / profile / cache / migration chain.

Keep it factual and scoped to this diff. The bottom-line rating goes on the Risk: line of the report (see above); the four points expand it underneath when the change is anything more than trivial.