chess-analyst - SKILL.md Agent Skill

name: chess-analyst description: > Produce a deep, championship-broadcast-style written analysis of a chess game from a PGN file. Use when the user opens a directory with a .pgn (or Lichess/ chess.com export) and asks to analyze, review, or annotate a game. Covers move-by-move evaluation, opening, critical moments, accuracy, time management, and the emotional/momentum arc of the game. Triggers on: "analyze this game", "разбери партию", "review my chess game", "PGN analysis", a .pgn in the cwd.

Chess Analyst

Turns an engine-annotated PGN into the kind of write-up a strong coach or a top-event commentator would give: a narrative, not a wall of numbers.

Core principle

The engine has already done the math. A PGN exported from Lichess (or chess.com with analysis) embeds everything we need in the move comments:

[%eval 0.18] — evaluation after the move (White's perspective; #-1 = mate)
[%clk 0:14:13] — clock after the move
inline judgements — Nxh7?? (0.27 → -3.40) Blunder. c3 was best.
the refutation line in (...) parentheses

So do not reach for Stockfish when the PGN is already annotated. Your job is the layer the engine can't do: a coherent human story built on those numbers, plus time and momentum reading.

Workflow

Intake — ask before analyzing. Look in the working directory for *.pgn (also *.txt containing PGN — Lichess sometimes exports that way), then settle two things with the user (use AskUserQuestion):
- Which file? If more than one PGN is present, list them and ask which to analyze (offer "all of them" as an option). With exactly one, skip the question and just use it.
- Whose game is it? Ask for the player's nickname / name (or "White" / "Black"). Match it against the White/Black headers to fix the point of view: the whole report is written for that player — their mistakes, their missed wins, their lessons — with the opponent treated as the opposition. If the user doesn't care, fall back to a neutral two-sided write-up.
Run the analyzer (stdlib only, no install needed). Pass --focus with the player from intake so the coach-level metrics are computed from their side:
```
python3 scripts/analyze.py <game.pgn> --focus "<player|white|black>" --json /tmp/chess.json
```
It returns: headers, per-move eval/clock/time-spent/accuracy, momentum series, judgement counts, ranked critical moments, per-side time summary, and — when --focus is set — an opportunities block from the focus player's point of view:
- missed_wins — positions you were clearly winning and let slip with your own move,
- declined_gifts — opponent blunders your reply failed to punish,
- best_moves — your strongest moves (to credit, not just criticize).
- If engine_annotated is false, the PGN has no evals. Tell the user, and offer the Stockfish fallback (see below) rather than inventing evaluations.
(Optional) Render key diagrams. Only if the user wants board images and python-chess is available:
```
pip install chess        # one-time, optional
python3 scripts/render_boards.py <game.pgn> --plies <list> --out diagrams/
```
Pick plies from the top critical moments. Skip silently if the lib is absent — diagrams are a nice-to-have, not the point.
Write the report following templates/report.md. Save it next to the PGN as <game>_analysis.md. Put a  anchor before each critical-moment header (its ply from critical_moments) — harmless in the .md, and it lets the next step wire that heading to the board. Always glyph the move in each critical-moment heading (?? blunder, ? mistake, ?! inaccuracy, !? practical, ! good, !! brilliant) — the engine only flags errors, so add the glyph yourself for strong/practical moves. That glyph drives a colored at-a-glance verdict chip in the HTML, so the reader sees what kind of move it was without studying the position.
Build the interactive HTML artifact — do this by default, don't ask. The visual counterpart to the report and the main deliverable people actually open: one self-contained .html (no CDN, opens offline) with a clickable eval graph, an embedded board at every critical moment — each steps through what was played (red arrow) and toggles to the engine's best_line (green arrow), "here's how it should have gone" — each board carries file/rank coordinates and each moment a color-coded verdict chip (blunder / good / practical, from the heading glyph) — and a glued side quiz that auto-picks the 2-3 sharpest mistakes and asks "what would you play?", then reveals the move actually made plus a mini-explanation. Always produce it; only skip if the user says no.
```
python3 scripts/build_artifact.py <game.pgn> \
    --json /tmp/chess.json --md <game>_analysis.md --out <game>_analysis.html
```
Add --flip when the focus player has the black pieces — it orients every board with that player's side at the bottom, the way they actually saw the game. (No flag = White at the bottom, the default.)

Pass --lang to match the report's language so the HTML chrome (buttons, verdict chips, quiz) speaks the same language as your prose — --lang en for an English report, ru is the default. Built-in: ru, en (see the LANGS table in build_artifact.py to add more). Write the report's section headings in that same language too.

Needs python-chess — it turns SAN into the board positions the stdlib analyzer never tracks. If it's genuinely not installed, try pip install chess first; if that fails, the script exits cleanly and the Markdown report stays the deliverable. Because each critical moment renders its own inline board, put a  anchor before every critical-moment ### header (step 4) — that's what places a board there.

What makes the write-up "adult-level"

Lean on the data, never hand-wave. Concretely:

Narrative arc, not a move list. Tell the story of the game in phases: opening plan, where it tipped, the decisive sequence, the finish.
Critical moments get real treatment — what was threatened, what the engine's best_move/best_line was, why the played move fails, in words.
Read the intention behind every mistake. A bad move is almost never random — the player wanted something. Name the plan and how it backfired: "you played Nxh7 going for the classic bishop sac, but the h-file never opened and you were just down a piece." This "wanted X, got Y" framing is the single most coach-like thing in the report — do it for each real mistake, inferring the idea from the position and the line, not from numbers alone.
Objective best ≠ what a human must find. When the engine's top move is inhuman (a quiet computer-only resource) and the played move was reasonable, say so — separate "this lost" from "this was a forgivable practical choice." Don't hold the player to a standard no human meets.
Missed wins and declined gifts (from opportunities) deserve their own beat: positions you were winning and let go (missed_wins), and opponent blunders you didn't punish (declined_gifts) — these sting more than a plain blunder count and are exactly what a coach circles.
Credit what worked. Use best_moves to praise the player's genuinely strong moves. A report that's all errors demoralizes; a coach builds on strengths too.
Time management: cross-reference time_spent_s with blunders. A long think that still produced a mistake, or a fast move in a critical position, is a real finding — cite the seconds. (In the sample game, both sides' single longest thinks were both blunders — that is the kind of observation to surface.)
Emotional / momentum arc: read the momentum series. Repeated swings, a winning position thrown away, mutual blunders trading the advantage — describe the psychological shape of the game, but ALWAYS anchored to eval+clock numbers, never as free-floating "he got nervous."
Accuracy & error tally per side, with the opening named from the ECO header.
Error breakdown & how to fix it — for the player in focus, don't just count mistakes: group them by type (tactical oversight, opening prep, time trouble, converting a winning position, prophylaxis…) using the judgements and the best_line the engine gives. For each recurring pattern, name a concrete training takeaway the player can act on next time — anchored to specific moves from this game, not generic advice.

Tone

Authoritative, specific, readable — the register of a strong-event broadcast or a serious post-game coaching session. No filler. Praise good moves where the eval supports it; be honest about mistakes without being snide.

Report language: write in the language the user is speaking to you in; default to Russian if it's ambiguous. Don't ask about it at intake — infer it. Only switch if the user explicitly requests another language.

Stockfish fallback (only when PGN has no [%eval])

If engine_annotated is false and the user still wants a full analysis, you may shell out to a local stockfish binary to evaluate each position. This is the exception, not the default path. Confirm with the user before installing or running anything heavy.