pr-atom-reviewer - SKILL.md Agent Skill

name: pr-atom-reviewer
description: Review a local git branch as a pull request with a bias toward minimum disruption and minimum reviewer scope. Splits sprawling PRs into independently-mergeable atoms of work (each one a single end-to-end behaviour describable in 2-3 sentences, each shippable to the trunk on its own without any other atom in the plan being merged first) and demands a screenshot/video/recording proving the verifiable acceptance criterion. If the AllSource Prime MCP server is available (`prime_*` tools), the skill recalls prior reviews of the same repo to calibrate its judgement and records the current review for the next one. Use this skill whenever the user asks to review a PR, review a branch, prep a PR for review, check if a branch is ready to merge, or mentions "this PR is too big", "split this PR", "atomic commits", "scope creep in PR", "PR review checklist", or anything about getting a branch in shape before peers look at it. Trigger even if the user just says "look at my branch" in a code review context.

PR Atom Reviewer

Review a local git branch the way a senior engineer who hates wasted reviewer time would. The goal: make sure what goes up for peer review is the smallest reviewable unit that demonstrably ships one end-to-end behaviour, with visual proof attached.

This skill assumes the PR lives on a local git branch the user has checked out. If they paste a URL or diff instead, ask them to check the branch out locally first — git commands are how this skill sees the world.

This skill is designed for Claude Code in the terminal: it shells out to git, optionally to gh (GitHub CLI) for PR description and attachments, and optionally to the AllSource Prime MCP server for memory across reviews (see the "Memory" steps). If Prime tools are unavailable, the skill still works — it just won't learn over time. If gh is unavailable, the skill asks the user to paste the PR description if needed.

Core principles (why this skill exists)

1. Reviewer attention is the scarce resource, not author time. A 2000-line PR doesn't cost the author 10× a 200-line PR, but it costs reviewers 10× to do honestly, so they don't, and bugs ship. Splitting is an act of respect.
2. An atom is one end-to-end behaviour, not one file or one commit. "Add button" + "wire button to handler" + "handler calls API" is one atom (the button works end-to-end), not three. Conversely, "refactor auth module AND add password reset" is two atoms even if it's one commit.
3. Atoms target trunk, never another feature branch. This is non-negotiable. Every atom PR in a split plan must merge directly into the project's trunk (main / master / develop), never into another feature branch and never into a sibling atom. Stacked / chained PRs are explicitly not an atom split: reviewers still have to hold the whole shape in their head, rejections cascade, and the supposed "independence" is fictional. If you find yourself writing "Targets: feat/foo" in the split plan, stop: that is the failure mode this principle exists to prevent. The branch under review may itself be stacked (see edge cases below), but the output of this skill, the recommended split, is always a set of trunk-targeted PRs.
4. When atoms share foundation code, duplicate it across PRs and let rebase clean up. It's common for two atoms to look dependent because they share a helper, a type, a schema, or a module-level wire-up. The right answer is not to stack them (that violates principle 3) and not to put the foundation in atom 1 and have atom 2 wait. The right answer is for each PR to carry its own copy of the shared foundation it needs to compile against trunk on its own. Git's rebase semantics handle the rest: when the first PR merges, the duplicate code lands on trunk; when the next PR rebases, the duplicate resolves automatically as long as the copies are still identical (identical content, no conflict). If one copy has drifted, expect a conflict to resolve by hand. Plan with this rebase outcome in mind from the start: the duplication is honest, it makes each PR self-contained for review, and rebase tidies up at merge time. Reviewer load goes down, not up: each peer sees one focused atom, not a stack.
5. If you can't show it working, it isn't done. A screenshot, screen recording, or terminal capture proving the behaviour. No proof → not reviewable. This rule has no exceptions for "trivial" changes; trivial changes are trivial to record.
6. Drive-by changes are scope leaks. Renaming a variable in a file you happened to touch, fixing an unrelated lint warning, "while I'm here" tweaks: these belong in their own atom or not at all. They are the single biggest source of review fatigue.

Hold these as the bar. Be direct about violations. Vagueness here costs the user reviewer goodwill.

Process

1. Orient: what branch, against what base

Run, in order:

git rev-parse --abbrev-ref HEAD                    # current branch
git status --short                                  # uncommitted work?
git log --oneline -20                               # recent commits

Ask the user (or infer from common conventions like main / master / develop) what the base branch is. Then:

BASE=<base-branch>
git fetch origin "$BASE" --quiet 2>/dev/null || true
git log --oneline "origin/$BASE..HEAD"              # commits in this PR
git diff --stat "origin/$BASE...HEAD"               # file-level summary
git diff --shortstat "origin/$BASE...HEAD"          # total +/- lines

If origin/$BASE doesn't resolve, fall back to $BASE (local). If the user has uncommitted work, surface it — they probably want that included in the analysis. Do not proceed past this step if the base branch is genuinely unclear; ask.
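The fallback described above can be sketched as a small helper. This is a minimal illustration, not part of the skill itself: `resolve_base` is a hypothetical name, and it assumes the remote is called origin.

```shell
# Print the first ref that resolves to a commit: origin/<base>, then local <base>.
# Returns non-zero if neither exists, which is the cue to ask the user.
resolve_base() {
    base=$1
    for ref in "origin/$base" "$base"; do
        if git rev-parse --verify --quiet "$ref^{commit}" >/dev/null; then
            printf '%s\n' "$ref"
            return 0
        fi
    done
    return 1
}

# Example use (commented out so the definition can be sourced anywhere):
#   BASE_REF=$(resolve_base main) || echo "base branch unclear; ask the user"
#   git diff --shortstat "$BASE_REF...HEAD"
```

Returning a non-zero status instead of guessing keeps the "do not proceed if the base is unclear" rule enforceable by the caller.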

1.5 Recall: what does this repo's review history say?

Only run this step if Prime MCP tools are available (look for tool names starting with prime_ in your available tools). If they aren't, skip to step 2 silently — don't tell the user "memory unavailable" unless they ask.

If Prime is available, do this before analysing the diff so the prior context informs your reading:

Get a repo identifier — use the origin remote URL (git config --get remote.origin.url) or the working-directory basename if there's no origin. Treat this as the repo key for all Prime operations.
Build a short query string from the diff: the branch name, the touched top-level directories, and one or two of the most-changed files. Example: "branch: add-notifications | dirs: web, api, migrations | files: NotificationService.ts, 002_notifications.sql".
Call prime_recall with that query string. Ask for the top 5 results. The tool's exact parameter names depend on the Prime version — if you don't know them, list the tool's input schema first with the standard MCP introspection, then call it. Do NOT guess parameter names; use what the schema says.
Also call prime_neighbors (or prime_search) on the repo node with type pr_review to pull the last few reviews of this same repo even if they don't match the query.
Read what comes back. Look for:
- Recurring scope leaks in this repo (e.g. "every PR here drags in formatting changes from web/components/ui/")
- Atom techniques that have worked here before (e.g. "this team uses flag-off-merged-first for any user-visible change")
- Conventions the diff appears to violate but past reviews have accepted (e.g. "generated files in proto-out/ always look huge; ignore")
- Past mistakes to avoid repeating (e.g. "review #34 called this an independent atom; it broke prod — apply technique 2 more skeptically here")
Carry forward only what's relevant. Don't dump prior-review summaries into the verdict — let them silently shape the analysis. If something from the past is directly load-bearing (e.g. "in this repo we never count lockfile churn"), say so once in the verdict's Disruption summary so the user understands the calibration.

If prime_recall returns nothing (cold start), proceed normally — that's expected for the first few reviews of a repo. Don't fabricate prior context.
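As an illustration of the query-string step above, the sketch below builds the string from `git diff --numstat` output read on stdin. The helper name `build_recall_query` is hypothetical, and "most-changed" is assumed to mean added plus deleted lines; how the string is then passed to prime_recall depends on that tool's input schema and is not shown.

```shell
# Build "branch: <name> | dirs: <top-level dirs> | files: <two most-changed files>"
# from numstat lines of the form: added<TAB>deleted<TAB>path
build_recall_query() {
    branch=$1
    awk -F'\t' -v branch="$branch" '
        {
            added = ($1 == "-") ? 0 : $1      # binary files report "-"
            deleted = ($2 == "-") ? 0 : $2
            churn = added + deleted
            n = split($3, parts, "/")
            # Collect distinct top-level directories in first-seen order.
            if (n > 1 && !(parts[1] in seen)) {
                seen[parts[1]] = 1
                dirs = dirs (dirs == "" ? "" : ", ") parts[1]
            }
            # Track the two files with the highest churn, by basename.
            if (churn > top1) { top2 = top1; file2 = file1; top1 = churn; file1 = parts[n] }
            else if (churn > top2) { top2 = churn; file2 = parts[n] }
        }
        END {
            files = file1 (file2 == "" ? "" : ", " file2)
            printf "branch: %s | dirs: %s | files: %s\n", branch, dirs, files
        }'
}
```

A typical call would be `git diff --numstat "origin/$BASE...HEAD" | build_recall_query "$(git rev-parse --abbrev-ref HEAD)"`.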

2. Measure the disruption

From the diff stats, note:

Total lines changed (added + deleted)
Number of files touched
Number of distinct top-level directories touched (a proxy for surface area — a PR touching src/auth/, src/billing/, and docs/ is suspicious)
Test files vs production files ratio
Generated/lockfile/vendored files — exclude these from the "reviewer attention" count but note their presence

Use these heuristics as smells, not hard limits — context matters, but the user should have to justify crossing them:

| Smell | Threshold | What it usually means |
|---|---|---|
| Large PR | >400 lines of human-authored diff | Needs splitting unless it's a mechanical refactor |
| Many files | >15 non-generated files | Likely multiple concerns |
| Many directories | >3 top-level dirs | Almost certainly multiple atoms |
| Few or no tests | test-to-prod ratio under ~0.3 with new behaviour | Missing test coverage |
| Drive-by hits | unrelated formatting/lint changes mixed in | Scope leak |
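As a rough illustration of the measurements and thresholds above, the sketch below reads `git diff --numstat` output on stdin and prints the counts plus any numeric smells. The name `measure_disruption`, the lockfile and vendored-path patterns, and the test-file naming patterns are assumptions for illustration; the test ratio is computed on changed lines, and drive-by hits still need a human read of the diff.

```shell
# Summarise numstat lines (added<TAB>deleted<TAB>path) against the smell thresholds.
measure_disruption() {
    awk -F'\t' '
        {
            path = $3
            n = split(path, parts, "/")
            name = parts[n]
            # Binary files report "-" for both counts; treat them as zero lines.
            added = ($1 == "-") ? 0 : $1
            deleted = ($2 == "-") ? 0 : $2
        }
        # Assumed exclusion list for lockfiles and vendored code; extend per repo.
        name ~ /^(package-lock\.json|yarn\.lock|Cargo\.lock|go\.sum)$/ ||
        ("/" path) ~ /\/(vendor|node_modules)\// { skipped++; next }
        {
            lines += added + deleted
            files++
            # Files at the repo root are grouped under ".".
            top = (n > 1) ? parts[1] : "."
            if (!(top in seen)) { seen[top] = 1; dirs++ }
            # Assumed test-file naming conventions.
            if (("/" path) ~ /\/tests?\// || name ~ /(_test|\.test|\.spec)\./) {
                test_lines += added + deleted
            } else {
                prod_lines += added + deleted
            }
        }
        END {
            printf "lines=%d files=%d dirs=%d skipped=%d\n", lines, files, dirs, skipped
            if (lines > 400) print "smell: large PR"
            if (files > 15) print "smell: many files"
            if (dirs > 3) print "smell: many directories"
            if (prod_lines > 0 && test_lines / prod_lines < 0.3) print "smell: few or no tests"
        }'
}
```

A typical call would be `git diff --numstat "origin/$BASE...HEAD" | measure_disruption`. Renamed files appear in numstat with `old => new` paths, which this sketch does not normalise.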

3. Identify the atoms

Read the full diff:

git diff "origin/$BASE...HEAD"

For each logically distinct change, write down:

What behaviour does this change deliver end-to-end? State it as a user-visible or system-observable outcome ("User can reset password via email link", not "Add PasswordResetController").
What files are part of this atom?
Is there visual proof that this behaviour works? Check the PR description, commit messages, and any attachments the user mentions.

Right-size test (the 2-3 sentence rule): if you cannot describe what one atom does in 2-3 sentences without using "and then" or "also", it is too big. Split it.
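The rule above can be approximated mechanically. The sketch below is a crude heuristic under stated assumptions: `atom_is_right_sized` is a hypothetical helper, sentences are counted by terminating punctuation (so abbreviations and file names with dots inflate the count), and only the two phrases named in the rule are checked.

```shell
# Succeeds if the description has at most 3 sentences and avoids "and then" / "also".
atom_is_right_sized() {
    desc=$1
    # Rough sentence count: number of terminating punctuation marks.
    sentences=$(( $(printf '%s' "$desc" | tr -cd '.!?' | wc -c) ))
    [ "$sentences" -le 3 ] || return 1
    # Lowercase and turn punctuation into spaces so phrase matching is word-based.
    words=$(printf '%s' "$desc" | tr '[:upper:]' '[:lower:]' | tr -c '[:alnum:]' ' ')
    case " $words " in
        *" and then "*|*" also "*) return 1 ;;
    esac
    return 0
}
```

A failing result is a prompt to look for a seam, not proof that the atom is too big; the judgement still rests on whether the description names one observable outcome.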

Calibration — what right-sized atoms look like:

| Right-sized (one atom) | Too big (multiple atoms) |
|---|---|
| Add a `priority` column to `tasks` table and run the migration | "Build the priority feature" |
| Add a colored priority badge component to task cards | "Add priority everywhere" |
| Add a priority filter dropdown to the task list | "Refactor task list" |
| Wire the existing reset-password service into the login page | "Add authentication" |
| Add a Stripe webhook handler that logs incoming events (no business logic yet) | "Integrate Stripe" |

Notice the right-sized examples are one observable outcome each, and the too-big ones are domains. Don't write atoms at the domain level.

Anti-pattern warning: atoms are not Ralph stories. If you've used the Ralph PRD pattern or similar, those stories are sequenced: story 2 assumes story 1 is merged, story 3 assumes 2, and so on. PR atoms are the opposite. An atom must merge to the trunk on its own, with every other atom in this plan still un-merged. The schema-then-backend-then-UI ordering Ralph uses is fine inside one atom (you can have all three in one PR), but it is wrong between atoms. If the only way story 2 makes sense is "after story 1 ships", they aren't two atoms; they're one atom that happens to span several commits.

Then apply the independence test to every candidate atom:

If I merged only this atom onto the base branch today — none of the other atoms in this PR — would the result build, pass tests, and be safe to ship to users?

If the answer is no, the atom is not independent. You have three options, in order of preference:

1. Reorder and isolate. Often an atom looks coupled but isn't. A pure refactor that the feature depends on can usually be lifted out and shipped on its own: it's a no-op for users, so it's safe in isolation. Identify these "carrier" changes first; they make excellent standalone atoms.
2. Hide behind a flag, dead code, or unused export. New code paths can be added and tested without being wired up to anything user-facing. The atom "adds the new password reset service" can ship independently if nothing calls it yet: the service exists, is tested, but is dead until a later atom wires it in. Each piece is independently mergeable because each is individually a no-op for users until the final wiring atom.
3. Concede they are one atom. If two pieces of work genuinely cannot be made independent (they only make sense together, neither is safe to ship alone, no flag or interface can decouple them) then they are one atom, not two. Don't fragment for the sake of it. Report this honestly rather than inventing a fake split.

Apply techniques 1 and 2 aggressively before falling back to 3. Most PRs that "have to ship together" actually don't, once someone takes 10 minutes to look for the seam.

Never produce a split plan where atom N depends on atom N-1 being merged first. That is a stacked-PR plan, not an atomic split, and it defeats the purpose of the exercise — reviewers still have to hold the whole shape in their head, and one rejection cascades.
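One way to run the independence test mechanically is sketched below: check out the base in a throwaway worktree, copy in only the candidate atom's files from the branch under review, and run the project's build or test command there. `check_atom_alone` is a hypothetical helper, and it assumes the atom can be expressed as a list of whole files; atoms that split a file by hunk, or that delete files, need manual handling.

```shell
# Usage: check_atom_alone <base-ref> <branch-under-review> <check-command> <file>...
# Returns the check command's exit status, or 2 if the worktree could not be set up.
check_atom_alone() {
    base=$1; head=$2; check_cmd=$3
    shift 3
    tmp=$(mktemp -d) || return 2
    # A detached worktree leaves the user's own checkout untouched.
    git worktree add --quiet --detach "$tmp/wt" "$base" || return 2
    (
        cd "$tmp/wt" &&
        # Bring over only this atom's files from the branch under review.
        git checkout --quiet "$head" -- "$@" &&
        sh -c "$check_cmd"
    )
    status=$?
    git worktree remove --force "$tmp/wt"
    rm -rf "$tmp"
    return "$status"
}
```

For example, `check_atom_alone origin/main HEAD 'make test' services/notification_service.py` (paths and command hypothetical) answers "does trunk plus this atom alone still pass?" without merging anything.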

4. Demand e2e proof tied to a verifiable criterion

For each atom, first articulate a verifiable acceptance criterion — a single sentence describing the observable outcome a reviewer can check. This is the thing the proof must show working.

Good criteria (verifiable):

"Clicking the 'Forgot password' link sends an email to the user's address within 10 seconds"
"Tasks with priority 'high' appear at the top of the list with a red badge"
"POST /api/orders with an invalid card returns 402 and does not create an order row"

Bad criteria (vague, untestable):

"Password reset works correctly"
"Priority feature is implemented"
"Improved error handling"

Then demand proof tied to that criterion. For every atom that changes observable behaviour (UI, API response, CLI output, log output, error handling — anything a human or downstream system perceives):

Is there a screenshot, screen recording, or terminal capture in the PR description that shows the verifiable criterion being satisfied?
Does the proof show both the trigger (user action / input / request) and the outcome (the new behaviour)?
For UI: the recording should show the user interaction and the resulting UI state, end-to-end. Static screenshots are weaker than a short recording for anything stateful.
For backend/CLI: a terminal screenshot or asciicast showing the command/request and the new response.
For error paths: proof must cover the error path being exercised, not just the happy path. A new validation rule needs a recording of the invalid input being rejected.

If proof is missing or weak, that's a blocker. Be explicit per atom — not "needs more proof" but "missing: recording of the password reset email arriving and the link working end-to-end, and a recording of the expired-link case showing the error message".

If the user's environment has a browser-verification skill or similar (e.g. dev-browser), suggest using it to capture the recording rather than asking them to do it manually.

Pure internal refactors with no observable behaviour change can substitute a green test suite for visual proof — but flag this and ask the user to confirm "no observable change" rather than assuming. A refactor that accidentally changes log output, error codes, or timing is no longer pure; treat any uncertainty as "needs visual proof".

5. Produce the verdict

Output exactly this structure. Don't add sections. Don't omit sections. If a section is empty, write "None." so the user knows you considered it.

# PR Review: <branch-name>

## Verdict
**<APPROVE | REQUEST CHANGES | NEEDS SPLIT>**

One-sentence rationale.

## Disruption summary
- Lines changed: +X / -Y (Z human-authored, excluding generated/lockfiles)
- Files touched: N (M non-generated)
- Directories touched: <list>
- Test-to-prod ratio: <ratio or "no tests added">
- Smells: <bulleted list, or "None">

## Atoms identified
For each atom:
### Atom <N>: <behaviour in one sentence>
- **Behaviour**: what the user/system can do after this ships that they couldn't before
- **Files**: <list>
- **Independent because**: <one sentence explaining how this atom merges to base on its own — flag, pure refactor, dead code, etc.>
- **E2E proof**: present (link/describe) / missing / weak (<what's missing>)
- **Status**: ready / needs proof / needs split out

## Scope leaks
Unrelated changes that don't belong in any atom (drive-by renames, formatting, lint fixes in untouched-by-this-PR areas, etc.). List file:line where possible. If none, write "None."

## Recommended split plan
If the PR contains more than one atom, lay out each smaller PR. **Each atom in this plan must target trunk (`main` / `master` / `develop`) and merge into it independently — no atom targets a sibling atom, no atom depends on another in the plan being merged first. List them in any sensible order (e.g., simplest first), but the order must not be a dependency order.**

1. **<atom-name>** — <one-line behaviour>
   - Branch suggestion: `<short-branch-name>`
   - Targets: `main` (or the project's trunk equivalent) — never another atom or feature branch in this plan
   - Files: <list>
   - Independent because: <one sentence: "ships behind a feature flag", "pure refactor — no behaviour change", "adds unused module — no callers yet", "carries shared foundation X — duplicate of atom-Y's foundation, resolves on rebase", etc.>
   - Duplicates (if any): <list any files this atom carries that another atom in the plan also carries; explain that rebase resolves the duplication when one PR merges and the next rebases on the updated trunk>

If any candidate atom cannot meet the independence-vs-trunk bar — even using the duplicate-with-rebase pattern (principle 4) — do not list it as a separate atom. Either merge it back into the atom it depends on, or call out in the verdict that this work cannot be split further and explain why.

## Paste-ready review comments
Comments the user can paste directly into the PR review. Use a code fence per comment so they copy cleanly. Reference file:line where applicable.

\`\`\`
<comment 1>
\`\`\`

\`\`\`
<comment 2>
\`\`\`

6. Offer the next move

After delivering the verdict, ask the user what they want to do next. Useful follow-ups to suggest:

"Want me to draft the git commands to split this into the atoms above?" (e.g., git checkout -b <atom-branch> and git cherry-pick / interactive rebase sequences)
"Want me to rewrite the PR description to call out the e2e proof clearly?"
"Want me to check a specific concern more deeply (e.g., the test coverage for atom 2)?"

Don't perform these proactively — splitting commits is destructive enough that the user should opt in.
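If the user does opt in, the drafted commands might look like what the sketch below prints. It only prints; nothing is executed, which keeps the opt-in rule intact. `print_split_commands` is a hypothetical helper that assumes a remote named origin, whole-file atoms, and paths without spaces.

```shell
# Usage: print_split_commands <trunk> <source-branch> <atom-branch> <file>...
# Prints the commands to create a trunk-based branch carrying only the atom's files.
print_split_commands() {
    trunk=$1; source_branch=$2; atom_branch=$3
    shift 3
    # Start from trunk, never from the source branch (principle: atoms target trunk).
    printf 'git checkout --no-track -b %s origin/%s\n' "$atom_branch" "$trunk"
    # Copy the atom's files from the branch under review.
    printf 'git checkout %s --' "$source_branch"
    for f in "$@"; do
        printf ' %s' "$f"
    done
    printf '\n'
    printf 'git commit -m "%s"\n' "$atom_branch"
}
```

The `--no-track` flag stops the new branch from tracking origin/<trunk>, so a later push does not target trunk by accident. Files the atom deletes are not covered by `git checkout <branch> -- <paths>` and would need an explicit `git rm`.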

7. Record: write this review to Prime (Loop 1)

Only run this step if Prime MCP tools are available. If not, skip silently.

After delivering the verdict (regardless of which follow-ups the user picks), persist this review so future reviews of the same repo benefit. Do this after the verdict is on screen — don't make the user wait on it.

What to record:

The repo node, if it doesn't already exist. Call prime_search or prime_neighbors first to see if a node with type repo and name = the repo key exists. Create it via prime_add_node only if absent.
The PR review node — type pr_review. Properties to include:
- repo_key (matching the repo node)
- branch (branch name)
- base (base branch)
- verdict (APPROVE / REQUEST CHANGES / NEEDS SPLIT)
- lines_added, lines_deleted, files_changed, dirs_touched (the disruption summary numbers)
- atom_count (how many atoms you identified)
- scope_leak_count
- reviewed_at (ISO 8601 timestamp; get it via date -u +"%Y-%m-%dT%H:%M:%SZ")
One node per atom — type atom. Properties:
- behaviour (the one-sentence behaviour)
- independence_technique — one of: pure_refactor, flag_off, dead_code_unused_export, storybook_only, additive_no_callers, conceded_single_atom, other (describe in a note field)
- verifiable_criterion (the one-sentence criterion from step 4)
- proof_status — one of: present, missing, weak
- files (list of file paths)
Edges:
- pr_review → repo (belongs_to)
- pr_review → atom for each atom (contains)
- If the PR has scope leaks, one pr_review → scope_leak node per leak (leaked) with the file path and a short description as properties
Embedding: call prime_embed with the full verdict markdown as text, keyed by the PR review node's ID. This is what lets prime_recall find this review next time someone reviews a similar diff.

Use prime_* tools' actual parameter names from their input schemas — don't guess. If a call fails, log the failure to the user briefly and continue; recording is best-effort, not blocking.

Important — never put proprietary code into Prime. Record metadata, file paths, and the verdict text (which is your own analysis). Do NOT put diff contents, source code, or PR descriptions into Prime nodes or embeddings unless the user explicitly says it's OK. The verdict is yours to record; the user's code is theirs.

After recording, tell the user one line: Recorded to Prime (repo: <repo_key>, review_id: <id>). — no more, no less. They'll see this skill is learning over time without it being intrusive.
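To make the pr_review properties listed above concrete, the sketch below assembles them as a JSON object. This is only an illustration of the data shape: `pr_review_props` is a hypothetical helper, the values are assumed to contain no quotes or backslashes (no JSON escaping is done), and how the object is handed to prime_add_node depends on that tool's input schema, which is not assumed here.

```shell
# Args: repo_key branch base verdict lines_added lines_deleted
#       files_changed dirs_touched atom_count scope_leak_count
pr_review_props() {
    reviewed_at=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
    printf '{"repo_key": "%s", "branch": "%s", "base": "%s", "verdict": "%s", ' \
        "$1" "$2" "$3" "$4"
    printf '"lines_added": %d, "lines_deleted": %d, "files_changed": %d, "dirs_touched": %d, ' \
        "$5" "$6" "$7" "$8"
    printf '"atom_count": %d, "scope_leak_count": %d, "reviewed_at": "%s"}\n' \
        "$9" "${10}" "$reviewed_at"
}
```

Note that only metadata appears in the object: no diff contents, source code, or PR description, in line with the rule above about proprietary code.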

Worked example: turning one sprawling PR into independent atoms

Below is the kind of split this skill should produce. Use it as a calibration anchor when you're unsure if you've gone far enough.

Original PR: "Add user notifications" — 1,400 lines, 22 files, touches migrations/, services/, api/, web/components/, web/pages/.

A bad split (sequenced, dependent — do not produce this):

1. Add notifications schema (must merge first)
2. Add notifications service (depends on 1)
3. Add notification bell component (depends on 2)
4. Wire bell into header (depends on 3)

Each "atom" needs the previous merged. That's a stacked PR, not a split.

A good split (independent atoms, each mergeable in any order):

1. add-notifications-schema-dormant: Adds the notifications table migration and the ORM model. No code reads or writes the table yet. Ships independently because the schema is dormant; the rest of the app is unchanged.
2. add-notifications-service-unused: Adds a NotificationService with create() / list() / mark_read() methods, fully unit-tested, but no caller in the app uses it yet. Ships independently because nothing imports it from production code paths; it's library code awaiting a consumer. If its tests need the notifications table, this PR carries its own copy of atom 1's migration and model (principle 4) so that it builds and passes tests against trunk alone; the duplicate resolves on rebase once either PR merges.
3. add-notification-bell-component-storybook-only: Adds the <NotificationBell /> React component and its Storybook entry. The component is not yet placed in any page. Ships independently because it's discoverable only via Storybook.
4. enable-notifications-feature-flag-off: Wires the bell into the header, the service into the API, and the schema into the service, all behind a feature flag defaulted to OFF, carrying copies of whichever pieces from atoms 1-3 are not yet on trunk (principle 4). Ships independently because users see no change.
5. turn-on-notifications-flag: One-line change flipping the default to ON, plus the e2e recording showing the notification flow working end-to-end in production.

Atom 5 is where the user-visible behaviour ships. Atoms 1-4 are dead-code or flag-gated and can merge in any order without breaking anything. A reviewer can approve atom 3 without ever looking at atom 1.

Note how the proof requirement shifts: atoms 1-4 each prove their own limited behaviour (migration runs, service unit tests pass, Storybook renders, flag-off path unchanged). Atom 5 carries the full end-to-end recording.

Pre-output checklist

Before producing the verdict, verify each item. If any fails, fix the analysis — don't ship a verdict with these unresolved.

I ran git diff "origin/$BASE...HEAD" (or the appropriate fallback) and read the actual diff, not just the file list
If Prime is available, I called prime_recall in step 1.5 and considered the prior context
Every atom I listed passes the 2-3 sentence test
Every atom has a one-sentence verifiable acceptance criterion (not "works correctly")
Every atom in the split plan targets trunk (main / master / develop) and can merge into it independently of every other atom in the plan
No atom in the split plan targets another feature branch or has a "depends on atom N" relationship
If atoms share foundation code, the plan calls out which files are duplicated and notes that rebase resolves the duplication at merge time (principle 4)
Every atom that changes observable behaviour has either a present-proof finding or a specific named gap
Scope leaks are listed with file:line where possible, or explicitly "None."
Paste-ready comments reference specific file:line locations where applicable
The verdict is one of APPROVE / REQUEST CHANGES / NEEDS SPLIT — not a hedge

After the verdict is delivered, also: if Prime is available, run step 7 to record this review.

Tone for the verdict

Be direct. The user is asking for this skill because they want a real review, not validation. Phrases like "this is mostly fine but..." are exactly the hedge that lets sprawling PRs through. If the PR needs splitting, say it needs splitting. If proof is missing, name what proof.

That said: be specific, never personal. Critique the change, not the author. "This atom is missing a recording of the failure path" is useful; "this PR is sloppy" is not.

Edge cases

Mechanical refactors (renames, codemods, dependency bumps) can legitimately be large. The test for these is: can a reviewer verify correctness by reading a small sample and trusting the rest is mechanical? If yes, large is fine — but call this out explicitly in the verdict and confirm with the user that it is truly mechanical.
Generated code / lockfiles: never count against the disruption budget, but flag if they look unexpectedly large (e.g., a one-line dependency add producing a 5000-line lockfile churn might mean a transitive upgrade worth surfacing).
Stacked / chained branches under review: if the branch under review is based on another feature branch rather than trunk, diff against the immediate parent so the review doesn't double-count the parent's work — but the split plan you output MUST still target trunk for every atom. Never propose atom branches targeting the parent feature branch (per principle 3). The way to make this honest when the parent's work isn't yet on trunk: each atom PR carries the foundation pieces it needs from the parent (per principle 4 — duplicate, then trust rebase). State this plan clearly so the user knows the duplication is intentional and will resolve at merge time. If the parent's work is so foundational that every atom would need to carry most of it (i.e., the atoms aren't separable from the parent), name that honestly: this is not 4 independent PRs, it is one PR that wants to be the parent + one or two real atoms that can ship in parallel.
Hotfixes: the bar for "needs e2e proof" still applies — a hotfix without a recording of the bug-now-fixed is a hotfix that might not fix the bug. But the bar for "must be split" can relax for genuine urgency; note the debt instead of blocking.
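No commits yet on the branch (just uncommitted work): treat the working tree as the diff. Run git diff "origin/$BASE" to compare the working tree (staged and unstaged changes to tracked files) against the base, and git diff --staged "origin/$BASE" to see only what is staged. Untracked files do not appear in git diff, so list them with git status --short and read them directly.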