codex-review-loop - SKILL.md Agent Skill

---
name: "codex-review-loop"
description: |
  Adversarial PR code review using Codex CLI.
  Codex review (~20-50 min) -> structured findings -> HITL approval -> fix loop.
metadata:
  author: codex-lb
  version: "2.0.0"
  argument-hint: "[--pr ] [--base ] [--uncommitted]"
---

This skill runs an adversarial code review via the Codex CLI, parses the output into structured findings, applies fixes behind human-in-the-loop (HITL) approval gates, and optionally re-verifies the result.

Determine the review target and extract the base branch.

- PR mode (default): Use `gh pr view` / `gh pr list` to resolve the base branch and display PR metadata (title, file count, additions/deletions).
- Branch mode (`--base`): Use the specified base branch directly.
- Uncommitted mode (`--uncommitted`): Review working tree changes.

If no argument is given, auto-detect the user's most recent open PR and confirm.
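
As a rough illustration of PR mode, the sketch below asks `gh pr view` for JSON and keeps only the fields that get displayed. The function names and the returned dictionary shape are hypothetical; the JSON field names (`baseRefName`, `changedFiles`, and so on) are the ones `gh` exposes through `--json`. It does not cover the `gh pr list` auto-detection path.

```python
import json
import subprocess
from typing import Optional

# JSON fields requested from `gh pr view --json`.
PR_FIELDS = "number,title,baseRefName,headRefName,changedFiles,additions,deletions"


def parse_pr_metadata(raw_json: str) -> dict:
    """Reduce the JSON printed by `gh pr view --json ...` to the displayed fields."""
    data = json.loads(raw_json)
    return {
        "number": data["number"],
        "title": data["title"],
        "base": data["baseRefName"],
        "files": data["changedFiles"],
        "additions": data["additions"],
        "deletions": data["deletions"],
    }


def resolve_pr(pr_number: Optional[int] = None) -> dict:
    """Call gh; without a number, gh looks up the PR for the current branch."""
    cmd = ["gh", "pr", "view", "--json", PR_FIELDS]
    if pr_number is not None:
        # The PR number goes right after `view`, before the flags.
        cmd.insert(3, str(pr_number))
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return parse_pr_metadata(out)
```

The `base` value is what a later review run would be compared against; the remaining fields are only for the confirmation prompt.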

Launch the adversarial review as a background process.

| Exit code | Meaning | Action |
|---|---|---|
| 0 | Success | Proceed to Phase 3 |
| 1 | Codex error | Show error, offer retry or abort |
| 127 | `codex` not found | Guide: `npm i -g @openai/codex` |
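
The exit-code table can be expressed as a small dispatch function. This is a minimal sketch: the `NextStep` enum and `next_step_for` are hypothetical names, and treating every undocumented non-zero code like exit code 1 is an assumption, not something the table states.

```python
from enum import Enum


class NextStep(Enum):
    PARSE_FINDINGS = "proceed to Phase 3"
    RETRY_OR_ABORT = "show error, offer retry or abort"
    INSTALL_CODEX = "guide: npm i -g @openai/codex"


def next_step_for(exit_code: int) -> NextStep:
    """Map the review process exit code to the action from the table."""
    if exit_code == 0:
        return NextStep.PARSE_FINDINGS
    if exit_code == 127:
        # The shell reports 127 when the `codex` binary is not on PATH.
        return NextStep.INSTALL_CODEX
    # Exit code 1 is the documented Codex error; handling any other
    # non-zero code the same way is an assumption of this sketch.
    return NextStep.RETRY_OR_ABORT
```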

| Variable | Purpose |
|---|---|
| `CODEX_REVIEW_MODEL` | Override Codex model |
| `CODEX_REVIEW_REASONING` | Override reasoning effort |
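
One way to read these overrides is sketched below. The function name and the `model` / `reasoning` keys are hypothetical, and the sketch deliberately stops short of showing how the values reach the Codex CLI, since the skill does not specify that here.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical result key -> environment variable from the table above.
OVERRIDE_VARS = {
    "model": "CODEX_REVIEW_MODEL",
    "reasoning": "CODEX_REVIEW_REASONING",
}


def review_overrides(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return only the overrides that are set to a non-empty value."""
    env = os.environ if env is None else env
    overrides = {}
    for key, var in OVERRIDE_VARS.items():
        value = env.get(var, "").strip()
        if value:
            overrides[key] = value
    return overrides
```

An empty result means the review runs with whatever defaults Codex is configured with.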

Parse the raw Codex output into structured findings.

- Schema: `references/schemas/review-findings.md` (field definitions, severity/category/effort enums, status lifecycle)
- Convention rules: `.agents/skills/project-conventions/conventions.md` (project coding conventions to cross-reference)

- Escalate to Critical: unvalidated external input, secret exposure, injection vectors, data integrity compromise
- Escalate to High: API contract change without test update, possible `NoneType` error, blocking I/O in async context
- Downgrade to Medium: correct functionality that violates conventions, duplicate logic, unnecessary complexity
- Downgrade to Low: pure style, unused import, missing docstring
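
The severity rules above could be applied mechanically along these lines. This is a sketch under several assumptions: the trait tags are hypothetical labels (the real schema lives in the referenced schema file), escalation is treated as a floor and downgrade as a cap, and escalation wins when both kinds of trait are present.

```python
from enum import IntEnum
from typing import Iterable


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical trait tags mirroring the escalation rules.
ESCALATE = {
    "unvalidated_external_input": Severity.CRITICAL,
    "secret_exposure": Severity.CRITICAL,
    "injection_vector": Severity.CRITICAL,
    "data_integrity_compromise": Severity.CRITICAL,
    "api_contract_change_without_test": Severity.HIGH,
    "possible_nonetype_error": Severity.HIGH,
    "blocking_io_in_async": Severity.HIGH,
}

# Hypothetical trait tags mirroring the downgrade rules.
DOWNGRADE = {
    "convention_violation": Severity.MEDIUM,
    "duplicate_logic": Severity.MEDIUM,
    "unnecessary_complexity": Severity.MEDIUM,
    "pure_style": Severity.LOW,
    "unused_import": Severity.LOW,
    "missing_docstring": Severity.LOW,
}


def adjust_severity(reported: Severity, traits: Iterable[str]) -> Severity:
    """Raise to the highest matching floor, else lower to the matching cap."""
    traits = list(traits)
    floors = [ESCALATE[t] for t in traits if t in ESCALATE]
    if floors:
        return max(reported, max(floors))
    caps = [DOWNGRADE[t] for t in traits if t in DOWNGRADE]
    if caps:
        return min(reported, max(caps))
    return reported
```

For example, a finding reported as Low that carries the `secret_exposure` tag comes out as Critical, while a High finding tagged only `unused_import` drops to Low.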

Present all findings to the user as a summary table (ID, severity, category, file:line, title, effort).
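
A summary table with those columns might be rendered as follows. The `Finding` dataclass and its field names are hypothetical stand-ins for the schema in `references/schemas/review-findings.md`, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Finding:
    # Hypothetical field names; the authoritative schema is in the schema file.
    id: str
    severity: str
    category: str
    file: str
    line: int
    title: str
    effort: str


def summary_table(findings: Iterable[Finding]) -> str:
    """Render findings as a Markdown table, one row per finding."""
    header = "| ID | Severity | Category | Location | Title | Effort |"
    rule = "|---|---|---|---|---|---|"
    rows = [
        f"| {f.id} | {f.severity} | {f.category} | {f.file}:{f.line} | {f.title} | {f.effort} |"
        for f in findings
    ]
    return "\n".join([header, rule, *rows])
```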

Ask the user to choose a fix scope:

For each approved finding, execute an atomic fix-verify-commit cycle.

1. Fix: Read the target file(s) and apply the fix.
2. HITL Gate 2 (conditional): For Critical/High findings that modify existing logic (not just adding new code), show the current and proposed code to the user for confirmation. Medium/Low findings are auto-fixed.
3. Verify (all must pass):
   - `uvx ruff check .`
   - `uvx ruff format --check .` (auto-fix and re-check if needed)
   - `uv run ty check`
   - `uv run pytest` (mapped test files, or full suite if no mapping found)
4. Commit: `fix(review): P{i} - {title}`

On failure: Roll back changes, mark the finding as skipped, and continue to the next.
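
The verify-commit-or-roll-back part of the cycle can be sketched as below. Several things are assumptions: the function names, the `"fixed"` status string, committing with `git commit -am`, and rolling back with `git reset --hard HEAD` (which presumes the working tree was clean before the fix). The ruff auto-fix retry and the mapped-test selection are left out for brevity.

```python
import subprocess
from typing import Callable, List

# Verification commands, in the order listed in the steps above.
VERIFY_COMMANDS = [
    ["uvx", "ruff", "check", "."],
    ["uvx", "ruff", "format", "--check", "."],
    ["uv", "run", "ty", "check"],
    ["uv", "run", "pytest"],
]


def run_command(cmd: List[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(cmd).returncode == 0


def commit_message(index: int, title: str) -> str:
    return f"fix(review): P{index} - {title}"


def verify_and_commit(
    index: int, title: str, run: Callable[[List[str]], bool] = run_command
) -> str:
    """Commit the applied fix if every check passes, otherwise roll back."""
    # all() stops at the first failing check.
    if all(run(cmd) for cmd in VERIFY_COMMANDS):
        if run(["git", "commit", "-am", commit_message(index, title)]):
            return "fixed"
    # Assumption: discarding tracked changes is an acceptable rollback.
    run(["git", "reset", "--hard", "HEAD"])
    return "skipped"
```

Passing `run` as a parameter keeps the sketch testable without touching a real repository.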

After all fixes are committed, automatically re-run the Codex review to check for regressions or new issues introduced by the fixes.

1. Re-run Codex review (`--base <branch>`) against the same base.
2. Parse findings (Phase 3) and present to user (HITL Gate 1).
3. If 0 findings → loop terminates, proceed to Final Report.
4. If findings exist → execute Phase 4 (fix-verify-commit), then repeat from step 1.

- Max iterations: 3 (initial review + 2 re-reviews). After 3 iterations, terminate the loop regardless of remaining findings and output the Final Report with unresolved items noted.
- Escalation: If the same finding recurs across 2 consecutive iterations, mark it as `wont_fix` and skip it in subsequent iterations.
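
The loop bounds and the escalation rule can be combined roughly as follows. This is a sketch, not the skill's actual control flow: `run_review` and `fix` are hypothetical callables, findings are identified by hypothetical stable keys, the user-stop path is omitted, and treating "nothing left except `wont_fix` items" as a clean termination is an assumption.

```python
from typing import Callable, Dict, Iterable, Set

MAX_ITERATIONS = 3  # initial review + 2 re-reviews


def review_loop(
    run_review: Callable[[], Iterable[str]],
    fix: Callable[[Set[str]], None],
) -> Dict[str, object]:
    """Run review/fix rounds until clean or the iteration cap is reached."""
    previous: Set[str] = set()
    wont_fix: Set[str] = set()
    for iteration in range(1, MAX_ITERATIONS + 1):
        found = set(run_review())
        # A finding seen in two consecutive iterations is escalated.
        wont_fix |= found & previous
        open_findings = found - wont_fix
        if not open_findings:
            return {"iterations": iteration, "reason": "clean", "wont_fix": wont_fix}
        fix(open_findings)
        previous = found
    return {"iterations": MAX_ITERATIONS, "reason": "max iterations", "wont_fix": wont_fix}
```

The returned dictionary carries the iteration count and termination reason that the Final Report needs.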

At each re-review result, the user may choose:

Output a summary including:

- PR metadata (number, title, base/head)
- Loop iteration count and termination reason (clean / max iterations / user stop)
- Findings table (ID, severity, category, title, status, commit hash)
- Commit stack
- Verification status (ruff, ty, pytest)