audit-review - SKILL.md Agent Skill

name: audit-review
description: Perform deep feature audits with transition-matrix and logical fault-injection validation. Use when reviewing complex changes, regressions, state-machine behavior, config interactions, API/protocol flows, and concurrency-sensitive logic.

Audit Review

Purpose

Run a repeatable deep audit for any feature and report confirmed defects with severity. Default mode is static reasoning unless runtime execution is explicitly performed.

Workflow

If PR scope is large, partition by functionality/workstream first:
- define partitions and boundaries,
- review each partition independently with the full workflow below,
- track per-partition findings and coverage,
- deduplicate cross-partition findings by root cause,
- finish with cross-partition interaction risks.
Build call graph first:
- user/system entrypoints (API, RPC, CLI, worker, scheduler)
- dispatch and validation layers
- state/storage/cache interactions
- downstream integrations (network, filesystem, service calls)
- exception and error-propagation paths
Build transition matrix:
- request/event entry -> processing stages -> state changes -> outputs/side effects
- define key invariants and annotate where each transition must preserve them
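
One possible way to keep the transition matrix checkable is to record it as data. The sketch below is illustrative only: the invariant IDs, entry names, and stages are hypothetical placeholders, not taken from any reviewed code. It flags transitions that carry no invariant annotation and invariant IDs that were never defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    entry: str         # request/event entry
    stage: str         # processing stage
    state_change: str  # "from -> to"
    side_effect: str   # output or side effect
    invariants: tuple  # IDs of invariants this transition must preserve


# Hypothetical invariants for illustration.
INVARIANTS = {
    "I1": "a cache entry is never served after its source is deleted",
    "I2": "the reference count never goes negative",
}

# Hypothetical matrix rows for illustration.
MATRIX = [
    Transition("HTTP INSERT", "validate", "idle -> pending", "none", ("I2",)),
    Transition("HTTP INSERT", "commit", "pending -> stored", "cache fill", ("I1", "I2")),
    Transition("background reload", "swap", "stored -> stored", "cache evict", ()),
]


def unannotated(matrix):
    """Return transitions that list no invariant at all."""
    return [t for t in matrix if not t.invariants]


def unknown_invariants(matrix, invariants):
    """Return invariant IDs referenced by the matrix but never defined."""
    return sorted({i for t in matrix for i in t.invariants if i not in invariants})
```

Here `unannotated(MATRIX)` returns the "background reload" row, which is the kind of gap the audit should close before defect analysis starts.
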
Perform logical testing of all code paths:
- enumerate all reachable branches in changed logic,
- record expected branch outcomes (success, handled failure, fail-open/fail-closed, exception),
- include happy path, malformed input, integration timeout/failure, and concurrency/timing branches.
Define logical fault categories from the code under review:
- derive categories from actual components, transitions, and dependencies in scope,
- document category boundary and affected states/transitions,
- prioritize categories by risk and blast radius.
Run logical fault injection category-by-category:
- execute one category at a time,
- for each category cover success/failure/edge/concurrency paths as applicable,
- record pass/fail-open/fail-closed/exception behavior per injected fault,
- maintain a category completion matrix with status:
  - Executed / Not Applicable / Deferred,
  - outcome,
  - defects found,
  - justification for Not Applicable or Deferred.
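
The category completion matrix can be kept as a small table and validated mechanically. This is a minimal sketch under assumptions: the category names are hypothetical, and the rule that an Executed row must record an outcome is one reading of the bullets above.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    EXECUTED = "Executed"
    NOT_APPLICABLE = "Not Applicable"
    DEFERRED = "Deferred"


@dataclass
class CategoryRow:
    category: str
    status: Status
    outcome: str = ""        # pass / fail-open / fail-closed / exception
    defects: int = 0         # number of defects found in this category
    justification: str = ""  # required for Not Applicable or Deferred


def incomplete_rows(rows):
    """Return (category, problem) pairs for rows that break the matrix rules."""
    problems = []
    for row in rows:
        if row.status is Status.EXECUTED and not row.outcome:
            problems.append((row.category, "executed without recorded outcome"))
        if row.status is not Status.EXECUTED and not row.justification.strip():
            problems.append((row.category, "missing justification"))
    return problems


# Hypothetical categories for illustration.
rows = [
    CategoryRow("dependency timeout", Status.EXECUTED, outcome="fail-closed"),
    CategoryRow("malformed config", Status.EXECUTED, outcome="exception", defects=1),
    CategoryRow("disk full", Status.DEFERRED),  # no justification given
]
```

With these rows, `incomplete_rows(rows)` reports only the "disk full" entry, because it is Deferred without a justification.
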
Confirm each finding with code-path evidence.
Produce coverage accounting:
- reviewed vs unreviewed call-graph nodes,
- reviewed vs unreviewed transitions,
- executed vs skipped fault categories (with reasons),
- mark coverage complete only when every in-scope node/transition/category is reviewed or explicitly skipped with justification.
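
The stop condition for coverage can be expressed as a set difference. The sketch below assumes call-graph nodes are tracked by name; the node names are hypothetical. An item skipped with an empty reason is treated as unaccounted.

```python
def coverage_report(in_scope, reviewed, skipped):
    """Summarize coverage for one dimension (nodes, transitions, or categories).

    in_scope: set of item names
    reviewed: set of item names that were reviewed
    skipped:  dict of item name -> reason for skipping
    """
    # Only skips with a non-empty reason count as justified.
    justified = {item for item, reason in skipped.items() if reason.strip()}
    unaccounted = sorted(in_scope - reviewed - justified)
    return {
        "reviewed": len(in_scope & reviewed),
        "skipped": len(in_scope & justified),
        "unaccounted": unaccounted,
        "complete": not unaccounted,
    }


# Hypothetical call-graph nodes for illustration.
nodes = {"parse_request", "check_acl", "write_part", "notify_replica"}
report = coverage_report(
    in_scope=nodes,
    reviewed={"parse_request", "check_acl", "write_part"},
    skipped={"notify_replica": ""},  # empty reason: not a justified skip
)
```

In this example `report["complete"]` is `False` until `notify_replica` is either reviewed or skipped with a stated reason.
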
For multithreaded/shared-state paths, perform interleaving analysis:
- write several plausible thread interleavings per critical transition,
- identify race/deadlock/lifetime hazards per interleaving.
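
Interleaving analysis can be made concrete by enumerating the orderings of a small critical section. The following toy model is an assumption for illustration, not code from any audited system: two threads each perform an unsynchronized read-modify-write on a shared counter, and every order-preserving merge of their steps is replayed to find lost updates.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Replay a schedule of unsynchronized increments and return the final counter."""
    shared = 0
    local = {}  # per-thread value captured by the read step
    for thread, op in schedule:
        if op == "read":
            local[thread] = shared
        else:  # "write"
            shared = local[thread] + 1
    return shared


t1 = [("T1", "read"), ("T1", "write")]
t2 = [("T2", "read"), ("T2", "write")]
schedules = list(interleavings(t1, t2))
# A correct result is 2; anything else is a lost update.
lost_updates = [s for s in schedules if run(s) != 2]
```

Of the 6 schedules, 4 lose an update: exactly those where both reads happen before either write. In a real audit the steps would be the lock acquisitions, reads, and writes of the critical transition under review.
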
For mutation-heavy paths, perform rollback/partial-update analysis:
- reason about exception/cancellation at intermediate points,
- verify state invariants still hold.
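
Rollback analysis amounts to asking, for each intermediate point, what state is left behind if execution stops there. A minimal sketch, assuming a toy two-step mutation with snapshot-based rollback (the `transfer` function and `fail_at` parameter are hypothetical), injects a fault at each point and checks the invariant afterwards:

```python
class InjectedFault(Exception):
    """Marks a deliberately injected failure."""


def transfer(accounts, src, dst, amount, fail_at=None):
    """Two-step mutation with rollback; fail_at selects the injection point."""
    snapshot = dict(accounts)
    try:
        if fail_at == 0:
            raise InjectedFault("before debit")
        accounts[src] -= amount
        if fail_at == 1:
            raise InjectedFault("between debit and credit")
        accounts[dst] += amount
        if fail_at == 2:
            raise InjectedFault("after credit, before return")
    except InjectedFault:
        # Restore the pre-call state so no partial update stays visible.
        accounts.clear()
        accounts.update(snapshot)
        raise


results = {}
for point in (0, 1, 2):
    accounts = {"a": 100, "b": 0}
    try:
        transfer(accounts, "a", "b", 30, fail_at=point)
    except InjectedFault:
        pass
    # Invariant: the total is preserved and the pre-call state is restored.
    results[point] = sum(accounts.values()) == 100 and accounts == {"a": 100, "b": 0}
```

Removing the `except` block makes injection point 1 leave the total at 70, which is the partial-update defect this step is meant to find.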

C++ Bug-Type Coverage (Required for C++ audits)

- memory lifetime defects (use-after-free/use-after-move/dangling refs)
- iterator/reference invalidation
- data races and lock-order/deadlock risks
- exception-safety and partial-update rollback hazards
- integer overflow/underflow and signedness conversion bugs
- ownership/resource leaks (RAII violations)
- undefined behavior from invalid casts/aliasing/lifetime misuse

Multithreaded Database Emphasis

For ClickHouse-style multithreaded systems, prioritize these checks before lower-risk issues:

Shared mutable state touched by multiple threads without clear synchronization.
Lock hierarchy consistency and potential lock-order inversion/deadlock cycles.
Cross-thread lifetime safety (dangling references/pointers after erase/reload/shutdown).
Concurrent container mutation + iterator/reference use.
Exception/cancellation paths that can leave locks/state inconsistent.

Output Contract (Required)

Always perform the full deep analysis workflow above, but keep the final user-visible report short and limited to:

Confirmed defects
Coverage summary

AI audit note: This review comment was generated by AI (gpt-5.3-codex).

Audit update for PR #<id> (<short title/scope>):

Confirmed defects:

    <Severity>: <short defect title>
        Impact: <concrete user/system impact>
        Anchor: <file> / <function or code path>
        Trigger: <smallest realistic trigger condition>
        Why defect: <1-2 lines, behavior not preference>
        Fix direction (short): <1 line>
        Regression test direction (short): <1 line>

<repeat defects, sorted High -> Medium -> Low>

Coverage summary:

    Scope reviewed: <one line>
    Categories failed: <short list>
    Categories passed: <short list or count>
    Assumptions/limits: <one line>

If no confirmed defects:

- output "No confirmed defects in reviewed scope.",
- still include the Coverage summary.
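
The ordering and no-defect rules of the report can be sketched as a small renderer. The dictionary keys and sample defects below are hypothetical, and only a subset of the template fields is rendered to keep the example short.

```python
SEVERITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def render_report(defects, coverage):
    """Render a short report: defects sorted High -> Medium -> Low, then coverage."""
    lines = []
    if defects:
        lines.append("Confirmed defects:")
        for d in sorted(defects, key=lambda d: SEVERITY_ORDER[d["severity"]]):
            lines.append(f"    {d['severity']}: {d['title']}")
            lines.append(f"        Anchor: {d['anchor']}")
    else:
        lines.append("No confirmed defects in reviewed scope.")
    # The coverage summary is emitted in both cases.
    lines.append("Coverage summary:")
    for key, value in coverage.items():
        lines.append(f"    {key}: {value}")
    return "\n".join(lines)


# Hypothetical findings for illustration.
defects = [
    {"severity": "Low", "title": "inconsistent error text", "anchor": "<file> / <function>"},
    {"severity": "High", "title": "lock-order inversion", "anchor": "<file> / <function>"},
]
coverage = {"Scope reviewed": "<one line>", "Assumptions/limits": "<one line>"}
report = render_report(defects, coverage)
```

Calling `render_report([], coverage)` yields the no-defect sentence followed by the same coverage summary.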

Short-form constraints (required)

Keep each defect compact and actionable.
Include only confirmed defects.
Use snippets only when needed to prove a defect, or when the user asks.
Do not include full workflow narrative sections in the report.

Severity Rubric

High: realistic trigger can cause crash/UB/data corruption/auth bypass/deadlock.
Medium: correctness/reliability issue with narrower trigger conditions.
Low: diagnostics/consistency issues without direct correctness break.

Checklist

Verify call graph is explicitly documented before defect analysis.
Verify invariants are explicitly listed and checked against transitions.
Verify fail-open vs fail-closed behavior where security-sensitive.
Verify logical branch coverage for all changed code paths.
Verify fault categories are explicitly defined from the reviewed code before injection starts.
Verify category-by-category execution and reporting completeness.
Verify full fault-category completion matrix is present and complete.
Verify concurrency and cache/state transition paths.
Verify multithreaded interleavings are explicitly analyzed for critical shared-state paths.
Verify rollback/partial-update safety under exception/cancellation points.
Verify major C++ bug classes are explicitly covered (or marked not applicable).
Verify race/deadlock/crash class defects are prioritized and explicitly reported.
Verify error-contract consistency across equivalent fault paths.
Verify performance/resource failure classes were considered.
Verify findings are deduplicated by root cause.
Verify coverage accounting is present (covered vs skipped with reason).
Verify stop-condition criteria for coverage completion are explicitly satisfied.
Verify every confirmed defect includes code-path evidence (an anchor, plus a snippet where needed to prove the defect).
Verify parser/config/runtime consistency.
Verify protocol/API parity across entrypoints.
Verify no sensitive-data leakage in logs/errors.