name: review-upgrade-compatibility-and-test-determinism description: Use when reviewing pull requests for API, schema, protocol, upgrade/downgrade compatibility, migration safety, and test determinism risks before merge.
Review Compatibility, Upgrade Safety, and Test Determinism
Overview
Use this skill to identify compatibility and test-determinism risks in behavior-changing pull requests. Focus on backward/forward compatibility, rolling upgrade and rollback safety, migration correctness, and whether tests provide deterministic evidence for changed behavior.
This skill covers compatibility and test-determinism discovery only. After findings are identified, hand off output rendering to review-output-format.
This is a static-analysis-first review flow. Do not run build/test commands directly as part of this skill.
Input
This skill accepts exactly three input parameters:
code_path(required): absolute path to the cloned codebase; this is the working directory for review.diff_filename(required): filename of the diff to review.output_filename(required): final output JSON filename for the rendered findings payload.
Output
- Final output must be a file named exactly as
output_filename. - File content format is already defined by review-output-format.
- Do not invent or emit an alternate output schema.
Review Flow
Scope the compatibility surface
- Use
code_pathas the working directory. - Load and review changes from
diff_filenameas the primary review scope. - Identify changed compatibility surfaces: API contracts, protocol payloads, serialization formats, storage/on-disk formats, config keys, migration scripts, and release/runbook docs.
- Identify changed validation surfaces: unit, integration, e2e, upgrade/downgrade, migration, failure-path, and determinism tests.
- Use
Run full checklist pass
- Use every section in references/compatibility-and-test-determinism-checklist.md.
- Do not skip sections; mark
N/Aonly with a concrete, diff-tied reason.
Evaluate potential issues with objective evidence
- Confirm at least one objective signal:
- unversioned contract or protocol break
- backward compatibility gap for mixed-version clusters without explicit guard
- bootstrap or versioned metadata mutation that is not append-only and lacks proven migration safety
- rolling upgrade/downgrade behavior change without explicit fallback or version gating
- migration/rollback path missing safeguards
- migration path is not idempotent/retry-safe under partial failure
- incompatibility path silently falls back instead of returning an actionable error
- mixed-version behavior not covered by tests or explicit guarantees
- defaults/unknown-field handling that breaks forward/backward compatibility
- deterministic-test anti-patterns (sleep-based sync, wall-clock dependence, unstable ordering)
- test changes mix multiple scenarios into one case, reducing failure isolation
- missing failure-path/concurrency coverage for changed logic
- If evidence is weak, request targeted validation instead of asserting a defect.
- Confirm at least one objective signal:
Assign severity for handoff
Blocker: high-confidence incompatibility or determinism failure likely to break upgrade/rollback safety, data integrity, or production correctness.Major: high-confidence compatibility or test-quality risk with meaningful user or operational impact.Minor: localized compatibility or determinism gap with limited blast radius.Info/Nit: low-risk observation, clarification, or polish.
Prepare findings payload
- One issue per finding.
- For each finding include:
severitytitlewhyscoperisk_if_unchangedevidencechange_request
- Keep scope concrete (
path:line, module, branch, or scenario boundary).
Required output handoff
- After the checklist pass is complete, invoke review-output-format.
- Render final findings strictly with that skill's output contract.
- Write that JSON to
output_filename.
Runtime Validation Policy
- Never execute project build commands during this review flow.
- Never execute project test commands during this review flow.
- Never run broad verification tasks directly in this skill.
- If runtime confirmation is needed, request targeted validation with a minimal scenario and exact command, but do not run it yourself.
- Keep review latency low by relying on diff evidence, compatibility reasoning, and deterministic-test design checks.
Review Depth Rules
- Do not emit a no-findings conclusion until every checklist section is passed or explicitly
N/A. - Treat unversioned breaking changes across API/protocol/storage formats as high risk by default.
- Treat migration/rollback assumptions as insufficient unless downgrade behavior and restart safety are explicit.
- Treat mixed-version and upgrade sequencing gaps as high-risk operational defects for distributed systems.
- Treat flaky-test patterns in changed critical paths as reliability risks, not style-only issues.
- Prefer concrete validation requests when confidence is medium or low.
References
- Compatibility and determinism checklist:
- Output rendering contract: