name: code-review description: Perform a project-wide code review of the 5thPlanet SEGA Saturn emulator, covering correctness, emulation accuracy, code quality, tests, documentation, and style.
When performing a project-wide code review, always follow these steps:
Survey recent changes — Run
git log --oneline -20and skim the corresponding diffs to understand the scope of work before examining individual files.Security & safety — Apply the
security-auditskill. Particular attention:- The workspace lint
unsafe_code = "forbid"(in rootCargo.toml) must remain in place. Any new#![allow(unsafe_code)]requires justification. - Bus-trait implementations are the trust boundary with host memory — confirm address arithmetic uses
wrapping_*or explicit bounds, never raw indexing that could panic on hostile addresses.
- The workspace lint
Emulation accuracy — The project's design axis is accuracy over performance. Review for:
- SH-2 ISA fidelity: Every new opcode handler must cite the SH-1/SH-2 Software Manual section it implements, and its semantics (registers touched, SR.T effect, exception conditions) must match the manual exactly.
- Cycle counts: Returned cycle totals must match Appendix A of the software manual (or
SH7604 Hardware Manual§3 for pipeline interlocks). Hand-wavy "approximately N" cycles is a regression risk. - Delay slots: Confirm that branch instructions never execute their slot before the slot fetch, and that
Op::is_illegal_in_slot()is consulted for new branch/jump/SR-modifying ops. - PC-relative addressing: SH-2 uses
PC_of_instr + 4as the base, not the runningregs.pc. Verify new PC-relative ops use theinstr_pcargument plumbed throughexecute(). - Endianness: SH-2 is big-endian. All multi-byte bus accesses must use
from_be_bytes/to_be_bytes.
Correctness and logic — Review for:
unwrap()/expect()in non-test code where a meaningful error or exception could be raised instead.- Integer overflow in cycle accumulation;
Pipeline::cyclesusessaturating_add, new accumulators should too. - Sign-extension correctness for 8-/12-bit immediate and displacement fields.
Code smells — Flag:
- Duplicated decode/dispatch patterns that should be extracted into a helper or table entry.
- Functions exceeding ~60 lines without clear justification (large opcode-dispatch matches are an acceptable exception).
- Magic numbers — SH-2 register offsets, exception vector numbers, and FFFFFE00 on-chip base should be named constants.
- Dead code or stale commented-out blocks.
Test coverage — Verify:
- Each new opcode has at least one integration test under
crates/sh2/tests/(file per opcode family). - Each new cycle-count claim has a
pipeline_timing.rsassertion (once task #5 lands). - Decoder changes have a
crates/sh2/src/decoder.rs::testscase. - Tests construct CPUs via the
MemBusfixture (sh2::harness); ad-hoc bus mocks duplicate it without need.
Measure coverage with
cargo llvm-cov— don't eyeball it. The tool iscargo-llvm-cov(cargo subcommand) plus thellvm-tools-previewrustup component; ifcargo llvm-cov --versionfails, install withcargo install cargo-llvm-covandrustup component add llvm-tools-preview.- Collect once (instrumented build + full suite, includes the ~30s
bios_bootgolden — give it a few minutes):cargo llvm-cov --workspace --summary-only - Re-report from the cached profile data without re-running (fast; use this for every follow-up view):
cargo llvm-cov report --summary-only·cargo llvm-cov report --show-missing-lines·cargo llvm-cov report --html(writestarget/llvm-cov/html/index.html). - Confirm the diff is covered, not just the totals: cross-check
--show-missing-linesagainst the lines this change added/modified — every new or changed production line should be absent from the uncovered set. A green workspace total hides an untested new branch. - When a finding is "extract a helper" (a code smell), prefer the version that is independently unit-testable — extracting inline logic into a named
fnand unit-testing it is a net coverage win even if the original call site stays uncovered. Note this trade-off in the finding. - Read the totals in context: the interactive
jupiterfrontend (main.rs, the SDL2 loop) and debug-only modules (sh2/src/debug.rs) are ~0% by nature and drag the workspace number down; judge the core emulation crates (sh2,saturn,vdp2,scu_dsp) on their own, where 80–100% is the expectation. Don't demand coverage of the interactive frontend. - In the report (step 9), state the coverage delta for the touched files and call out any new code that landed uncovered as a Tests finding with a concrete test to add.
- Each new opcode has at least one integration test under
Documentation quality — Confirm:
- Public items in
sh2carry///doc comments naming the SH-2 manual reference where relevant. doc/roadmap.mdreflects the current milestone status (mark tasks ✅ done as soon as they land).- Non-obvious cycle-count or interlock decisions carry an inline comment citing the manual section that justifies them.
- Public items in
Code style — Confirm:
- Formatting: do NOT run
cargo fmt --all. The workspace is not globally rustfmt-clean and never has been — it uses a deliberate compact hand-style (e.g.crates/sh2/src/isa.rskeeps ~142 enum variants each on a single line). Under the pinned toolchain (edition 2024)cargo fmt --allreformats ~60 files (isa.rsalone explodes by +790 lines), andstyle_edition = "2021"is worse. There is norustfmt.tomland no CI fmt gate, so any "fmt-clean" expectation is aspirational. Format only the lines a change adds/modifies, by hand, matching the surrounding code. Flag a hunk only when its style clearly diverges from its neighbors; never recommend a workspace-wide reformat as part of a review (it's a separate, deliberate decision needing explicit sign-off). No#[rustfmt::skip]without justification. cargo clippy --workspace --all-targets -- -D warningsis clean (this is enforceable and should pass). Any#[allow(clippy::...)]suppression must have a comment explaining why the lint is a false positive here.- No
println!/eprintln!in thesh2library crate (it'sno_std+alloconly).
- Formatting: do NOT run
Report findings — Present all identified issues grouped by category: Safety, Accuracy, Correctness, Code Smell, Tests, Documentation, Style. Assign each a severity of Critical, High, Medium, or Low. For every finding, include the file path and line number, a clear description, and a concrete recommendation for how to fix it.