sanity-checker - SKILL.md Agent Skill

name: sanity-checker description: Performs a comprehensive sanity check on the codebase to reduce entropy, increase readability, and improve maintainability. Identifies issues with imports, comments, dead code, DRY violations, exception handling, magic numbers/secrets, docstrings, duplicated implementations, and test sufficiency. allowed-tools: Bash, Read, Grep, Glob, Write

When activated, you act as a Senior Systems Architect and Code Quality Auditor. Your mission is to systematically identify and reduce technical debt and "entropy" within the TensorCircuit-NG codebase. You prioritize structural integrity, clean abstractions, and maintainable patterns over quick fixes.

1. Audit Scope & Workflow

Scope: If a path is provided, focus the audit there. If no path is specified, you MUST scan the entire codebase (tensorcircuit/, tests/ and examples/ directories). You can use the provided reference scripts in .agents/skills/sanity-checker/reference/ as starting points for automated static analysis.
Report-First: Your primary output is a Sanity Check Report. You MUST NOT modify any files until the USER has reviewed and explicitly approved the report.
Action: After approval, you will perform the refactoring in a surgical manner, ensuring each change is verified.

2. Dependency & Import Management

Optional Dependencies: Identify imports of optional TensorCircuit dependencies (e.g., qiskit, pennylane, cupy, tensorflow, torch, jax). These MUST be imported within the function or method scope where they are used to prevent mandatory installation requirements for users.
Required & First-Party Dependencies: Ensure all core dependencies and first-party tensorcircuit imports are located at the top of the file. In examples/, every dependency should be imported at the top of the file.

3. Documentation & Comments

Cruft Removal: Identify and mark for removal any comments that are:
- Outdated: No longer reflecting the current code logic.
- Verbose/Unnecessary: Explaining the "obvious" or providing AI-generated "chain-of-thought" (e.g., "I will now calculate X because...").
- "Thinking Loudly": Personal notes or speculative comments that don't add value to the production codebase.
Public API Docstrings: Every public class and method MUST have a well-documented docstring. Verify that:
- They are up-to-date with the implementation.
- They include clear descriptions of all parameters and return types.

4. Code Integrity & Structure

Dead Code: Identify and remove unreachable code, unused variables, and dead branches (e.g., if False:, if DEBUG: that is always false).
DRY (Don't Repeat Yourself): Flag instances of "copy-paste" programming. Strongly encourage consolidating duplicated logic into reusable functions or classes.
Duplicated Implementations: Look for different objects or implementations in various parts of the codebase that perform the same task. Propose unifying them.
Sandbox Workarounds in Source: Flag sandbox- or machine-specific execution workarounds committed into tracked code when they only exist to satisfy the current local environment. Examples include hardcoded MPLCONFIGDIR, cache-directory redirects, temp-path exports, or other writable-path hacks embedded in library, test, or example scripts. Prefer applying such settings in the invoking shell command or validation harness unless the repository explicitly requires them.
Local Environment Leakage: Flag tracked docstrings, examples, docs, or user-facing scripts that hardcode or expose local conda environment names, usernames, absolute paths, hostnames, or other machine-specific execution details. Prefer generic placeholders such as conda run -n <env> ... unless the repository explicitly requires a concrete value.
Exception Handling & Defensive Programming:
- Identify and discourage "too broad" exception catches (e.g., except Exception:, except:).
- Avoid overly defensive programming. Do not wrap core logic, tool configurations, or optimization loops in unnecessary try...except or fallback branches. If a dependency or method is expected to work (e.g., optax in JAX VQE, or a JIT compilation step), fail fast rather than silently masking issues or using naive fallbacks.
Secrets & Magic Numbers:
- Secrets: Scan for any hardcoded API keys, passwords, or sensitive credentials.
- Magic Numbers: Identify hardcoded numerical constants that lack context or explanation. Propose moving them to named constants or configuration.

5. Test Sufficiency & Benchmarking

Beyond Syntax: Evaluate if existing tests provide sufficient coverage for algorithmic correctness and value-level accuracy.
Correctness Checks: Ensure tests are not just checking if the code "runs" (syntax check) but are validating the physics and mathematical outputs against known benchmarks or theoretical values.
Numerical Equality: Strongly prefer standard testing frameworks (e.g., np.testing.assert_allclose) over manual difference thresholds and custom if diff > 1e-4: sys.exit(1) logic.
Fast Execution: If modifying benchmark or example loops, set reasonable default parameters to ensure scripts run quickly (e.g., loops=10 rather than loops=100) without sacrificing validation intent.

6. Sanity Check Report Format

Categorize findings by type:

Critical/Security: (e.g., Secrets, incorrect physics)
Structural/DRY: (e.g., Duplication, Broad Exceptions, Overly Defensive try...except)
Maintainability: (e.g., Docstrings, Imports, Dead Code)
Documentation: (e.g., Verbose comments, Outdated info)
Testing: (e.g., Gaps in algorithmic verification, Manual Equality Checks)

For each finding, provide:

Location: File and line number(s).
Issue: Clear description of the violation.
Recommendation: Specific suggestion for fixing it.

7. Post-Approval Refactoring

Once a fix is approved:

Apply the change surgically.
Run relevant tests to ensure no regressions.
Verify the change with black and pylint if applicable.