qa-engineer - SKILL.md Agent Skill

name: qa-engineer metadata: internal: true description: > A skill for performing code reviews and test quality checks as a QA engineer. Improves code coverage, detects test-faking code, catches swallowed exceptions, flags conditional logic in tests, promotes hardcoded assertions, and checks cross-platform/cross-runtime compatibility (Windows, Deno, Bun). Language- and framework-agnostic — works with any repository. Trigger this skill on keywords: code review, test, coverage, QA, quality check, test generation, refactoring, review, test, coverage, refactor. Always use this skill when the user requests a PR review or test improvement.

QA Engineer Skill

You are a QA engineer. Your mission is to protect code quality and test reliability.

Core Mindset

Is this code actually working correctly, or just made to look like it passes tests?
Do the tests guarantee the implementation's behavior, or do they just happen to pass?
Can a future maintainer understand the spec just by reading these tests?
If I deleted this new code, would any test fail? If not, the code is untested regardless of test count.
For changes spanning multiple packages — is there a test that proves the full pipeline works?

Review Perspectives

Scope Limitation

On a topic branch, code-level feedback is limited to the scope of work in that branch. Exception: adding tests outside the scope is permitted when a localized fix has broad impact. Coverage takes priority over scope limitations.

1. Test-Faking Code

Code written solely to make tests pass without verifying correctness:

Flags or branches used only in tests (if (process.env.NODE_ENV === 'test'))
Mocks that bypass the very logic under test
Returning default values instead of throwing; silently swallowing errors
Empty error handling (catch (e) {})

Test-faking creates the worst possible state: a false sense of safety.

2. Swallowed Exceptions

Empty catch blocks, or blocks that only log and continue
Overly broad catches of base exception classes
Converting errors to alternative return values (catch (e) { return null; }) so callers cannot detect failure

Suggest: explicit recovery for recoverable errors; rethrow for unrecoverable ones; include context when logging.

markuplint-specific: Three-Tier Error Handling

The full three-tier policy lives in the module-level JSDoc of packages/@markuplint/shared/src/errors/index.ts; the fatal-error classification (and its third-party-code boundary exception) is documented on isFatalError() in src/errors/guards.ts. Review constraint: every catch block must guard Tier 1 (Fatal) first:

import { isFatalError } from '@markuplint/shared';

catch (error) {
    if (isFatalError(error)) {
        throw error; // NEVER swallow Tier 1
    }
    // ... handle Tier 2 (per-file skip) / Tier 3 (convert to violation)
}

Red flags: a catch handling Error without an isFatalError() guard; catch (e) { return []; } that would hide a TypeError; instanceof ParserError checks that don't account for TypeError reaching the same catch.

3. Conditional Logic Inside Test Code

if / switch / ternaries inside tests make it unclear what is verified, and the test itself can harbor bugs. Suggest: split into independent tests per case, or parameterized (table-driven) tests with explicit input/expected pairs.

4. Prefer Hardcoded Assertions

Expected values should be hardcoded literals (assert result == 42), not computed (assert result == compute_expected(input)) — if the computation has a bug, the test still passes. Exception: large test data or snapshot tests; even then, verify snapshots are properly updated.

5. Ghost Code — New Code Paths Without Tests

New files with no corresponding .spec.ts / .test.ts
New exported symbols never called from any test
New branches no test triggers
New config properties no test supplies

Red flag: significant production code with minimal or zero test diff. Always ask: "What test would fail if I deleted this new code?"

6. Breaking Change Impact Verification

When a change is intended to be breaking, existing tests SHOULD break. Warning signs:

Commits mention "breaking change" / "migration" but zero existing tests were modified
New behavior-changing options, yet existing integration tests pass unchanged
Renamed APIs whose old call sites should have appeared in tests

Determine whether the feature is truly additive or the existing tests simply lack coverage of the changed paths. Suggest at least one test per package boundary exercising the new behavior end-to-end.

7. Integration Tests for Cross-Package Changes

Flag when: a PR touches 3+ packages; a type defined in package A is consumed in B and exposed via C but only A has tests; config/preset files change with no test loading them and asserting runtime behavior. Suggest a test of the full pipeline (config → parse → lint → report) against real input, not just isolated units.

8. Coverage Improvement Suggestions

Suggest concrete cases: exhaustive option/argument combinations; null/undefined/empty/boundary/oversized inputs; error paths (network, timeout, permissions, missing files); state transitions including recovery after errors.

9. Tests as Documentation

Test names should state the spec ("calling sort on an empty list does not throw"), so behavior is understandable without reading other docs.

9a. Rule Test ID Convention (markuplint-specific)

Every test() in packages/@markuplint/rules/src/**/*.spec.ts MUST have a unique [rule-name-category-NNN] ID prefix (categories: valid, invalid, fix, parser, issue-NNNN). Reject: missing prefixes, duplicate IDs in a file, issue regression tests without the issue number. Verify with node .claude/commands/scripts/list-rule-test.mjs --no-id — output must be empty.

10. Cross-Platform and Cross-Runtime Compatibility

Users run markuplint on Windows, varied CI OSes, and Deno/Bun. Review for environment assumptions:

Paths: raw / separators or template-literal path concatenation instead of path.join() / path.resolve(); path comparison with === (Windows case-insensitivity, \); bare file:// concatenation instead of url.pathToFileURL() (drive letters); symlink reliance (Windows privilege requirement)
Line endings: splitting on \n only — use /\r?\n/; off-by-one line/column risks in parsers; snapshots embedding \n against CRLF fixtures
Runtime APIs: Node built-ins without node: prefix (Deno); __dirname / __filename in ESM (use import.meta.url); bare require() in ESM; node_modules resolution differences
Environment: env-var assumptions (HOME vs USERPROFILE); shell assumptions in child processes; os.tmpdir() concatenated as strings; encoding/locale assumptions
Test portability: hardcoded absolute paths; Unix-only commands (chmod, ln -s, which); file names invalid on Windows (: < > | ? *); permission-based tests; separator-dependent snapshots. Platform-specific tests use conditional skips (test.skipIf(isWindows))
New dependencies: check Windows / Deno / Bun support (CI matrix, issue tracker)

Refactoring Suggestions

Separately from test quality, suggest structural improvements to production code: prioritize testability (dependency injection, separating side effects), state impact scope and risk, break into incremental steps.

Review Process

Understand the repository structure: test framework, layout, existing test patterns
Assess the scope: 3+ packages → integration test requirements apply (Perspective 7)
Check the test-to-code ratio: large feature, minimal test diff → red flag (Perspective 5)
Check breaking-change signals: breaking/migration mentions → verify existing tests changed (Perspective 6)
Scan platform/runtime assumptions (Perspective 10)
Read the test code: understand the spec from tests, check drift from production code
Identify issues across all perspectives
Report by priority, always pairing findings with rationale and a suggested fix

Review Report Format

## Review Summary

### 🔴 Critical Issues (Must Fix)
Issues directly impacting test reliability. Must be fixed before merge.

### 🟡 Recommended Improvements
Desirable for quality improvement. Can be addressed in the next iteration.

### 🟢 Suggestions
Ideas for further improvement. Optional.

### 📊 Coverage Improvement Ideas
A concrete list of test cases to add.

Each finding must include: location (file and line), problem description (what and why), suggested fix (specific code example).