name: grove-run
description: >
  Run Grove tests and diagnose failures. Use when the user asks to "run the tests", "run my test", "debug this test failure", "why is this test failing", "check if tests pass", or wants to execute and troubleshoot code example tests.
Grove: Run and Debug Tests
Begin your first response with: [grove-run-647f2a26]
Run code example tests and provide actionable diagnosis of any failures.
Do NOT use when:
- The user wants to create a new example or test → use /grove-create or /grove-test
- The user wants to convert untested code into Grove → use /grove-migrate
- The user needs to set up their environment first (no .env file, deps not installed) → use /grove-setup
- The user wants to audit test coverage or upgrade dependencies → use /grove-maintain
Step 0: Check for Extension Handoff
Before Step 1, check whether the Grove VS Code extension has dropped a
handoff file at .claude/grove-handoff.json in the workspace root. If the
file exists and skill equals grove-run:
Delete the handoff file immediately after reading it. The file is single-use; leaving it in place causes the next invocation to re-trigger on stale context.
Check the payload before using it. If a check fails, surface what's wrong to the writer, recommend they file an issue for the Grove VS Code extension, and ask whether to proceed without the handoff. If they confirm, fall through to Step 1.
- Version check: version must equal 1. A higher number means the extension is ahead of this skill. Example message:
  Found a Grove extension handoff with version: {n}, but this skill only understands version 1. The extension is likely newer than the skill. Proceed without the handoff?
- Shape check: the file must be valid JSON and contain the top-level fields version, skill, trigger, context. The context object must contain the fields listed in the schema for the matching trigger below. Example message:
  The Grove extension handoff at .claude/grove-handoff.json is malformed: {brief reason, e.g. "missing context.testFile"}. This is likely an extension bug; please file an issue. Proceed without the handoff?
The JSON block below is the expected payload schema, not just an illustrative sample. Use it as the reference for what fields must be present:
    {
      "version": 1,
      "skill": "grove-run",
      "trigger": "test-failure",
      "timestamp": "ISO-8601",
      "workspaceRoot": "/absolute/path",
      "context": {
        "testFile": "relative/path/to/file.test.ts",
        "testName": "describe > it text",
        "testNamePattern": "describe it text",
        "line": 42,
        "errorMessage": "captured failure output (may be truncated)",
        "duration": 1234,
        "projectPath": "code-example-tests/javascript/driver"
      }
    }
Treat the context fields as pre-filled answers: skip Step 1 (scope/language) and Step 2 (suite lookup). projectPath is the driver directory; testFile and testNamePattern are the inputs for Step 3.
Guard against stale handoffs: verify context.testFile still exists on disk (the writer may have renamed it or switched branches between the CodeLens click and skill invocation). If it doesn't exist, report what was expected, fall through to Step 1, and let the writer disambiguate.
Echo one line confirming what was received, e.g.: Got handoff from extension: re-running "{testName}" in {projectPath}.
Proceed to Step 3 to re-run the single failing test (fresh output is more reliable than the captured errorMessage), then Step 4 to diagnose. If the re-run now passes, report that and stop.
If the handoff file is absent or skill doesn't match, proceed normally
from Step 1.
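The version and shape checks above can be sketched as a small validator. This is a minimal sketch, not the extension's or the skill's actual implementation: the name validateHandoff and the { ok, reason } return shape are hypothetical, and it assumes every context field shown in the test-failure schema is required. The caller is still expected to compare skill against grove-run and to delete the file after reading it.

```javascript
// Hypothetical helper: checks a raw handoff string against the schema above.
// Assumption: every context field in the "test-failure" sample is required.
const REQUIRED_TOP_LEVEL = ["version", "skill", "trigger", "context"];
const REQUIRED_CONTEXT = new Map([
  [
    "test-failure",
    ["testFile", "testName", "testNamePattern", "line", "errorMessage", "duration", "projectPath"],
  ],
]);

function validateHandoff(raw) {
  let payload;
  try {
    payload = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "not valid JSON" };
  }
  if (payload === null || typeof payload !== "object" || Array.isArray(payload)) {
    return { ok: false, reason: "payload is not a JSON object" };
  }
  // Shape check: top-level fields.
  for (const field of REQUIRED_TOP_LEVEL) {
    if (!(field in payload)) return { ok: false, reason: `missing ${field}` };
  }
  // Version check: this skill only understands version 1.
  if (payload.version !== 1) {
    return { ok: false, reason: `unsupported version ${payload.version}` };
  }
  // Shape check: context fields for the matching trigger.
  const required = REQUIRED_CONTEXT.get(payload.trigger);
  if (!required) return { ok: false, reason: `unknown trigger ${payload.trigger}` };
  const context = payload.context;
  if (context === null || typeof context !== "object") {
    return { ok: false, reason: "context is not an object" };
  }
  for (const field of required) {
    if (!(field in context)) return { ok: false, reason: `missing context.${field}` };
  }
  return { ok: true, payload };
}
```

The reason string is meant to slot into the "malformed: {brief reason}" message shown earlier.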
Step 1: Determine Scope and Language
Parse the user's request to identify:
- What to run: A specific test file, a topic directory, or all tests
- Language: Infer from the file path, or ask if ambiguous
If the user provides a path, use it directly. If they say "run the tests" with no qualifier, default to running all tests in the language directory they're currently working in; if no such directory is evident, ask which language or suite to run.
Step 2: Locate the Test Suite
Map the language to its driver directory under code-example-tests/:
| Suite | Directory | Test command |
|---|---|---|
| JavaScript | javascript/driver/ | npm test (all) or npm test -- -t '{name}' (single) |
| Python | python/pymongo/ | python -m pytest or python -m unittest |
| Go | go/driver/ | go test ./tests/... |
| Java | java/driver-sync/ | mvn test |
| C# | csharp/driver/ | dotnet test |
| Mongosh | command-line/mongosh/ | npm test (all) or npm test -- -t '{name}' (single) |
Read the language's CLAUDE.md for the exact commands and any env setup needed.
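As a rough illustration of Steps 1 and 2 together, the table can be treated as a lookup keyed by directory. The helper name locateSuite is hypothetical, and for Python the sketch arbitrarily picks python -m pytest (the table also lists python -m unittest); the language's CLAUDE.md remains the authority on the exact command.

```javascript
// Suite table from Step 2, keyed by directory under code-example-tests/.
// Assumption: pytest is chosen for Python; the table also allows unittest.
const SUITES = [
  { suite: "JavaScript", directory: "javascript/driver/", command: "npm test" },
  { suite: "Python", directory: "python/pymongo/", command: "python -m pytest" },
  { suite: "Go", directory: "go/driver/", command: "go test ./tests/..." },
  { suite: "Java", directory: "java/driver-sync/", command: "mvn test" },
  { suite: "C#", directory: "csharp/driver/", command: "dotnet test" },
  { suite: "Mongosh", directory: "command-line/mongosh/", command: "npm test" },
];

// Hypothetical helper: returns the suite whose directory appears in the path,
// or null when the path gives no hint and the user should be asked.
function locateSuite(filePath) {
  const normalized = filePath.replace(/\\/g, "/"); // tolerate Windows separators
  const match = SUITES.find((s) =>
    normalized.includes(`code-example-tests/${s.directory}`)
  );
  return match ?? null;
}
```

A null result corresponds to the "ask if ambiguous" branch in Step 1.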
Step 3: Run the Tests
This is the canonical source for test commands. Other grove skills reference these patterns — keep them up to date here.
For JavaScript and Mongosh (Jest-based suites):
All tests:
cd code-example-tests/{driver-dir} && npm test
Single test by name:
cd code-example-tests/{driver-dir} && npm test -- -t '{describe or it text}'
npm test handles .env loading and Jest flags (--runInBand,
--detectOpenHandles) via the scripts.test entry in package.json. Do not
call npx jest directly — use npm test -- to pass arguments through.
Capture the full output including any error messages, stack traces, and test result summary.
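The two command patterns above can be assembled programmatically. The sketch below only builds the command string; buildJestCommand is a hypothetical name, and the quoting assumes a POSIX shell. Jest treats the -t value as a regular expression pattern, so test names containing regex metacharacters may need extra escaping that this sketch does not attempt.

```javascript
// Hypothetical helper: builds the Step 3 shell command for a Jest-based suite.
// driverDir is relative to code-example-tests/, e.g. "javascript/driver".
function buildJestCommand(driverDir, testName) {
  const base = `cd code-example-tests/${driverDir} && npm test`;
  if (!testName) return base; // no name given: run the whole suite
  // Escape embedded single quotes for POSIX shells: ' becomes '\''
  const quoted = testName.replace(/'/g, `'\\''`);
  // Arguments after "--" are passed through npm to Jest.
  return `${base} -- -t '${quoted}'`;
}
```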
Step 4: Parse Results
If all tests pass, report the summary and stop.
If any tests fail, for each failure:
- Extract the failure details: test name, error message, stack trace, expected vs. actual values
- Classify the failure as a writer issue (actionable by the writer) or a tooling issue (should be escalated to the Grove team). Writer issues include environment setup, syntax errors, incorrect imports, bad output files, and test logic bugs. Tooling issues include comparison utility crashes, parser errors in output matching utilities, missing data type support in comparison libraries, and infrastructure/CI pipeline failures. If the writer's code and output appear correct but the comparison utility can't match them, that's a tooling issue — tell the writer to report it with enough detail to reproduce.
- For writer issues, classify into one of these categories:
Connection Error
- Symptoms: MongoServerSelectionError, ECONNREFUSED, timeout on connect
- Diagnosis: Check if .env exists with CONNECTION_STRING. Verify the connection string format. Suggest creating/updating the .env file.
Output Mismatch
- Symptoms: ComparisonResult failure, "expected X but received Y", diff output
- Diagnosis: Read both the expected output file and the actual output from the test. Show a clear diff. Suggest one of these fixes:
  - Chain withIgnoredFields(...) for dynamic field values (e.g., _id, timestamps)
  - Chain withUnorderedSort() if order differs but content is correct
  - Update the output file if the actual output is correct
  - Switch to shouldResemble().withSchema(...) for highly variable output
  See the "Comparison API" section in the target language's conventions file for the full Expect API method signatures and ellipsis pattern reference.
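To make the first two fixes concrete, here is a conceptual illustration of what "ignore dynamic fields" and "unordered comparison" mean. This is not the Grove Expect API and does not show how withIgnoredFields or withUnorderedSort are implemented; stripFields and sameIgnoringFieldsAndOrder are hypothetical names, and the sketch only handles arrays of flat documents.

```javascript
// Illustration only: NOT the Grove Expect API.
// Removes the listed keys from a flat document.
function stripFields(doc, ignored) {
  return Object.fromEntries(
    Object.entries(doc).filter(([key]) => !ignored.includes(key))
  );
}

// Compares two arrays of flat documents, ignoring the given fields,
// the order of documents, and the order of keys within each document.
function sameIgnoringFieldsAndOrder(expected, actual, ignored = []) {
  const canonical = (docs) =>
    docs
      .map((doc) => stripFields(doc, ignored))
      // Sort keys so property order does not affect the serialized form.
      .map((doc) =>
        JSON.stringify(
          Object.fromEntries(
            Object.entries(doc).sort(([a], [b]) => a.localeCompare(b))
          )
        )
      )
      .sort(); // sort documents so array order does not matter
  const e = canonical(expected);
  const a = canonical(actual);
  return e.length === a.length && e.every((item, i) => item === a[i]);
}
```

If two outputs match only after stripping a field such as _id, that field is a candidate for an ignored-fields modifier rather than an output-file update.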
Missing Sample Data
- Symptoms: Test skipped with "Missing: sample_mflix" or similar, or empty results from queries against sample databases
- Diagnosis: Explain which sample database is needed. Suggest wrapping in describeWithSampleData() or itWithSampleData() if not already.
Timeout
- Symptoms: Exceeded timeout of 120000 ms, test hangs
- Diagnosis: Check for missing client.close() in the example's finally block. Check for unclosed cursors or change streams. For JavaScript/Node.js, also check for missing await on async operations; a missing await can cause the test to hang without an obvious error. If the operation is expected to be slow (full-collection scan, multi-stage aggregation, or processing more than 10,000 documents), increase the specific test's timeout to 240000 ms via it('...', async () => {...}, 240000). Do not increase the global timeout in jest.config.cjs.
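The pattern the timeout diagnosis looks for can be sketched as follows. To keep the sketch runnable without a deployment it uses a stub in place of a real driver client; makeStubClient, findTitles, and runExample are hypothetical names, not part of any driver or of Grove.

```javascript
// Stub standing in for a real driver client so the sketch runs without a
// deployment. The method names on the stub are hypothetical.
function makeStubClient() {
  const state = { open: false, closed: false };
  return {
    state,
    async connect() { state.open = true; },
    async findTitles() { return ["Jaws", "Alien"]; },
    async close() { state.open = false; state.closed = true; },
  };
}

// Await every async call and close the client in a finally block, so the
// test runner is not left waiting on an open handle even when a call throws.
async function runExample(client) {
  try {
    await client.connect();
    return await client.findTitles();
  } finally {
    await client.close();
  }
}

// In a Jest test, the per-test timeout is the third argument to it():
// it('scans the full collection', async () => { /* ... */ }, 240000);
```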
Import / Module Error
- Symptoms: Cannot find module, SyntaxError: Cannot use import, does not provide an export named
- Diagnosis: Verify the import path matches the actual file location. Check that the example file exports the expected function name. Verify ES module syntax is used consistently.
Runtime Error
- Symptoms: TypeError, ReferenceError, MongoDB operation errors
- Diagnosis: Read the example code at the line referenced in the stack trace. Identify the root cause (wrong collection name, bad query syntax, missing await, etc.).
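A first-pass classifier over the symptoms listed in these categories might look like the sketch below. The patterns are taken from the symptom lists above, but the function name classifyFailure, the matching order, and the "Unclassified" fallback are assumptions. A match is only a starting point for diagnosis, and a failure where the writer's code and output look correct may still be a tooling issue.

```javascript
// Hypothetical classifier: maps raw test output to the writer-issue
// categories above using the listed symptoms. The first match wins.
const CATEGORIES = [
  { name: "Connection Error", pattern: /MongoServerSelectionError|ECONNREFUSED/ },
  { name: "Missing Sample Data", pattern: /Missing: sample_/ },
  { name: "Timeout", pattern: /Exceeded timeout of \d+ ms/ },
  {
    name: "Import / Module Error",
    pattern: /Cannot find module|SyntaxError: Cannot use import|does not provide an export named/,
  },
  { name: "Output Mismatch", pattern: /ComparisonResult|expected .* but received/i },
  { name: "Runtime Error", pattern: /TypeError|ReferenceError/ },
];

function classifyFailure(output) {
  const match = CATEGORIES.find(({ pattern }) => pattern.test(output));
  // Anything unmatched needs a manual read of the stack trace.
  return match ? match.name : "Unclassified";
}
```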
Step 5: Suggest or Apply Fixes
For each diagnosed failure:
Present the diagnosis clearly: what failed, why, and what to change
Propose a specific fix with the exact code change
Ask for approval before making changes, unless the fix is adding an Expect chain modifier for dynamic fields or sort order (e.g., withIgnoredFields, withUnorderedSort). These can be applied directly, then re-run to confirm. See the "Comparison API" section in the target language's conventions file.
Do not auto-update expected output files without asking. An output mismatch may indicate an idempotency issue (stale state between runs) or a logic bug, not just a stale output file. Investigate the cause first.
After applying fixes, re-run the failing test to confirm the fix works.
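The approval rule in this step reduces to a small predicate. The fix descriptor shape ({ kind, modifier }) and the name requiresApproval are hypothetical and exist only to illustrate the rule: only the two named chain modifiers skip the approval prompt, and everything else, including output-file updates, asks first.

```javascript
// Hypothetical fix descriptor:
//   { kind: "chain-modifier" | "update-output-file" | "code-change", modifier?: string }
// Only the chain modifiers named in Step 5 may be applied without asking.
const AUTO_APPLY_MODIFIERS = new Set(["withIgnoredFields", "withUnorderedSort"]);

function requiresApproval(fix) {
  const autoApplicable =
    fix.kind === "chain-modifier" && AUTO_APPLY_MODIFIERS.has(fix.modifier);
  return !autoApplicable;
}
```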
Step 6: Report
Provide a summary headed with Skill: grove-run:
- Total tests: X passed, Y failed, Z skipped
- For each failure: what was wrong and what was done to fix it
- Any remaining issues that need manual attention
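One possible way to assemble that summary is sketched below. The input shape and the name formatReport are hypothetical; only the "Skill: grove-run" heading and the totals line come from the list above.

```javascript
// Hypothetical formatter for the Step 6 summary.
// failures: [{ test, cause, fix }], remaining: [string]
function formatReport({ passed, failed, skipped, failures = [], remaining = [] }) {
  const lines = [
    "Skill: grove-run",
    `Total tests: ${passed} passed, ${failed} failed, ${skipped} skipped`,
  ];
  for (const { test, cause, fix } of failures) {
    lines.push(`- ${test}: ${cause}. Fix: ${fix}`);
  }
  for (const issue of remaining) {
    lines.push(`- Needs manual attention: ${issue}`);
  }
  return lines.join("\n");
}
```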
Edge Cases
- Multiple failure types in one run: Diagnose each failure independently. Don't assume all failures have the same root cause — a connection error and an output mismatch in the same run are unrelated issues.
- Test hangs with no output: Likely a missing client.close(), an unclosed change stream/cursor, or (in Node.js) a missing await. Check the example's finally/defer block first, then check for unawaited async calls.
- All tests skipped: Sample data is not loaded on the deployment. Explain this and suggest /grove-setup to check sample data availability.
- Test passes locally but fails in CI: Investigate discrepancies between the local and CI environments. Common causes include: missing sample data, timing/timeout differences, datetime or timezone handling (CI may use a different timezone), and OS-level differences (e.g., locale, line endings). Do not increase the global timeout; increase only the specific test's timeout if the operation is genuinely slow.
- Failure in teardown, not in the test itself: Investigate the cause, because teardown failures can be blocking. Common causes include permissions issues (the test user lacks dropDatabase privileges) and incomplete state cleanup between tests (e.g., test A's teardown doesn't clean up what test B expects). Teardown errors with ns not found or similar "doesn't exist" messages are generally safe to ignore, but other teardown failures (permissions, timeouts, connection errors) should be diagnosed and fixed.
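The teardown rule in the last edge case can be sketched as a small triage function. The name triageTeardownError is hypothetical, and the "does not exist" patterns are an assumption about what counts as a similar message; only ns not found is named in the text above.

```javascript
// Hypothetical triage for teardown errors. Only "ns not found" comes from
// the guidance above; the other patterns are assumed "doesn't exist" variants.
const IGNORABLE_TEARDOWN = /ns not found|does not exist|doesn't exist/i;

function triageTeardownError(message) {
  // Permissions, timeout, and connection errors fall through to "diagnose".
  return IGNORABLE_TEARDOWN.test(message) ? "ignore" : "diagnose";
}
```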