name: testing-d-research-scripts
description: Verify the d-research-skill helper scripts (Node + Python). Use after editing any file under scripts/, after regenerating a script, or before publishing a new version of the skill.
Testing D Research Skill Scripts
When to use this sub-skill
Run these checks whenever you have:
- edited a file under `scripts/`
- regenerated or replaced a helper script
- bumped a dependency in `package.json`
- prepared a release of `d-research-skill`
The same checks run automatically in CI on every pull request (see "CI" below), so local runs are mainly to fail fast before pushing.
Prerequisites
- Node.js 18+ (for `*.mjs` scripts and `npm run`)
- Python 3.9+ (for `*.py` scripts; stdlib only, no `pip install` needed)
- `pandoc >= 2.11` (only for the `citation_render.py` self-test; the script degrades cleanly when pandoc is missing)
- Optional: `npx playwright install`, only needed for real-world browser runs, not for any self-test
No external API keys are required. Every self-test runs offline.
Quick validation: one command
From the repo root:
npm run self-test
This is the same chain CI runs. It executes every script's offline self-test in sequence and exits non-zero on the first failure. Pass criteria: exit code 0 and the final command (check_internal_refs.py) prints OK: all backticked internal refs resolve.
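The fail-fast behaviour described above can be sketched in a few lines of Python. This is only an illustration of "run in sequence, stop at the first non-zero exit"; the real chain is whatever the `self-test` entry in `package.json` defines, and the function name `run_chain` is hypothetical.

```python
import subprocess
import sys


def run_chain(commands):
    """Run commands in order; stop at the first failure.

    Returns (exit_code, failed_command), where failed_command is None
    when every command exited with code 0.
    """
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode, cmd
    return 0, None


# Illustrative subset only; the authoritative list lives in package.json.
EXAMPLE_CHAIN = [
    [sys.executable, "scripts/evidence_ledger.py", "self-test"],
    [sys.executable, "scripts/check_internal_refs.py"],
]
```

Calling `run_chain(EXAMPLE_CHAIN)` from the repo root would return `(0, None)` only if both commands succeed; later commands never start once an earlier one fails.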
Quick validation: individual scripts
To isolate a failure, run scripts one at a time. The repo ships 35 files in total: 33 research helpers, each with an offline self-test (Python research utilities, 6 top-level Node scripts, and 1 Node helper at `scripts/lib/http_cache.mjs`), plus 2 pre-commit utility scripts (`check_node_syntax.py`, `check_no_plan_files.py`) that run as checks rather than self-tests. Python scripts take a `self-test` subcommand; Node scripts take a `--self-test` flag. `run_python.mjs` is a thin wrapper for launching a `.py` script through `node` (`node scripts/run_python.mjs scripts/<name>.py …`).
Node scripts (6 top-level + 1 helper)
node scripts/playwright_probe.mjs --self-test # → "playwright_probe self-test ok"
node scripts/playwright_extract.mjs --self-test # → "playwright_extract self-test ok"
node scripts/playwright_crawl.mjs --self-test # → "playwright_crawl self-test ok"
node scripts/api_fetch.mjs --self-test # → 4× "✓ PASS" (parseArgs, Link header, cursor, offset)
node scripts/lib/http_cache.mjs --self-test # → "http_cache.mjs self-test ok"
node scripts/web_search.mjs --self-test # → "web_search self-test ok"
Python scripts (24)
python3 scripts/evidence_ledger.py self-test # → "evidence_ledger self-test ok" (incl. tamper detection)
python3 scripts/data_clean.py self-test # → "ALL TESTS PASSED" (5 subtests: clean/stats/dedup/validate/merge)
python3 scripts/citation_export.py self-test # → "All self-tests passed!" (6 subtests)
python3 scripts/citation_render.py self-test # → "All self-tests passed!" (incl. pandoc integration)
python3 scripts/extract_tables.py self-test # → "All self-tests passed!" (5 subtests)
python3 scripts/score_source.py self-test # → "All self-tests passed!" (4 subtests)
python3 scripts/research_plan.py self-test # → "OK: research_plan self-test passed (NN sub-tests)."
python3 scripts/run_dogfood.py self-test # → "OK: eval benches valid; dogfood-bench.json: 12 tasks, frontier-bench.json: 52 tasks."
python3 scripts/pdf_extract.py self-test # → "pdf_extract self-test ok"
python3 scripts/wayback.py self-test # → "wayback self-test ok"
python3 scripts/wikidata.py self-test # → "wikidata self-test ok"
python3 scripts/social_snapshot.py self-test # → "social_snapshot self-test ok"
python3 scripts/citation_resolver.py self-test # → "citation_resolver self-test ok"
python3 scripts/report_render.py self-test # → "report_render self-test ok"
python3 scripts/ocr.py self-test # → "ocr self-test ok"
python3 scripts/translate.py self-test # → "translate self-test ok"
python3 scripts/embed_corpus.py self-test # → "embed_corpus self-test ok"
python3 scripts/citation_graph.py self-test # → "citation_graph self-test ok"
python3 scripts/multi_extract.py self-test # → "multi_extract self-test ok"
python3 scripts/dedup_near.py self-test # → "dedup_near self-test ok"
python3 scripts/http_cache.py self-test # → "http_cache self-test ok"
python3 scripts/bench_harness_check.py self-test # → "bench_harness_check self-test ok"
python3 scripts/run_metadata.py self-test # → "run_metadata self-test ok"
python3 scripts/harvest_terms.py self-test # → "harvest_terms self-test ok"
python3 scripts/check_internal_refs.py # → "OK: all backticked internal refs resolve."
python3 scripts/check_internal_refs.py --decision-tree # → "OK: every references/*.md is reachable from the decision tree."
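For readers unfamiliar with what "backticked internal refs resolve" means, here is a rough sketch of that kind of check. It is not the code of `check_internal_refs.py`; the rule for what counts as a path (contains a slash, no spaces, no URL scheme) and the names `looks_like_path` and `missing_refs` are assumptions made for illustration.

```python
import os
import re

BACKTICKED = re.compile(r"`([^`\n]+)`")


def looks_like_path(token):
    # Assumed heuristic: in-repo paths contain a slash, no spaces, no scheme.
    return "/" in token and " " not in token and "://" not in token


def missing_refs(markdown, root=".", exists=os.path.exists):
    """Return backticked path-like tokens that do not exist under root."""
    missing = []
    for token in BACKTICKED.findall(markdown):
        if looks_like_path(token) and not exists(os.path.join(root, token)):
            missing.append(token)
    return missing
```

The `exists` parameter is injectable so the logic can be exercised without touching the filesystem.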
Pre-commit utility scripts (checks, not self-tests)
Two stdlib helpers exist solely to drive cross-platform pre-commit hooks. They have no self-test subcommand because they are checks themselves.
python3 scripts/check_node_syntax.py # → runs `node --check` on every .mjs; exit 0 on success
python3 scripts/check_no_plan_files.py README.md package.json # → exit 0 (no PLAN file in the list)
python3 scripts/check_no_plan_files.py PLAN-foo.md # → exit 1, prints "blocked: PLAN-foo.md"
On Windows, use python if python3 is not on PATH.
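The exit-code contract shown for `check_no_plan_files.py` (exit 0 when the list is clean, exit 1 plus a `blocked:` line otherwise) could be implemented along these lines. This is a hypothetical sketch, not the shipped script: the matching rule (basename starts with `PLAN`) is an assumption inferred from the single `PLAN-foo.md` example.

```python
import os


def blocked_files(paths):
    """Return paths whose basename starts with "PLAN" (assumed rule)."""
    return [p for p in paths if os.path.basename(p).startswith("PLAN")]


def check(paths):
    """Print each blocked path and return a process exit code (0 or 1)."""
    blocked = blocked_files(paths)
    for p in blocked:
        print(f"blocked: {p}")
    return 1 if blocked else 0
```

A pre-commit hook would pass the staged file names as `paths` and exit with the returned code.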
Pass criteria (universal)
- Exit code `0` for every command
- Output contains a positive marker: `ok`, `PASS`, `ALL TESTS PASSED`, `All self-tests passed!`, or `OK: …`
- No Python tracebacks, no `FAIL`, no unhandled-promise warnings from Node
The evidence_ledger.py self-test intentionally exercises the tamper-detection path; a TAMPER DETECTED line in the middle of its output is expected and is followed by the success marker.
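The universal criteria can be expressed as a small predicate over an exit code and captured output. This is a sketch under stated assumptions: the function name `passes` is hypothetical, and the substrings used to detect tracebacks and unhandled promise rejections are heuristics, not something the repo defines.

```python
import re

# Positive markers listed in the pass criteria.
POSITIVE = re.compile(r"\bok\b|PASS|ALL TESTS PASSED|All self-tests passed!|OK:")


def passes(exit_code, output):
    """Return True if a command's result meets the universal pass criteria."""
    if exit_code != 0:
        return False
    if "Traceback (most recent call last)" in output:
        return False
    if re.search(r"\bFAIL", output):
        return False
    # Heuristic: Node's unhandled-rejection messages contain this substring.
    if "UnhandledPromiseRejection" in output:
        return False
    return bool(POSITIVE.search(output))
```

Note that a `TAMPER DETECTED` line does not trip any of these checks, which matches the expected `evidence_ledger.py` output.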
Real-world smoke tests (optional)
These hit live public APIs. Use them to verify network paths after a script change, not as gates for CI.
api_fetch.mjs — OpenAlex
node scripts/api_fetch.mjs \
--url "https://api.openalex.org/works?search=machine+learning&per_page=5" \
--max-pages 1 \
--out openalex.json
Expected: openalex.json is a JSON array; each item has id, title, and (usually) doi.
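To check the expected shape without eyeballing the file, something like the following may help. It only encodes what is stated above (a JSON array whose items carry `id` and `title`, with `doi` optional); the function names are hypothetical.

```python
import json


def check_openalex_items(items):
    """Return a list of problems in the parsed output; empty means OK."""
    if not isinstance(items, list):
        return ["top-level value is not a JSON array"]
    problems = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not an object")
            continue
        for key in ("id", "title"):
            if key not in item:
                problems.append(f"item {i}: missing {key!r}")
        # "doi" is only usually present, so its absence is not reported.
    return problems


def check_openalex_file(path="openalex.json"):
    """Load the smoke-test output file and validate its shape."""
    with open(path, encoding="utf-8") as fh:
        return check_openalex_items(json.load(fh))
```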
data_clean.py — CSV dedup
python3 scripts/data_clean.py clean --file input.csv --out cleaned.csv
Expected: duplicates collapsed, whitespace normalized, ISO 8601 dates.
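A rough way to spot-check those three expectations on `cleaned.csv` is sketched below. It assumes "duplicates" means fully identical rows, "whitespace normalized" means no leading or trailing whitespace in a cell, and that the caller names the date columns; none of these are confirmed details of `data_clean.py`, and `check_cleaned` is a hypothetical helper.

```python
import csv
import io
from datetime import date


def check_cleaned(csv_text, date_columns=()):
    """Return a list of problems in cleaned CSV text; empty means OK."""
    problems = []
    seen = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    for n, row in enumerate(reader, start=1):  # n counts data rows
        key = tuple(str(v) for v in row.values())
        if key in seen:
            problems.append(f"row {n}: duplicate row")
        seen.add(key)
        for col, value in row.items():
            if isinstance(value, str) and value != value.strip():
                problems.append(f"row {n}: stray whitespace in {col!r}")
        for col in date_columns:
            try:
                date.fromisoformat(row.get(col))
            except (TypeError, ValueError):
                problems.append(f"row {n}: {col!r} is not YYYY-MM-DD")
    return problems
```

The date check only accepts calendar dates; if the script emits full ISO 8601 timestamps, `datetime.fromisoformat` would be the closer fit.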
citation_export.py — BibTeX export
python3 scripts/citation_export.py export \
--file evidence.csv --format bibtex --out refs.bib
Expected: refs.bib contains @misc{ (or @article{) entries with title = {…} and url = {…} fields.
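The expectation above can be checked mechanically with a small regex pass over the exported text. This is a minimal sketch, not a BibTeX parser: it assumes each entry starts at the beginning of a line and that fields are written as `name = {…}`, as in the description; `check_bibtex` is a hypothetical name.

```python
import re

ENTRY_START = re.compile(r"^@(misc|article)\{", re.MULTILINE)


def check_bibtex(text):
    """Return (entry_count, problems) for BibTeX text."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    problems = []
    for n, start in enumerate(starts):
        # An entry's body runs until the next entry starts (or end of text).
        end = starts[n + 1] if n + 1 < len(starts) else len(text)
        body = text[start:end]
        for field in ("title", "url"):
            if not re.search(rf"\b{field}\s*=\s*\{{", body):
                problems.append(f"entry {n + 1}: no {field} field")
    return len(starts), problems
```

For a real run, read `refs.bib` with `open("refs.bib", encoding="utf-8").read()` and pass the text in.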
For the full list of npm shortcuts (npm run probe, npm run plan:render, npm run citation:render, …) see the "npm scripts" section of README.md.
CI
Two GitHub Actions workflows replicate these checks on every pull request:
- `.github/workflows/lint-and-self-test.yml`
  - `ruff check scripts/`: Python lint
  - `node --check` on every `scripts/*.mjs`: JS syntax
  - `npm run self-test`: every offline self-test (with pandoc installed for `citation_render`)
- `.github/workflows/link-check.yml`
  - `scripts/check_internal_refs.py`: backticked in-repo path references
  - `lychee --offline` on all markdown: standard `[text](url)` link integrity
  - A weekly `lychee-external` job (non-blocking) validates external URLs
If any of the 33 research-helper self-tests, or any of the four supplementary checks (`check_internal_refs.py`, `check_internal_refs.py --decision-tree`, `check_node_syntax.py`, `check_no_plan_files.py`), fails locally, the same failure will block the PR. Fix it locally before pushing.
Common failure modes
| Symptom | Likely cause | Fix |
|---|---|---|
| `ImportError` or `ModuleNotFoundError` | Generated script missing a stdlib import | Add the import; re-run `python3 -c "import py_compile; py_compile.compile('scripts/<name>.py', doraise=True)"` |
| `ERR_UNKNOWN_FILE_EXTENSION ".py"` | Tried to invoke `.py` with `node` directly | Use `python3` (or `node scripts/run_python.mjs scripts/<name>.py …`) |
| Pandoc-related `FAIL` in `citation_render` | Pandoc not installed or `< 2.11` | Install pandoc; the self-test will skip the pandoc-dependent subtest if pandoc is genuinely missing |
| `playwright_*` self-test hangs | Real browser launch attempted | Self-tests must not require a browser; check the script wasn't edited to drop the offline branch |
| `check_internal_refs.py` reports a missing path | A markdown file backticks an in-repo path (e.g. a reference, adapter, template, or script) that no longer exists | Update the reference, restore the file, or remove the link |
| Eval bench schema error | New task missing required keys or frontier/refusal rule violation | Compare against an existing task; required keys are listed in docs/eval.md |
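The `py_compile` one-liner from the first table row can be applied to every script at once. The sketch below uses only the standard library; `compile_errors` is a hypothetical helper name. Keep in mind that `py_compile` only verifies that a file compiles, so a missing import is confirmed fixed only by re-running the script's self-test.

```python
import glob
import py_compile


def compile_errors(pattern="scripts/*.py"):
    """Byte-compile every file matching pattern; return {path: error message}."""
    errors = {}
    for path in sorted(glob.glob(pattern)):
        try:
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError as exc:
            errors[path] = exc.msg
    return errors
```

An empty dictionary means every matched file compiled; run it from the repo root so the default pattern resolves.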
Adding a new script
When you add a new helper to scripts/:
- Implement an offline `self-test` (Python) or `--self-test` (Node) subcommand. CI runs offline, so any network dependency must degrade cleanly.
- Append the new self-test to the chained `self-test` script in `package.json` so it runs in CI.
- Add the script to `SKILL.md`'s "Optional bundled scripts" list and link it from a reference doc.
- Update the script-count notes in `README.md` if the total changes.
- Re-run `npm run self-test` locally before opening a PR.
See CONTRIBUTING.md for the full conventions (argparse, shebangs, error formatting, etc.).
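As a starting point for the first step, a Python helper with an offline `self-test` subcommand might look like the skeleton below. Everything named here (`my_helper.py`, `double`, `self_test`) is hypothetical, and the success line simply imitates the `<name> self-test ok` pattern used by several existing scripts; `CONTRIBUTING.md` remains the authority on conventions.

```python
import argparse


def double(x):
    """Stand-in for the helper's real logic (hypothetical)."""
    return 2 * x


def self_test():
    """Offline checks only: no network, no browser, no API keys."""
    assert double(2) == 4
    assert double(0) == 0
    print("my_helper self-test ok")
    return 0


def main(argv=None):
    """Parse arguments and dispatch; returns a process exit code."""
    parser = argparse.ArgumentParser(prog="my_helper.py")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("self-test", help="run the offline self-test")
    args = parser.parse_args(argv)
    if args.command == "self-test":
        return self_test()
    return 2
```

A real script would end with `if __name__ == "__main__": raise SystemExit(main())` so that `python3 scripts/my_helper.py self-test` exits with the returned code.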
See also
- `SKILL.md`: main entry point of d-research-skill (decision tree)
- `package.json`: full list of `npm run` shortcuts and the chained `self-test` definition
- `CONTRIBUTING.md`: conventions for adding references, adapters, examples, scripts, and templates
- `docs/eval.md`: eval-harness usage guide for `run_dogfood.py`
- `.github/workflows/`: the two CI workflows that mirror these checks