docs-writer - SKILL.md Agent Skill

name: docs-writer
description: Use when writing or editing pages for this Mintlify docs site, especially when the change involves code snippets that must be generated via the test → MDX pipeline. Also triggers when the user asks to cross-check docs against a specific source repo (e.g., "verify against the lancedb repo").

LanceDB Docs Snippet Pipeline

Code examples on docs pages are not written directly into MDX. They live inside real, runnable tests under tests/{py,ts,rs}/, get extracted by scripts/mdx_snippets_gen.py into auto-generated modules under docs/snippets/, and are imported into MDX pages as named constants. Follow the pipeline below — otherwise your changes will either render stale or be overwritten on the next regen.

Golden rules

Never hand-edit docs/snippets/*.mdx. Every file there is auto-generated. The header says "Auto-generated by scripts/mdx_snippets_gen.py. Do not edit manually." Believe it.
Edit the test source instead. Then regenerate.
Snippets must be inside passing tests. They're extracted from real pytest/vitest/cargo tests. If the test doesn't run, the example is wrong.
Cross-check every non-trivial API claim against the source repo. If the user names a repo (e.g., "check in the sophon repo", "verify against lancedb"), that repo is the source of truth — grep it, cite the file + line, and let the code override prior assumptions. See Cross-checking docs against source repos for resolution rules.

Make it sound like a human

LLM writing often feels formulaic, leaning on the same few phrasings again and again. The goal is to make the docs feel approachable and human, not like a dry manual written by a robot. Avoid repeating the same sentence structures, vary your word choice, and inject a bit of personality where appropriate. The content should be clear and accurate, but also engaging to read.

When closing audit findings, keep the fix as small as the reader's need allows. Fold related gaps into existing paragraphs, notes, or lists; avoid adding a new section for every missing edge case. Favor usage-critical guidance, prerequisites, and common failure modes over exhaustive parameter inventories. The docs should help users move confidently, not force them through implementation minutiae.

Avoid the following extremely common patterns:

"It's not this, it's that."
"Paying the ___ tax" (e.g., "paying the import tax", "paying the setup tax") − the words "pay" and "tax" are heavily overused by AI
"LanceDB changes that" − the phrase "changes that" is a common AI crutch
"That matters" - state the consequence directly rather than using this overused phrase

Pipeline at a glance

tests/py/test_indexing.py   ──┐
tests/ts/indexing.test.ts   ──┼──►  make snippets  ──►  docs/snippets/indexing.mdx  ──►  import in docs/*/page.mdx
tests/rs/indexing.rs        ──┘

File-name mapping

The output module is the test file stem with test_ stripped (Python), .test stripped (TS), or bare (Rust):

Source file	Generated module
`tests/py/test_indexing.py`	`docs/snippets/indexing.mdx`
`tests/ts/connection.test.ts`	`docs/snippets/connection.mdx`
`tests/rs/basic_usage.rs`	`docs/snippets/basic_usage.mdx`

(Authoritative logic: normalize_target_name in scripts/mdx_snippets_gen.py.)
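
As a rough illustration of the mapping rule, here is a minimal Python sketch. It is not the body of normalize_target_name; the function name snippet_module_for is hypothetical and the logic only mirrors the table above.

```python
from pathlib import Path


def snippet_module_for(test_path: str) -> str:
    """Map a test file path to its generated snippet module (illustrative only)."""
    path = Path(test_path)
    stem = path.stem  # file name without the final extension
    if path.suffix == ".py":
        stem = stem.removeprefix("test_")  # test_indexing -> indexing
    elif path.suffix == ".ts":
        stem = stem.removesuffix(".test")  # connection.test -> connection
    # Rust stems are used as-is.
    return f"docs/snippets/{stem}.mdx"
```

If the sketch and the generator ever disagree, the generator is right.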

Marker syntax

Delimit each snippet with language-appropriate comment markers:

Language	Start	End
Python	`# --8<-- [start:snippet_name]`	`# --8<-- [end:snippet_name]`
TypeScript / Rust	`// --8<-- [start:snippet_name]`	`// --8<-- [end:snippet_name]`

snippet_name is snake_case and must be unique across the whole repo (the generator errors on duplicates).
Keep the body of the snippet minimal — the reader sees exactly what's between the markers. Setup (fixtures, data creation, assertions) goes outside the markers.
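
To make the marker rules concrete, the sketch below shows one way a marked region could be pulled out of a test file. It is an assumption-laden illustration, not the real generator: the regex, the extract_snippets helper, and the vector_search_basic snippet name are all hypothetical, and the duplicate check here covers a single string rather than the whole repo.

```python
import re
import textwrap

# Matches "# --8<-- [start:name]" or "// --8<-- [start:name]" (and the end form).
MARKER = re.compile(r"^\s*(?:#|//)\s*--8<--\s*\[(start|end):([a-z0-9_]+)\]\s*$")


def extract_snippets(source: str) -> dict[str, str]:
    """Return {snippet_name: dedented body} for every marked region in source."""
    snippets: dict[str, str] = {}
    current = None
    body: list[str] = []
    for line in source.splitlines():
        match = MARKER.match(line)
        if match is None:
            if current is not None:
                body.append(line)  # inside a region: keep the line
            continue
        kind, name = match.groups()
        if kind == "start":
            if name in snippets:
                raise ValueError(f"duplicate snippet name: {name}")
            current, body = name, []
        elif name == current:
            snippets[name] = textwrap.dedent("\n".join(body)).strip("\n")
            current = None
    return snippets


# Hypothetical test source: setup and assertions sit outside the markers.
example = '''
def test_search(table):
    query = [0.1, 0.2]
    # --8<-- [start:vector_search_basic]
    results = table.search(query).limit(5).to_list()
    # --8<-- [end:vector_search_basic]
    assert len(results) <= 5
'''

snippets = extract_snippets(example)
```

Only the single line between the markers ends up in snippets, which is what the reader would see on the docs page.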

Export name formula

The generator derives the MDX export name as {Prefix}{TitleCase(snippet_name)} where the prefix is Py, Ts, or Rs.

vector_index_nprobes (Python) → PyVectorIndexNprobes
connection_setup (TypeScript) → TsConnectionSetup
basic_usage (Rust) → RsBasicUsage
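
The formula can be sketched as follows. This is a hypothetical helper for sanity-checking names before you import them, assuming snippet names are plain lowercase snake_case; how the real generator treats digits or acronyms is not covered here.

```python
PREFIXES = {"python": "Py", "typescript": "Ts", "rust": "Rs"}


def export_name(snippet_name: str, language: str) -> str:
    """Build the MDX export name: language prefix + TitleCase of the snake_case name."""
    title = "".join(part.capitalize() for part in snippet_name.split("_"))
    return PREFIXES[language] + title
```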

Regenerating

Preferred:

make snippets      # all three languages
make py            # Python only
make ts            # TypeScript only
make rs            # Rust only

make invokes uv run scripts/mdx_snippets_gen.py -s tests/{lang}. If uv run fails in your sandbox (see top-level CLAUDE.md), fall back to activating the local venv and running the script directly:

source .venv/bin/activate
python scripts/mdx_snippets_gen.py -s tests/py   # and -s tests/ts, -s tests/rs

Always run the test suite for the language you edited before regenerating:

source .venv/bin/activate
python -m pytest tests/py/test_indexing.py

(TS/Rust suites have their own runners — check tests/ts/package.json and tests/rs/Cargo.toml.)

Pre-PR checklist

Before opening (or updating) a PR, always run make snippets from the repo root so the regenerated docs/snippets/*.mdx files land in the same commit as the test changes that produced them. Otherwise the PR will ship stale snippets and CI (or a later regen run) will flag a diff.

cd /Users/prrao/code/docs   # or wherever the repo is checked out
make snippets
git status                  # confirm any regenerated docs/snippets/*.mdx are staged

If git status shows modified files under docs/snippets/ after make snippets, stage and commit them alongside your test and MDX changes — do not push a PR with an un-regenerated snippets tree.
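
If you want to automate that last check, a small script along these lines could work. It is a sketch under assumptions: the helper names are hypothetical, it parses the short "XY path" lines of git status --porcelain, and it ignores quoted paths and renames.

```python
import subprocess


def unstaged_snippets(porcelain: str) -> list[str]:
    """Return docs/snippets/ paths that have unstaged or untracked changes."""
    paths = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue  # too short to be an "XY path" entry
        # X (column 1) is the index status, Y (column 2) the working-tree status.
        status, path = line[:2], line[3:]
        if path.startswith("docs/snippets/") and status[1] != " ":
            paths.append(path)
    return paths


def check_repo() -> list[str]:
    """Run git in the current directory and report snippet files left unstaged."""
    out = subprocess.run(
        ["git", "status", "--porcelain", "--", "docs/snippets"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return unstaged_snippets(out)
```

An empty list from check_repo() after make snippets means every regenerated file is either unchanged or already staged.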

Importing a snippet into a docs page

Use the absolute /snippets/... path (not relative) and alias the long Py|Ts|Rs export names:

import {
    PyVectorIndexNprobes as VectorIndexNprobes,
    TsVectorIndexNprobes,
    RsVectorIndexNprobes,
} from '/snippets/indexing.mdx';

Then render with a CodeGroup:

<CodeGroup>
    <CodeBlock filename="Python" language="python" icon="python">
    {VectorIndexNprobes}
    </CodeBlock>
    <CodeBlock filename="TypeScript" language="typescript" icon="square-js">
    {TsVectorIndexNprobes}
    </CodeBlock>
    <CodeBlock filename="Rust" language="rust" icon="rust">
    {RsVectorIndexNprobes}
    </CodeBlock>
</CodeGroup>

Icon conventions: python, square-js (TS), rust.

Language parity

All three test trees are actively populated. If the API you're documenting exists in every language binding, add snippets in all three. Python-only is acceptable when the feature is Python-specific (e.g., a Pydantic/PyArrow integration), but call it out explicitly in the prose rather than silently omitting other languages.

Cross-checking docs against source repos

Docs must reflect what the code actually does. Before writing or editing any claim about an API — parameter names, defaults, return types, method existence, behavior under edge cases — verify it against the source repo.

If the user names a repo, use that repo as the source of truth. Phrasings like "check in the sophon repo that the code shows this", "verify against lancedb", or "cross-reference the geneva repo" are explicit instructions to ground the docs in that codebase. Treat them as required, not optional.

Resolution order for a repo name the user mentions:

Sibling checkout of /Users/prrao/code/docs: e.g., ../lancedb, ../sophon, ../geneva. Check with ls ../ first.
Anywhere under /Users/prrao/code/: try ls /Users/prrao/code/ | grep -i <name>.
If neither exists locally, ask the user for the path (or a clone URL) before proceeding — do not guess or substitute a different repo.
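
The resolution order above can be sketched in Python. The resolve_repo function is hypothetical and assumes the docs checkout sits directly under the shared code directory, so the sibling check and the grep-style fallback both scan the same parent.

```python
from pathlib import Path
from typing import Optional


def resolve_repo(name: str, docs_root: Path) -> Optional[Path]:
    """Find a local checkout for `name`, or return None so the caller asks the user."""
    code_root = docs_root.parent
    # 1. Exact sibling checkout, e.g. ../lancedb
    sibling = code_root / name
    if sibling.is_dir():
        return sibling
    # 2. Case-insensitive substring match, like `ls <code_root> | grep -i <name>`
    for candidate in sorted(code_root.iterdir()):
        if candidate == docs_root or not candidate.is_dir():
            continue
        if name.lower() in candidate.name.lower():
            return candidate
    # 3. Nothing found: do not guess or substitute another repo.
    return None
```

A None result corresponds to the third step: stop and ask the user for a path or clone URL.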

Once located, Grep for the symbol, parameter, or behavior and cite the file + line in your response so the user can audit the check. If the repo and the docs disagree, the repo wins — update the docs (or flag the discrepancy to the user if it looks like a real bug).

Known repos and typical surfaces to grep (extend as you learn new ones):

Repo	Common source paths
`lancedb`	`python/python/lancedb/` (Python), `nodejs/` (TS), `rust/` (Rust core)
`sophon`	ask user on first encounter; record the path here once confirmed
`geneva`	ask user on first encounter; record the path here once confirmed

General rule: even when the user does not name a repo, if you're documenting a non-trivial API surface and a plausible source repo is available locally, cross-check proactively rather than relying on memory. This catches hallucinated parameters before they ship.

Mintlify components

Prefer these over ad-hoc emphasis:

<Note> — general notable info
<Tip> — actionable suggestion
<Info> — background / context
<Warning> — pitfalls, perf caveats
<Badge> — e.g., <Badge color="red">Enterprise-only</Badge>
<Card> — linked call-out to an external notebook or page
<CodeGroup> / <CodeBlock> — multi-language code (always for runnable examples)

Common mistakes this skill prevents

Editing docs/snippets/*.mdx directly and losing the change on the next make snippets.
Inventing export names that don't match the {Py|Ts|Rs}{TitleCase} formula.
Using a relative import path like '../snippets/indexing.mdx' (must be '/snippets/indexing.mdx').
Writing Python-only examples for a feature that exists in all three SDKs.
Documenting parameters that don't exist in the source repo (e.g., ../lancedb, ../sophon, ../geneva).
Ignoring a user's explicit instruction to cross-check against a named repo — that instruction is mandatory, not advisory.
Running python scripts/mdx_snippets_gen.py without the -s tests/<lang> flag (the script needs a source dir).