name: tzlint-processors
description: How to add an input-format processor (e.g. CSV, Re:VIEW) to TsuzuLint. Read this before adding support for a new file type or changing the processor seam.
Adding an input-format processor
TsuzuLint lints natural-language text in any format via a processor seam. A Processor
turns source into either lintable regions (the easy path) or a full AST (the escape
hatch for structure-aware rules). The core handles parsing, span-rebasing, per-region rule
resolution, caching, and fix. See docs/design/input-format-processors.md for the full contract.
Decide: Regions or Ast?
- Return Parsed::Regions (preferred) when you only need to lint the format's prose. You return the byte ranges of the natural-language text plus a RegionTag and a ParseMode. The core does the rest. This is the lowest-friction path: no AST construction, no span math.
- Return Parsed::Ast only when the format's own structure must be visible to rules (headings, lists, code blocks of the source format). You build the frozen Ast yourself (text = whole source, absolute spans). Markdown uses this path.
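As a rough sketch of the Regions path, the block below extracts prose from a made-up format in which every line is `id<TAB>prose`. The type definitions are hypothetical stand-ins that mirror the names used in this guide (Parsed, Region, RegionTag, ParseMode); the real definitions in tzlint_core may have different fields and variants. The ParseMode::PlainText variant, the "tabline" kind, and the parse_tab_lines function are assumptions made for illustration.

```rust
use std::ops::Range;

// Hypothetical stand-ins for the real tzlint_core types.
#[derive(Debug, Clone, PartialEq)]
pub enum ParseMode {
    PlainText, // assumed variant name
}

#[derive(Debug, Clone, PartialEq)]
pub struct RegionTag {
    pub kind: Option<String>,
    pub index: usize,
    pub name: Option<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Region {
    pub slices: Vec<Range<usize>>, // absolute byte ranges into the source
    pub tag: RegionTag,
    pub parse_mode: ParseMode,
}

#[derive(Debug, PartialEq)]
pub enum Parsed {
    Regions(Vec<Region>),
    // The Ast variant is omitted: this sketch covers only the easy path.
}

/// Toy format: each line is `id<TAB>prose`. Only the text after the first
/// tab is lintable, so each non-empty prose cell becomes one region.
pub fn parse_tab_lines(source: &str) -> Parsed {
    let mut regions = Vec::new();
    let mut line_start = 0usize;
    for line in source.split_inclusive('\n') {
        // Exclude the line terminator from the lintable range.
        let body = line.trim_end_matches(|c: char| c == '\n' || c == '\r');
        if let Some(tab) = body.find('\t') {
            let start = line_start + tab + 1;
            let end = line_start + body.len();
            if start < end {
                let index = regions.len();
                regions.push(Region {
                    slices: vec![start..end],
                    tag: RegionTag {
                        kind: Some("tabline".to_string()),
                        index,
                        name: None,
                    },
                    parse_mode: ParseMode::PlainText,
                });
            }
        }
        line_start += line.len();
    }
    Parsed::Regions(regions)
}
```

The function never builds an AST and never computes spans inside the prose. It only reports where the prose sits in the original source, which is the whole point of the Regions path.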
Steps
- Create the module crates/tzlint_core/src/processor/<format>.rs. Implement Processor: extensions() (dot-less, lowercase) and parse(source, cfg) -> Result<Parsed, ParseError>.
  - For prose extraction, build Region { slices, tag, parse_mode }. Each slice must be a contiguous byte range of source; the core parses it independently and rebases spans.
  - Use RegionTag { kind: Some("<your-kind>"), index, name } so config can target rules at regions. Never invent a CSV-specific concept in shared code; kind is your namespace.
  - Never unwrap/expect/panic!/unreachable! (clippy-denied). Slice with source.get(a..b).unwrap_or("").
- Register it in Registry::with_builtins (processor/mod.rs), the single wiring point. Add a guard-test extension entry (Task 4.3) so the registry list and the test stay in sync.
- Config (if needed): if your format needs options (delimiter, columns, …), add a resolved shape under crates/tzlint_core/src/config/ and a RawFormat-style serde model in config/model.rs, then build a ProcessorConfig for it in crates/tzlint_cli/src/rules.rs (processor_config_for). Per-region rules flow through region_rules_for automatically when your regions carry a kind/name/index the config can match.
- Tests (TDD): a unit test for your extraction (assert absolute spans of the regions), and an app.rs integration test that lints a fixture through the CLI.
- Docs: add a short section to docs/processors.md describing the format's config and any caveats.
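The first two steps can be sketched roughly as follows. Everything here is a simplified, hypothetical stand-in: the real Processor trait, ProcessorConfig, ParseError, and Registry in tzlint_core may have different signatures, and RegionTag is flattened into two fields to keep the example short. The "note" format, NoteProcessor, and the Registry::for_extension and Registry::extensions helpers are invented for illustration.

```rust
use std::ops::Range;

// Hypothetical, simplified stand-ins for the real tzlint_core items.
#[derive(Debug, PartialEq)]
pub struct ParseError(pub String);

pub struct ProcessorConfig; // format options (delimiter, columns) would live here

#[derive(Debug, PartialEq)]
pub struct Region {
    pub slices: Vec<Range<usize>>, // absolute byte ranges into the source
    pub kind: Option<String>,      // stands in for RegionTag.kind
    pub index: usize,              // stands in for RegionTag.index
}

#[derive(Debug, PartialEq)]
pub enum Parsed {
    Regions(Vec<Region>),
}

pub trait Processor {
    /// Dot-less, lowercase extensions: "note", not ".NOTE".
    fn extensions(&self) -> &'static [&'static str];
    fn parse(&self, source: &str, cfg: &ProcessorConfig) -> Result<Parsed, ParseError>;
}

/// Toy processor: the whole file, minus surrounding whitespace, is one region.
pub struct NoteProcessor;

impl Processor for NoteProcessor {
    fn extensions(&self) -> &'static [&'static str] {
        &["note"]
    }

    fn parse(&self, source: &str, _cfg: &ProcessorConfig) -> Result<Parsed, ParseError> {
        let start = source.len() - source.trim_start().len();
        let end = source.trim_end().len();
        let mut regions = Vec::new();
        // Panic-free slicing: an invalid range yields "" instead of a panic.
        if !source.get(start..end).unwrap_or("").is_empty() {
            regions.push(Region {
                slices: vec![start..end],
                kind: Some("note".to_string()),
                index: 0,
            });
        }
        Ok(Parsed::Regions(regions))
    }
}

pub struct Registry {
    processors: Vec<Box<dyn Processor>>,
}

impl Registry {
    /// The single wiring point: every built-in processor is listed here.
    pub fn with_builtins() -> Self {
        Registry { processors: vec![Box::new(NoteProcessor)] }
    }

    /// Hypothetical lookup helper; normalizes a leading dot and letter case.
    pub fn for_extension(&self, ext: &str) -> Option<&dyn Processor> {
        let ext = ext.trim_start_matches('.').to_ascii_lowercase();
        for p in &self.processors {
            if p.extensions().iter().any(|e| *e == ext.as_str()) {
                return Some(&**p);
            }
        }
        None
    }

    /// Flat list of registered extensions, useful for a guard test.
    pub fn extensions(&self) -> Vec<&'static str> {
        self.processors
            .iter()
            .flat_map(|p| p.extensions().iter().copied())
            .collect()
    }
}
```

A guard test in this style compares Registry::extensions() against a hard-coded list, so adding a processor without updating the test (or the reverse) fails loudly.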
Invariants you must keep
- Spans are absolute byte offsets into the original source; Ast.text (for the Ast path) is the whole source.
- The frozen AstCoreV1 is not extended; region identity lives in RegionTag, outside the AST.
- All file I/O stays in the CLI/Host; a processor only ever sees an in-memory &str.
- One dispatch path: everything goes through lint_document; don't add a parallel lint route.
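To make the absolute-span invariant concrete, here is a minimal sketch of what rebasing a slice-relative span amounts to. This is not the core's actual implementation: the rebase and text_at functions are hypothetical names, and the sketch assumes each slice is parsed on its own so that a local span only needs the slice's start offset added to it.

```rust
use std::ops::Range;

/// Rebase a span measured from the start of one slice into absolute byte
/// offsets in the original source. Returns None when the local span is
/// inverted, overflows, or does not fit inside the slice.
pub fn rebase(slice: &Range<usize>, local: &Range<usize>) -> Option<Range<usize>> {
    let start = slice.start.checked_add(local.start)?;
    let end = slice.start.checked_add(local.end)?;
    if local.start <= local.end && end <= slice.end {
        Some(start..end)
    } else {
        None
    }
}

/// Panic-free lookup of the text behind an absolute span. An out-of-range
/// span or one that splits a UTF-8 character yields "".
pub fn text_at<'a>(source: &'a str, span: &Range<usize>) -> &'a str {
    source.get(span.clone()).unwrap_or("")
}
```

Because offsets are bytes rather than characters, multi-byte text is worth covering in unit tests: in the source "id\tこんにちは 世界\n", the prose slice is 3..25 and the word 世界 sits at local span 16..22, which rebases to 19..25.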