name: rye-tabular-intake
description: Extract CSV and XLSX source tables for Rye-tracked imports. Use when a user needs to inspect tabular source files, emit row-level NDJSON with lineage, map source fields into destination records, group many source rows into parent records with source_set lineage, stage extracted data, or build a domain-specific intake skill on top of generic tabular primitives.
Rye Tabular Intake
Use this skill when source data starts in CSV or XLSX files and needs to be:
- inspected before mapping
- extracted into row-level NDJSON with stable source lineage
- mapped into destination-table shaped records
- grouped into parent records when many source rows describe one destination record
- staged as Rye tracking records before final database load
Conversation First
When the destination mapping is not already specified, use the file inspection output to drive a short mapping conversation with the user before writing transforms.
Confirm:
- which source sheet or table matters
- which source columns map to which destination fields
- what conversions are required
- which fields are required, optional, or defaulted
- whether one source row should emit one record or multiple records
Prefer a declarative JSON mapping config when the requested mapping is mostly column selection and coercion. Use a TypeScript mapping module when the logic is conditional, one-to-many, or depends on prior mapped output.
If the user is creating a domain-specific intake skill on top of this one, keep this skill generic and put source-column aliases, validation rules, destination-table choices, and domain examples in the consuming skill.
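As a rough illustration of when a TypeScript mapping module earns its keep, the sketch below maps one source row to zero, one, or several destination records. The `SourceRow` and `MappedRecord` shapes, the `lineage` field, the `Name` and `Emails` columns, and the `mapCustomerRow` function are all hypothetical; the actual module API and NDJSON contracts are defined in references/cli-contract.md and may look different.

```typescript
// Hypothetical shapes for illustration only; the real contracts live in
// references/cli-contract.md and may differ.
type SourceRow = {
  lineage: { file: string; sheet: string; row: number };
  fields: Record<string, string>;
};

type MappedRecord = {
  destination_table: string;
  record: Record<string, unknown>;
  lineage: SourceRow["lineage"];
};

// One source row may yield zero, one, or several destination records.
function mapCustomerRow(row: SourceRow): MappedRecord[] | null {
  const name = (row.fields["Name"] ?? "").trim();
  if (name === "") return null; // filter: nothing to import from this row

  // Split a semicolon-separated cell into normalized email addresses.
  const emails = (row.fields["Emails"] ?? "")
    .split(";")
    .map((e) => e.trim().toLowerCase())
    .filter((e) => e !== "");

  // Conditional, one-to-many logic: one contact record per email address.
  return emails.map((email) => ({
    destination_table: "contacts",
    record: { full_name: name, email },
    lineage: row.lineage,
  }));
}
```

A mapping that only renames columns and coerces types would not need any of this and is probably better served by the declarative JSON config.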
Workflow
- Inspect the file first:
node skills/rye-tabular-intake/scripts/tabular_inspect.mts --input data/customers.xlsx
- Extract rows as NDJSON:
node skills/rye-tabular-intake/scripts/tabular_extract.mts --input data/customers.xlsx --sheet Customers
- Configure mappings with the user, then choose one of:
- declarative config:
node skills/rye-tabular-intake/scripts/tabular_extract.mts --input data/customers.xlsx --sheet Customers | node skills/rye-tabular-intake/scripts/tabular_map.mts --config mappings/customers_to_contacts.json
- TypeScript module:
node skills/rye-tabular-intake/scripts/tabular_extract.mts --input data/customers.xlsx --sheet Customers | node skills/rye-tabular-intake/scripts/tabular_map.mts --module mappings/customers_to_contacts.mts
- Stage extracted or mapped rows for Rye load tracking:
node skills/rye-tabular-intake/scripts/tabular_extract.mts --input data/customers.xlsx --sheet Customers | node skills/rye-tabular-intake/scripts/tabular_stage_rye.mts --node-type rye_tabular_intake_stage_row
- For many-to-one records, group extracted or mapped rows:
node skills/rye-tabular-intake/scripts/tabular_extract.mts --input data/interests.xlsx | node skills/rye-tabular-intake/scripts/tabular_group.mts --module mappings/interests_to_opportunities.mts
- For updates, appends, or agent-assisted merges, compare mapped records with a target-table snapshot:
node skills/rye-tabular-intake/scripts/tabular_change_plan.mts --input /tmp/mapped.ndjson --existing /tmp/existing-target.json --key contacts:external_id --mode merge_review > /tmp/change-plan.json
- Validate the import/change process before target writes when the consuming workflow needs an explicit gate:
node skills/rye-import-inspector/scripts/inspect_import_run.mjs --source /tmp/source.ndjson --mapped /tmp/mapped.ndjson --change-plan /tmp/change-plan.json --metadata /tmp/import-metadata.json --phase prewrite > /tmp/import-inspection.json
- Commit the intake trail into Rye:
node skills/rye-tabular-intake/scripts/tabular_commit_rye.mts --db-url "$DATABASE_URL" --input /tmp/source_rows.ndjson --run-id customer-import-2026-03-10
- if only SQL execution is available:
node skills/rye-tabular-intake/scripts/tabular_commit_rye.mts --emit-sql --input /tmp/source_rows.ndjson --run-id customer-import-2026-03-10 > /tmp/rye-intake.sql
When It Writes
The pipeline is read-only until the commit step.
Local NDJSON, snapshot, change-plan, and SQL files are intermediate execution artifacts. Rye is the durable traceability record once tabular_commit_rye.mts writes run nodes, events, assertions, and source-file artifacts.
- tabular_inspect.mts
  - reads CSV/XLSX and prints one JSON inspection document
- tabular_extract.mts
  - reads CSV/XLSX and emits source_row NDJSON
- tabular_map.mts
  - reads NDJSON and emits mapped_record NDJSON
- tabular_group.mts
  - reads NDJSON and emits grouped mapped_record NDJSON with multi-row source_set lineage
- tabular_change_plan.mts
  - reads mapped records plus an optional existing target-table snapshot and emits a read-only change-review plan
- tabular_stage_rye.mts
  - reads NDJSON and emits rye_stage_record NDJSON
- tabular_commit_rye.mts
  - reads NDJSON and writes Rye nodes, events, assertions, and artifacts into PostgreSQL
  - with --emit-sql, prints a SQL script instead of connecting to PostgreSQL
If the user wants to inspect, extract, map, or stage data without touching the database, stop before tabular_commit_rye.mts.
If the user has no DATABASE_URL but can execute SQL through a tool such as a SQL console or Supabase MCP, use tabular_commit_rye.mts --emit-sql, then execute the generated SQL in one call/session. The source files referenced by the NDJSON must still be readable locally when the SQL is generated so the tool can compute source hashes.
Runs And Duplicates
A run is created only when tabular_commit_rye.mts is called.
- run_id
  - the identity of the run
  - becomes the run node external_id
  - can be any stable label such as customers:extract:2026-03-10
- run_fingerprint_sha1
  - the duplicate-detection key
  - built from source file SHA1 values plus run-kind metadata
  - used only to decide whether a new run should be rejected as a duplicate
These are different things:
- two different run_id values can still be treated as duplicates if they produce the same run_fingerprint_sha1
- extract, map, and stage runs over the same file are allowed because they produce different fingerprints
- --allow-duplicate-source permits a new run even when the fingerprint already exists
The duplicate check is database-wide for the connected Rye instance. If a later machine writes to the same Rye database and has the same source file bytes, the second commit is rejected unless --allow-duplicate-source is used.
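To make the difference between run_id and run_fingerprint_sha1 concrete, here is a minimal sketch of a fingerprint that depends only on source file bytes and the run kind. The exact recipe used by tabular_commit_rye.mts is not documented here, so the `runFingerprint` function, its JSON layout, and the sorting of file hashes are assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// SHA1 of a string or byte array, as lowercase hex.
function sha1Hex(data: string | Uint8Array): string {
  return createHash("sha1").update(data).digest("hex");
}

// Assumption: the fingerprint is a SHA1 over the run kind plus the sorted
// source-file SHA1 values. The real tool's recipe may differ.
function runFingerprint(runKind: string, sourceFiles: Uint8Array[]): string {
  const fileHashes = sourceFiles.map((bytes) => sha1Hex(bytes)).sort();
  return sha1Hex(JSON.stringify({ run_kind: runKind, source_sha1: fileHashes }));
}

const bytes = new TextEncoder().encode("id,name\n1,Ada\n");
const extractA = runFingerprint("extract", [bytes]);
const extractB = runFingerprint("extract", [bytes]);
const mapRun = runFingerprint("map", [bytes]);

console.log(extractA === extractB); // prints true: same bytes, same run kind
console.log(extractA === mapRun); // prints false: same bytes, different run kind
```

Note that run_id never enters the calculation in this sketch, which is why two runs with different labels can still collide on the fingerprint.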
Command Set
- tabular_inspect.mts
  - discovers sheets/tables, row counts, header preview, sample rows
- tabular_extract.mts
  - emits one source_row JSON object per data row
- tabular_map.mts
  - reads NDJSON from stdin or file and applies either a declarative JSON mapping config or a TypeScript transform module
- tabular_group.mts
  - groups source_row or mapped_record input and reduces each group into one or more mapped_record outputs
  - emits source_set lineage for every source row that contributed to the grouped output
- tabular_change_plan.mts
  - compares destination-table shaped mapped_record objects with an existing target snapshot
  - classifies each planned row as create, update, append, possible_merge, no_change, or needs_review
  - treats blank or omitted mapped values as no change and never writes to the database
- tabular_stage_rye.mts
  - wraps extracted or mapped rows in a Rye-friendly staging envelope
- tabular_commit_rye.mts
  - writes extracted, mapped, or staged records into Rye nodes, events, assertions, and source-file artifacts
  - fingerprints original source files with SHA1 and rejects duplicate runs of the same run kind unless --allow-duplicate-source is passed
  - can emit a transaction SQL script for SQL-only environments
- rye-import-inspector
  - validates source rows, mapped records, change plans, metadata, old-value evidence, target table declarations, approvals, and post-write verification
  - emits rye_stage_record validation reports that can be committed through tabular_commit_rye.mts
Mapping Strategy
Use the lightest mapping mechanism that fits:
- declarative JSON config for conversationally defined column maps and conversions
- TypeScript module for difficult cases
TypeScript modules remain the escape hatch for:
- one source row to one destination record
- one source row to many destination records
- chained transforms over prior mapped output
- filtering rows by returning null
Use tabular_group.mts when many source rows produce one destination record, such as invoice lines grouped into invoices or vetted interests grouped into acquisition opportunities.
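The invoice-lines case can be sketched as a plain group-and-reduce that keeps the lineage of every contributing row. The `LineItemRow` and `GroupedRecord` shapes, the `invoice_no` and `amount` fields, and the `groupInvoiceLines` function are hypothetical; the actual grouping module API is described in references/cli-contract.md.

```typescript
// Hypothetical row and output shapes; field names are assumptions.
type LineItemRow = {
  lineage: { file: string; row: number };
  fields: { invoice_no: string; amount: string };
};

type GroupedRecord = {
  destination_table: string;
  record: { invoice_no: string; total: number; line_count: number };
  source_set: LineItemRow["lineage"][];
};

function groupInvoiceLines(rows: LineItemRow[]): GroupedRecord[] {
  // Bucket rows by their grouping key, preserving first-seen order.
  const groups = new Map<string, LineItemRow[]>();
  for (const row of rows) {
    const key = row.fields.invoice_no.trim();
    const bucket = groups.get(key) ?? [];
    bucket.push(row);
    groups.set(key, bucket);
  }

  // Reduce each bucket into one parent record plus its source_set lineage.
  return Array.from(groups.entries()).map(([invoiceNo, members]) => ({
    destination_table: "invoices",
    record: {
      invoice_no: invoiceNo,
      total: members.reduce((sum, r) => sum + Number(r.fields.amount), 0),
      line_count: members.length,
    },
    source_set: members.map((r) => r.lineage),
  }));
}
```

The point of the sketch is the `source_set` array: each parent record carries one lineage entry per source row, so a grouped invoice can still be traced back to every line that produced it.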
Update And Change Review
Use tabular_change_plan.mts when a mapped import may update, append to, or merge with existing target-table records.
The change planner is table-independent. It only looks at mapped_record.destination_table, mapped record values, caller-supplied key fields, and a caller-supplied existing snapshot. It does not know about any destination database, API, or write path.
Existing snapshots may be JSON or NDJSON. Useful shapes include:
- a list of objects with destination_table and record
- an object keyed by destination table name
- an object with generic wrappers such as data, rows, records, or results
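One way to picture these shapes is a small normalizer that turns any of them into a flat list of table and record pairs. This is only a sketch under assumptions: it treats a single top-level wrapper key as a wrapper, assumes table-keyed objects hold arrays of plain records, and the `normalizeSnapshot` name is invented here. How tabular_change_plan.mts actually resolves these shapes is not specified in this document.

```typescript
type SnapshotEntry = { destination_table: string; record: Record<string, unknown> };

// Wrapper keys named in the text above.
const WRAPPER_KEYS = ["data", "rows", "records", "results"];

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function normalizeSnapshot(input: unknown): SnapshotEntry[] {
  if (Array.isArray(input)) {
    // Shape 1: a list of { destination_table, record } objects.
    return input.filter(
      (e): e is SnapshotEntry =>
        isPlainObject(e) && typeof e.destination_table === "string" && isPlainObject(e.record),
    );
  }
  if (!isPlainObject(input)) return [];

  // Shape 3 (assumption): a lone wrapper key is unwrapped and re-examined.
  const keys = Object.keys(input);
  if (keys.length === 1 && WRAPPER_KEYS.includes(keys[0])) {
    return normalizeSnapshot(input[keys[0]]);
  }

  // Shape 2 (assumption): table name -> array of plain records.
  const out: SnapshotEntry[] = [];
  for (const [table, rows] of Object.entries(input)) {
    if (!Array.isArray(rows)) continue;
    for (const record of rows) {
      if (isPlainObject(record)) out.push({ destination_table: table, record });
    }
  }
  return out;
}
```

A destination table that is itself named data or rows would be ambiguous under this heuristic, which is one reason to prefer the explicit list shape when the caller controls the snapshot format.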
Default policy:
- blank or omitted mapped values mean no change
- field clearing requires --clear-nulls and explicit review
- exact key collisions in append mode are classified as needs_review
- fuzzy or agent-assisted matches are classified as possible_merge or needs_review
- the command is read-only; final writes belong to the consuming domain skill
- before target writes, the consuming skill should record the source, mapped records, old values or target snapshot, change-plan outcome, approval, target tables, operation types, touched IDs, and verification result in Rye
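The "blank means no change" rule can be sketched as a small classifier over one mapped record and its matching existing record. This covers only three of the six classifications (create, update, no_change) and ignores --clear-nulls, append mode, and fuzzy matching; the `classify` and `isBlank` helpers are illustrative names, not part of the CLI.

```typescript
type Action = "create" | "update" | "no_change";

// Blank or omitted mapped values are treated as "no change".
function isBlank(v: unknown): boolean {
  return v === undefined || v === null || (typeof v === "string" && v.trim() === "");
}

function classify(
  mapped: Record<string, unknown>,
  existing: Record<string, unknown> | undefined,
): { action: Action; changed: string[] } {
  // No record with the same key in the snapshot: this would be a new row.
  if (existing === undefined) return { action: "create", changed: [] };

  // Only non-blank mapped values that differ from the snapshot count as changes.
  const changed = Object.keys(mapped).filter(
    (field) => !isBlank(mapped[field]) && mapped[field] !== existing[field],
  );
  return { action: changed.length > 0 ? "update" : "no_change", changed };
}
```

For example, a mapped record with an empty email and an unchanged name would classify as no_change rather than as an update that clears the email.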
Read references/cli-contract.md when you need:
- the NDJSON object contracts
- the mapping module API
- the declarative mapping config format
- example mapping modules
- guidance on staging records into Rye nodes/assertions/artifacts
- the distinct rye_tabular_intake_* event, assertion, artifact, and node types
- the JSON Schema contracts under assets/schemas/
Read references/mapping-conversation.md when the user wants to configure mappings interactively in chat before you write the config or module.
Read references/extension-patterns.md when you need to create or evaluate a domain-specific skill that consumes these CLIs, especially for many-to-one grouped imports.
Read references/testing-fixtures.md when you need Docker-runnable fixture data for one-to-one, one-to-many, or many-to-one import scenarios.
Guardrails
- Inspect before extracting when the header row or target sheet is unclear.
- When column meaning is ambiguous, ask the user before hard-coding a conversion.
- Keep extraction lossless. Preserve source lineage and raw field names before coercing into destination shapes.
- Use tabular_map.mts for deterministic transforms; avoid ad hoc one-off rewrites in chat when a reusable module is appropriate.
- Use tabular_group.mts for many-to-one reductions; keep domain-specific grouping rules in the consuming skill or mapping module.
- Use tabular_change_plan.mts before rare update, append, or merge writes so due diligence is separate from final target-specific SQL or API calls.
- Use rye-import-inspector as the generic validation gate before and after target writes; keep domain-specific policy in consuming skills.
- Use Rye staging records to track intake status before writing final domain-table records.
- Prefer tabular_commit_rye.mts when the user wants extraction and staging history stored in Rye itself.
- Prefer --emit-sql when the available database interface can execute SQL but cannot provide a connection string.
- Prefer pipelines that keep stdout machine-readable and stderr reserved for actionable errors.
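The last guardrail can be illustrated with a tiny NDJSON filter that writes only JSON lines to stdout and sends problems to stderr. The `toNdjson` and `parseNdjson` helpers are hypothetical and are not part of the shipped scripts; they only show the stream discipline a custom pipeline stage might follow.

```typescript
// Serialize records as NDJSON: one JSON object per line, nothing else.
function toNdjson(records: unknown[]): string {
  return records.map((r) => JSON.stringify(r) + "\n").join("");
}

// Parse NDJSON text, collecting actionable errors instead of mixing them
// into the record stream.
function parseNdjson(text: string): { records: unknown[]; errors: string[] } {
  const records: unknown[] = [];
  const errors: string[] = [];
  text.split("\n").forEach((line, index) => {
    if (line.trim() === "") return; // tolerate blank lines and trailing newline
    try {
      records.push(JSON.parse(line));
    } catch {
      errors.push(`line ${index + 1}: invalid JSON`);
    }
  });
  return { records, errors };
}

const demo = parseNdjson('{"a":1}\nnot json\n{"a":2}\n');
console.log(toNdjson(demo.records).trimEnd()); // stdout: machine-readable only
for (const message of demo.errors) console.error(message); // stderr: errors only
```

Keeping diagnostics off stdout means the next stage in a pipe, such as tabular_map.mts or tabular_stage_rye.mts, never has to guess whether a line is data or a log message.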