rocky-test-fixtures

star 265

Dagster integration test fixture lifecycle. Use when regenerating live-binary fixtures via `just regen-fixtures` / `scripts/regen_fixtures.sh`, debugging why a generated fixture differs across runs, or adding scenario data for new CLI commands. Covers determinism (zeroed timings, sentinel timestamps), the relationship between `tests/scenarios.py` and `tests/fixtures_generated/`, and the two POCs that back the corpus.

By rocky-data. Updated 6/12/2026


Dagster test fixture regeneration

The Dagster integration has two parallel sources of test data with different roles. Conflating them causes confusion, so read this skill before touching either.

The two test-data surfaces

| Path | Role | Edit policy |
| --- | --- | --- |
| integrations/dagster/tests/scenarios.py | Source of truth for parsing/translator/component tests. Hand-crafted Python dicts exposed as *_json pytest fixtures via conftest.py (which json.dumps each scenario). | Edit by hand. Add new scenarios when introducing a new CLI command or covering a new shape. |
| integrations/dagster/tests/fixtures_generated/ | Live-binary corpus. Captured from the running rocky binary against the playground POCs. Used for drift detection: test_generated_fixtures.py re-validates that every captured JSON still parses against the current Pydantic models. | Regenerated by just regen-fixtures. Don't hand-edit: the next regen overwrites your changes. |

Tests in test_types.py, test_translator.py, test_component.py, etc. consume the *_json fixtures backed by scenarios.py. The generated corpus is exercised solely by test_generated_fixtures.py to catch shape drift between the Rust binary and the Pydantic models.

A legacy tests/fixtures/*.json directory previously held the parsing-test corpus; it was removed once scenarios.py became the source of truth. See conftest.py for the migration note. The --in-place flag in scripts/regen_fixtures.sh still targets that directory and is effectively a no-op today.

When to use this skill

  • Changing any *Output struct that affects a fixture's shape (pairs with the rocky-codegen skill — codegen updates the Pydantic models; the captured fixtures should still parse against them)
  • Adding a brand new CLI command that needs a parsing test in test_types.py (you'll add a scenarios.py entry, not a JSON file)
  • Investigating why test_generated_fixtures.py fails after a binary or schema change
  • Debugging non-determinism in a captured fixture (timestamps, durations, row counts)

Regeneration workflow

    just regen-fixtures
    # or
    ./scripts/regen_fixtures.sh

The script writes to integrations/dagster/tests/fixtures_generated/. The captured JSON should git diff cleanly except for fields that legitimately changed; any diff you cannot trace to an intentional change warrants inspection.

After regenerating, always run from integrations/dagster/:

    uv run pytest tests/test_generated_fixtures.py -v

This catches any captured-output shape that no longer parses against the current Pydantic models.
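
The drift check can be pictured roughly as follows. This is a minimal sketch, not the contents of test_generated_fixtures.py: the function name find_shape_drift and the validators mapping are hypothetical, and plain callables stand in for the Pydantic models so the example needs only the standard library.

```python
import json
from pathlib import Path
from typing import Callable


def find_shape_drift(
    corpus_dir: Path,
    validators: dict[str, Callable[[object], object]],
) -> list[str]:
    """Return one message per captured fixture that fails to parse or validate.

    `validators` maps a fixture file name (e.g. "run.json") to a callable that
    raises when the parsed JSON no longer matches the expected shape.
    """
    failures = []
    for path in sorted(corpus_dir.rglob("*.json")):
        validate = validators.get(path.name)
        if validate is None:
            continue  # no model registered for this fixture in the sketch
        try:
            validate(json.loads(path.read_text()))
        except Exception as exc:  # a ValidationError in the real test
            failures.append(f"{path.relative_to(corpus_dir)}: {exc!r}")
    return failures
```

In the real test the validator for run.json would presumably be a model call such as RunOutput.model_validate (Pydantic v2 naming, which is an assumption about the version in use). An empty return value means the corpus still parses.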

Prerequisites

The script hard-fails if either is missing:

  1. Release binary at engine/target/release/rocky — build with:

    cd engine && cargo build --release --bin rocky
    

    or run just codegen (which also builds in release mode and shares the artifact).

  2. duckdb CLI on $PATH — used to seed the playground POC's DuckDB file:

    brew install duckdb        # macOS
    

Two POCs back the corpus

regen_fixtures.sh runs the binary against two playground POCs, captured into different subdirectories of fixtures_generated/:

| POC | Purpose | Fixtures produced |
| --- | --- | --- |
| examples/playground/pocs/00-foundations/01-replication-basics/ | Baseline full_refresh replication pipeline. Produces most fixtures. (00-playground-default is now a transformation pipeline; this is its verbatim replication clone, kept as the fixture source so captures stay byte-identical.) | discover.json, plan.json, run.json, state.json, compile.json, test.json, ci.json, lineage.json, doctor.json, history.json, metrics.json, optimize.json, etc. |
| examples/playground/pocs/02-performance/03-partition-checksum/ | time_interval (partition-keyed) pipeline. Exercises partition-specific shapes. | partition/compile.json, partition/run_single.json, partition/run_backfill.json, partition/run_late.json |

If you need a fixture shape that neither POC produces naturally (e.g. drift.json from a non-zero schema diff), add it as a scenarios.py entry rather than coercing the POC into producing it.

Determinism: why captured JSON is stable across runs

Without post-processing, every regen would produce a slightly different diff because Rocky stamps:

  • Duration fields with real wall-clock elapsed time (_ms, _seconds, _secs suffixes)
  • Timestamp fields with now() (updated_at, started_at, finished_at, timestamp, captured_at)

The capture helper inside regen_fixtures.sh runs a Python normalizer (scripts/_normalize_fixture.py) over each captured JSON file that:

  • Zeroes any numeric field whose key ends in _ms / _seconds / _secs
  • Replaces any string field whose key is in the wall-clock set with "2000-01-01T00:00:00Z"

Important invariants the normalizer preserves:

  • last_value (watermarks) — NOT touched. It's a logical value derived from seeded data and is genuinely deterministic across runs.
  • Anything inside a data payload — NOT touched. Only top-level wall-clock fields get the sentinel.

If you add a new timing or timestamp field to an *Output struct, you may need to extend TIMING_SUFFIXES or WALL_CLOCK_FIELDS in the normalizer. Otherwise your new fixture will drift on every regen.
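
The two normalization rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a copy of scripts/_normalize_fixture.py: the constant values are taken from the lists above, the function name normalize is hypothetical, and the sketch simply recurses through every dict and list because the document does not show how the real script scopes its traversal.

```python
import json

# Values copied from the text above; the real constants live in
# scripts/_normalize_fixture.py and may differ.
TIMING_SUFFIXES = ("_ms", "_seconds", "_secs")
WALL_CLOCK_FIELDS = {"updated_at", "started_at", "finished_at", "timestamp", "captured_at"}
SENTINEL = "2000-01-01T00:00:00Z"


def normalize(value):
    """Return a copy of `value` with timings zeroed and wall-clock strings replaced."""
    if isinstance(value, list):
        return [normalize(item) for item in value]
    if not isinstance(value, dict):
        return value
    result = {}
    for key, item in value.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(item, (int, float)) and not isinstance(item, bool)
        if is_number and key.endswith(TIMING_SUFFIXES):
            result[key] = type(item)(0)  # keeps int as 0 and float as 0.0
        elif isinstance(item, str) and key in WALL_CLOCK_FIELDS:
            result[key] = SENTINEL
        else:
            result[key] = normalize(item)
    return result


# Field names in this sample are illustrative, not a real Rocky output shape.
sample = {
    "duration_ms": 1234,
    "started_at": "2026-06-12T09:30:00Z",
    "last_value": "2024-01-05",
    "tables": [{"elapsed_secs": 0.42, "rows": 10}],
}
print(json.dumps(normalize(sample), indent=2))
```

last_value passes through unchanged because its key neither ends in a timing suffix nor appears in the wall-clock set, which matches the invariant described above. A new field such as a hypothetical queued_at would drift until it is added to the set.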

Exit-code tolerance

The capture helper tolerates non-zero exit codes as long as stdout parses as JSON. This is intentional:

  • rocky doctor exits 1 when any check is warning, 2 when critical — and both still emit valid JSON.
  • rocky run can exit non-zero on partial success and still emit a valid RunOutput.

The dagster integration's allow_partial=True path mirrors this tolerance in production — see integrations/dagster/src/dagster_rocky/resource.py.
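
The tolerance rule amounts to "judge the capture by stdout, not by the exit code". The sketch below shows that rule in Python for illustration; the real capture helper lives in the shell script, and capture_json is a hypothetical name. A small Python one-liner stands in for the rocky binary so the example runs anywhere.

```python
import json
import subprocess
import sys


def capture_json(argv):
    """Run a command and return (exit_code, parsed_stdout).

    A non-zero exit code is tolerated; only unparseable stdout is an error.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    try:
        return proc.returncode, json.loads(proc.stdout)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"{argv[0]} exited {proc.returncode} without JSON on stdout; "
            f"stderr was: {proc.stderr.strip()!r}"
        ) from exc


# Stand-in for a command like `rocky doctor`: emits JSON, then exits 1.
STAND_IN = 'import json, sys; print(json.dumps({"status": "warning"})); sys.exit(1)'
code, doc = capture_json([sys.executable, "-c", STAND_IN])
print(code, doc)
```

The error branch corresponds to the "file is empty" symptom in the debugging table: a command that exits non-zero and writes only to stderr leaves nothing parseable on stdout.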

Adding parsing-test coverage for a new CLI command

For a new CLI command whose output the dagster tests need to parse:

  1. Rust schema exists (from the rocky-codegen skill).
  2. Pydantic binding exists (from just codegen-sdk).
  3. Add a scenarios.py entry with a representative output dict, then expose it via a *_json pytest fixture in conftest.py.
  4. Add a parsing test in test_types.py that loads the fixture and asserts on key fields.
  5. Optionally, add a capture call in scripts/regen_fixtures.sh so the playground POC contributes a live-binary sample under fixtures_generated/. Run just regen-fixtures afterwards.

If the playground POC can't produce the command naturally (e.g. an adapter-specific command that needs Databricks), skip step 5 — the scenarios.py entry is the test source of truth.
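
Steps 3 and 4 can be sketched as follows. Everything named here is hypothetical: the "example" command, its fields, and the ExampleOutput class, which is a dataclass standing in for the generated Pydantic model. The real conftest.py exposes the scenario through a pytest fixture; a plain function is used here to keep the sketch free of third-party imports.

```python
import json
from dataclasses import dataclass

# scenarios.py (sketch): a hand-written dict shaped like the command's JSON output.
EXAMPLE_OUTPUT = {
    "command": "example",
    "items": [{"name": "orders", "ok": True}],
}


# conftest.py (sketch): the real file would decorate this with @pytest.fixture
# so that tests receive the JSON string as an `example_json` argument.
def example_json() -> str:
    return json.dumps(EXAMPLE_OUTPUT)


# Stand-in for the generated Pydantic model.
@dataclass
class ExampleOutput:
    command: str
    items: list


# test_types.py (sketch): load the fixture, parse it, assert on key fields.
def test_example_output_parses():
    parsed = ExampleOutput(**json.loads(example_json()))
    assert parsed.command == "example"
    assert parsed.items[0]["name"] == "orders"


test_example_output_parses()
```

With pytest and Pydantic in place, the test would take example_json as a parameter and the parse step would presumably be a model call such as ExampleOutput.model_validate_json(example_json), assuming Pydantic v2.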

Debugging fixture drift

Symptoms and causes:

| Symptom | Likely cause |
| --- | --- |
| test_generated_fixtures.py fails with ValidationError | Pydantic model changed but captured fixture is stale: run just regen-fixtures |
| fixtures_generated/<name>.json differs by a single _ms field | Normalizer missed a new duration field: extend TIMING_SUFFIXES |
| Fixture differs wildly between runs | Playground POC has non-deterministic seed data (rare) or a new timestamp field: extend WALL_CLOCK_FIELDS |
| capture prints ==> <name> but the file is empty | Command exited non-zero AND wrote to stderr, not stdout: check the command manually |
| rocky binary not found at engine/target/release/rocky | Missing release build: run cd engine && cargo build --release --bin rocky or just codegen |
| Partition fixtures missing but main ones present | 02-performance/03-partition-checksum POC missing or broken: check ls examples/playground/pocs/02-performance/03-partition-checksum/ |
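
Two of these symptoms (an empty capture, missing partition fixtures) can be spotted quickly by auditing the corpus directory. The helper below is a hypothetical debugging aid, not part of the repository; the expected names are a subset copied from the POC table in this document, and the authoritative list is whatever regen_fixtures.sh captures.

```python
from pathlib import Path

# Subset of fixture names from the POC table; adjust to match the script.
EXPECTED = [
    "discover.json",
    "plan.json",
    "run.json",
    "partition/compile.json",
    "partition/run_single.json",
]


def audit_corpus(corpus_dir: Path, expected=EXPECTED) -> dict[str, list[str]]:
    """Report expected fixtures that are missing or zero bytes long."""
    report = {"missing": [], "empty": []}
    for rel in expected:
        path = corpus_dir / rel
        if not path.exists():
            report["missing"].append(rel)
        elif path.stat().st_size == 0:
            report["empty"].append(rel)
    return report
```

Calling audit_corpus(Path("integrations/dagster/tests/fixtures_generated")) from the repository root would list anything under "missing" that starts with partition/ when the second POC did not run.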

Reference files

  • scripts/regen_fixtures.sh — the capture pipeline + POC list
  • scripts/_normalize_fixture.py — wall-clock + timing normalizer
  • integrations/dagster/tests/scenarios.py — hand-crafted Python dict scenarios (test source of truth)
  • integrations/dagster/tests/conftest.py: *_json fixture exposure
  • integrations/dagster/tests/test_generated_fixtures.py — the test that guards the captured corpus
  • integrations/dagster/CLAUDE.md — "Adding support for a new Rocky CLI command" checklist (includes the scenarios step)
  • justfile — the regen-fixtures recipe (wraps the script)

Related skills

  • rocky-codegen — the Rust → Pydantic/TS cascade that fixtures are validated against
  • rocky-new-cli-command — the end-to-end checklist that includes scenario/fixture creation
  • rocky-poc — authoring the playground POCs that the captured fixture corpus depends on
Install via CLI
npx skills add https://github.com/rocky-data/rocky --skill rocky-test-fixtures
Repository Details
Forks: 12
Branch: main
Path: SKILL.md