name: project-version-curator
description: Post-hoc audit, cleanup planning, and release/staging preparation for already messy project directories where many dated, v1/v2/v3, final/current, intermediate, figure, table, and script files are mixed together. Use when Codex is asked to audit existing version clutter, generate an inventory, detect duplicate/conflicting historical files, map files to accepted/candidate/legacy status using project indexes, or create a dry-run cleanup/release plan after the fact. For active development-time guardrails, new runs, promotion, baselines, or branching, use project-flow-guard instead.
Project Version Curator
Core principle
Never clean by deleting first. Treat the existing project as a working archive, then create a separate release/staging area with a manifest, provenance, and checksums.
This is a post-hoc curator. It must respect the active project maintenance framework rather than replace it:
- PROJECT_STATE.md is the thin state entry point.
- RESULTS_INDEX.md is more authoritative than filenames for current/candidate/legacy results.
- DECISIONS.md explains why a result/route is accepted, legacy, or superseded.
- DATA_ASSETS.md explains authoritative data paths and copy restrictions.
- RUNS_INDEX.tsv, .project_flow/, and RUN_MANIFEST.json describe generated-run provenance.
Canonical curation outputs should be Markdown, TSV, and JSON. SQLite/HTML may be generated for query/display only, not as canonical cleanup sources, unless the user explicitly changes policy.
Default stance:
- Inventory first.
- Cross-check project indexes before trusting filename hints such as final or current.
- Classify files as accepted/candidate/legacy/superseded/intermediate/archive/temporary.
- Detect conflicting versions.
- Create a release/staging folder by copy or symlink.
- Only delete/quarantine after explicit user approval.
Recommended workflow
1. Read project state
If PROJECT_STATE.md exists, read it before making any changes. If the project uses a state policy, update PROJECT_STATE.md after meaningful changes.
Also read these if present and relevant:
- RESULTS_INDEX.md
- DECISIONS.md
- DATA_ASSETS.md
- RUNS_INDEX.tsv
- .project_flow/FILE_REGISTRY.tsv
- .project_flow/PROMOTIONS.tsv
- .project_flow/RUNS.tsv
Treat RESULTS_INDEX.md / current/ / registry entries as stronger evidence than file names like final, latest, or current_v3.
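The precedence rule above can be sketched as a small lookup. This is an illustration only: `resolve_status` and the `registry` mapping are hypothetical names, not part of the bundled script, and mapping a bare filename hint to candidate is an assumption made for the example.

```python
import re

# Filename hints suggest a status but never prove it.
HINT_PATTERN = re.compile(r"final|latest|current", re.IGNORECASE)


def resolve_status(path, registry):
    """Return (status, evidence) for a relative file path.

    `registry` is an assumed dict of path -> status, built beforehand from
    RESULTS_INDEX.md or a registry file. Index entries always win.
    """
    if path in registry:
        return registry[path], "index"
    if HINT_PATTERN.search(path):
        # Assumption: a hint alone only queues the file for review.
        return "candidate", "filename-hint"
    return None, "none"


registry = {"figs/tree_final.pdf": "superseded"}
print(resolve_status("figs/tree_final.pdf", registry))
print(resolve_status("figs/ssn_final_v2.svg", registry))
```

Returning the evidence source next to the status keeps it visible in the plan whether a classification came from an index or from a filename guess.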
2. Generate an inventory
Use the bundled script:
python ~/.codex/skills/project-version-curator/scripts/project_version_curator.py inventory \
--root . \
--out project_inventory.tsv
Hashing every file is slower, so add SHA-256 hashes only for final packaging or duplicate analysis:
python ~/.codex/skills/project-version-curator/scripts/project_version_curator.py inventory \
--root . \
--out project_inventory_with_sha256.tsv \
--sha256
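For intuition about why hashing helps duplicate analysis, here is a minimal sketch that groups files by content hash. It is not the bundled script's implementation; `sha256_of` and `duplicate_groups` are hypothetical helpers written for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_groups(root):
    """Group files under `root` by content hash; keep groups with 2+ files."""
    by_hash = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash.setdefault(sha256_of(path), []).append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Files in the same group are byte-identical whatever their names say, so a group containing both a final and a v2 file is a duplicate, not a conflict.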
3. Detect version conflicts
python ~/.codex/skills/project-version-curator/scripts/project_version_curator.py conflicts \
--manifest project_inventory.tsv \
--out version_conflicts.tsv
Review groups with many files sharing a similar semantic key, especially files containing final, current, v\d+, or dates.
Important: final in a filename is only a filename hint. Prefer accepted, release, legacy, or superseded states from indexes/registries when available.
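One way to picture a "semantic key" is the filename with version-like tokens removed. The sketch below is an assumption about how such grouping could work, not a description of the conflicts subcommand; the token list and the helper names are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed version noise: dates, then words such as final/current/v3.
DATE_TOKEN = re.compile(r"\d{4}-\d{2}-\d{2}|\d{8}")
VERSION_WORD = re.compile(r"final|current|latest|v\d+")


def semantic_key(path):
    """Strip version-like tokens from the file stem; keep the extension."""
    p = PurePosixPath(path)
    stem = DATE_TOKEN.sub(" ", p.stem.lower())
    tokens = re.split(r"[_\-.\s]+", stem)
    kept = [t for t in tokens if t and not VERSION_WORD.fullmatch(t)]
    return "_".join(kept) + p.suffix.lower()


def conflict_groups(paths):
    """Return {semantic key: sorted paths} for keys shared by 2+ files."""
    groups = defaultdict(list)
    for path in paths:
        groups[semantic_key(path)].append(path)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}
```

This sketch ignores directories on purpose, so tree_final_v2.pdf in one folder and tree_current.pdf in another land in the same group for review.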
4. Create a publication/release skeleton
Dry run first:
python ~/.codex/skills/project-version-curator/scripts/project_version_curator.py init-release \
--root . \
--name publication_release_YYYYMMDD
Apply:
python ~/.codex/skills/project-version-curator/scripts/project_version_curator.py init-release \
--root . \
--name publication_release_YYYYMMDD \
--apply
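The dry-run-then-apply pattern can be sketched as a function that always reports its plan and touches the disk only when asked. This is a hypothetical illustration, not the code behind init-release; the subdirectory names are taken from the release layout in this document.

```python
from pathlib import Path

# Subdirectories from the publication release layout in this document.
RELEASE_DIRS = [
    "00_metadata", "01_input_data", "02_processed_data", "03_final_tables",
    "04_figures", "05_scripts", "06_environment", "99_archive_index",
]


def plan_release_skeleton(root, name, apply=False):
    """Return the directories a release needs; create them only if apply=True."""
    base = Path(root) / name
    planned = [base / sub for sub in RELEASE_DIRS]
    if apply:
        for directory in planned:
            directory.mkdir(parents=True, exist_ok=True)
    return planned
```

Because the default is the dry run, forgetting the flag gives a harmless listing and never creates an unreviewed directory tree.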
For non-publication cleanup planning, produce a dry-run curation plan instead of moving files:
CURATION_PLAN.md # human-readable plan: keep/promote/archive/quarantine candidates
project_inventory.tsv # complete inventory
version_conflicts.tsv # conflict groups
curation_summary.json # optional machine-readable summary
5. For future active development, prefer project-flow-guard
This skill is for post-hoc audit and cleanup planning. For active development, use:
python ~/.codex/skills/project-flow-guard/scripts/project_flow_guard.py start-run \
--root . \
--task "task name"
Directory policy for scientific publication projects
Use this top-level separation:
project/
├── raw/ or reference_library/ # immutable input/reference data
├── runs/ # timestamped analysis attempts
├── current/ # small stable pointers/copies to accepted outputs
├── RESULTS_INDEX.md # human entry point for accepted/candidate/legacy outputs
├── DECISIONS.md # decision log
├── DATA_ASSETS.md # data source/path/copying rules
├── RUNS_INDEX.tsv # concise run index, if used
├── publication_release_YYYYMMDD/ # clean paper/data package
├── scripts/ # maintained scripts, not one-off throwaways
├── docs/ # methods, decisions, figure notes
├── archive/ # old but intentionally retained material
└── PROJECT_STATE.md
For a publication release:
publication_release_YYYYMMDD/
├── README.md
├── DATA_DICTIONARY.md
├── MANIFEST.tsv
├── CHECKSUMS.sha256
├── 00_metadata/
├── 01_input_data/
├── 02_processed_data/
├── 03_final_tables/
├── 04_figures/
├── 05_scripts/
├── 06_environment/
└── 99_archive_index/
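A CHECKSUMS.sha256 file for this layout could be produced as sketched below. The `<hash>  <relative path>` line format follows the common sha256sum convention; `write_checksums` is a hypothetical helper, and whether the bundled script writes the file this way is an assumption.

```python
import hashlib
from pathlib import Path


def write_checksums(release_dir, out_name="CHECKSUMS.sha256"):
    """Write one '<sha256>  <relative path>' line per file in the release."""
    release = Path(release_dir)
    out_path = release / out_name
    lines = []
    for path in sorted(release.rglob("*")):
        # Skip directories and the checksum file itself.
        if path.is_file() and path != out_path:
            # read_bytes() loads the whole file; stream instead for huge files.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.relative_to(release).as_posix()}")
    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Relative POSIX-style paths keep the file valid after the release folder is moved or copied to another machine.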
Naming rules
Prefer deterministic names:
<analysis>__<dataset>__<parameter>__<status>__<YYYYMMDD>.<ext>
Examples:
osc_ssn__myxo11refs__score360__final__20260430.svg
osc_phylogeny__all201__fasttree_lg_gamma__final__20260430.pdf
osc_motif_alignment__all201__hslss_windows__final__20260430.tsv
Rules:
- Use one date token at the end.
- Avoid stacking final_final, new_final, current_v3_final.
- current should be a symlink, tiny pointer, or explicitly promoted copy, not a new independent version.
- Keep old attempts in runs/; do not mix them with accepted outputs.
- Every accepted/release figure/table must have a source script and source data path in a manifest or index.
- Do not introduce new final_* names during cleanup; map old final files to accepted, release, legacy, or superseded.
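The naming pattern and the stacking rule are easy to check mechanically. The following is a minimal sketch under the assumption that fields are separated by a double underscore and may contain single underscores; `parse_name` and `has_stacked_hints` are hypothetical helpers, not part of the bundled script.

```python
import re
from datetime import datetime

FIELDS = ("analysis", "dataset", "parameter", "status", "date")
STACKABLE_HINTS = {"final", "new", "current"}


def parse_name(filename):
    """Split a deterministic name into its fields, or return None if invalid."""
    stem, dot, ext = filename.rpartition(".")
    parts = stem.split("__")
    if not dot or len(parts) != len(FIELDS) or not all(parts):
        return None
    if not re.fullmatch(r"\d{8}", parts[-1]):
        return None
    try:
        datetime.strptime(parts[-1], "%Y%m%d")  # reject impossible dates
    except ValueError:
        return None
    return {**dict(zip(FIELDS, parts)), "ext": ext}


def has_stacked_hints(filename):
    """True when two or more hint words (final/new/current) appear in a name."""
    tokens = re.split(r"[_\-.]+", filename.lower())
    return sum(token in STACKABLE_HINTS for token in tokens) >= 2
```

A name that fails `parse_name` is not necessarily wrong. It is simply a legacy name that needs a mapping entry rather than a rename on the spot.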
Promotion states
Use these statuses consistently:
- draft: exploratory, not trusted.
- candidate: plausible accepted output, needs review.
- accepted: selected for manuscript.
- release: copied into publication package.
- archive: retained for provenance but not part of the paper.
- legacy: old route retained for comparison/provenance, not the active result.
- superseded: replaced by a newer accepted/candidate output.
Avoid using final as a new status. If old filenames contain final, treat it as a clue to audit, not proof of acceptance.
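To keep status values consistent across indexes, a closed set can be enforced when reading a status column. This is a small sketch, assuming statuses are stored as lowercase strings; the `Status` enum and `parse_status` are hypothetical and not part of the bundled script.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    CANDIDATE = "candidate"
    ACCEPTED = "accepted"
    RELEASE = "release"
    ARCHIVE = "archive"
    LEGACY = "legacy"
    SUPERSEDED = "superseded"


def parse_status(value):
    """Return the matching Status; raise ValueError otherwise, e.g. for 'final'."""
    try:
        return Status(value.strip().lower())
    except ValueError:
        allowed = ", ".join(s.value for s in Status)
        raise ValueError(f"unknown status {value!r}; expected one of: {allowed}") from None
```

Since final is absent from the enum, any index row that still uses it fails loudly and gets audited. It cannot be silently treated as accepted.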
Handoff updates
After a meaningful curation or release-planning turn:
- Update PROJECT_STATE.md with only a concise summary and next step.
- Update RESULTS_INDEX.md if accepted/candidate/legacy result entry points changed.
- Update DECISIONS.md if a cleanup/release/legacy decision was made.
- Never delete/quarantine files without explicit user approval after a dry-run plan.
Bundled resources
- scripts/project_version_curator.py: inventory, conflict detection, release skeleton creation, and new run directory creation.
- references/versioning_policy.md: more detailed versioning and release rules.