name: modeler-decision-logger
description: Collect, stamp, and freeze the human modeler's decisions into one canonical per-subquestion decision log. The decision version of frozen_numbers.json; it does NOT originate decisions, it makes the human's judgments traceable, append-only, and the single source every downstream narrative is transcribed from.
license: MIT
Purpose
Maintain one canonical, append-only log of the human modeler's decisions, so that every modeling judgment a contest grades — which method and why, whether a result is good, what a number means, how confident the team is — has a single traceable home, and every downstream narrative (final method explanation, solution package, paper method-selection prose) is transcribed from it, never re-invented.
This is the decision-side parallel of the Frozen Numbers Convention. Just as frozen_numbers.json pins numbers so a bug fix can't silently shift the paper, the decision log pins judgments so the AI can't silently author what a judge grades, and so four different skills can't each re-compose the method-selection story and drift apart.
This skill does NOT make decisions. It never fills modeler_decision or modeler_rationale. It collects the human-authored decision artifacts that judgment-bearing skills emit (methods/Qx/decisions/<skill>_modeler_decision.md), validates them mechanically, stamps them, and folds them into the canonical log. It pre-fills only the mechanical fields (decision_id, options_considered, evidence, ai_suggestion).
When to use
Use this skill:
- After a judgment-bearing skill (`method-selector`; later `result-report-generator`, `robustness-checker`, `final-method-explainer`, `solution-package-builder`) has captured a human decision in its `*_modeler_decision.md` artifact, to fold it into the canonical log.
- As a precondition check at Gates G2.5 / G4.5 / G5: "does the log contain a `DECIDED` record for the required decision point of this Qx?"
- When `final-method-explainer` / `solution-package-builder` / `paper-section-writer` need the method-selection narrative: they read it FROM this log with `decision_id` provenance; they do not re-author it.
- After the contest, to render the decision log as post-mortem review material.
Do NOT use this skill:
- To author, suggest, or auto-fill any human decision or rationale.
- To decide which method is right, what a number means, or whether a result is good.
- To overwrite a past decision (the log is append-only; a changed mind appends a new record with `supersedes`).
Preconditions
- At least one `methods/Qx/decisions/<skill>_modeler_decision.md` artifact exists with `status: DECIDED`.
- The Human Decision Artifact Convention in CLAUDE.md is the schema of record.
If a decision artifact is still PENDING (the human hasn't decided yet), do not log it — report it as "decision pending" and let the gate stay blocked.
Inputs
- `methods/Qx/decisions/*_modeler_decision.md`: the per-skill human decision artifacts.
- The candidate pool / experiment reports the decision references (for `evidence` provenance).
- The existing `methods/Qx/qx_decision_log.md`, if any (to append to).
- `planning/session_config.json` (for `mode`, recorded per record).
Workflow
1. Locate decided artifacts.
   - For the target Qx, find every `methods/Qx/decisions/<skill>_modeler_decision.md` with `status: DECIDED`.
   - Skip `PENDING` ones; report them as still-blocking.
2. Validate each decided artifact mechanically (do NOT re-judge the content).
   - `decided_by: human`; no sentinel survives in mandatory fields; the text is over the character floor.
   - Rationale is not a near-verbatim copy of `ai_suggestion` (normalized-whitespace / tiny edit distance only; fuzzy similarity is a ledger WARN, not a reject; formulas and `symbol_table.md` symbols are exempt).
   - Rationale references at least one concrete token from `evidence_refs`.
   - If validation fails, do NOT log it; return it to the gate as a FAIL with the exact reason.
3. Append to the canonical log.
   - Write one decision record per validated artifact into `methods/Qx/qx_decision_log.md`.
   - Stamp `decision_id` (monotonic `Qx-D01`, `Qx-D02`, …), `decision_point` (from the fixed taxonomy below), `timestamp`, and `captured_in_mode`.
   - Append-only. If this decision revises an earlier one, set `supersedes: <prior decision_id>` and append a new record; never edit the prior record.
4. Update the global index.
   - Maintain `planning/decision_log_index.md`: one row per decision across all Qx (id, Qx, decision_point, choice, confidence, supersedes).
5. Provide the narrative source for downstream skills.
   - When asked for the method-selection narrative, return the relevant `modeler_rationale` fields verbatim/near-verbatim, each tagged with its `decision_id`, so the consuming skill can cite provenance (`<!-- from Q1-D03 -->`).
   - The AI consuming this is forbidden from writing a method-justification sentence that has no backing decision record.
6. Staleness check (mirrors frozen_numbers).
   - If a `code/Qx/*` change shifted a result that a `result_verdict` record was based on, mark that record `STALE`: the modeler must re-confirm the verdict against the new result. Do not silently keep a verdict the human signed against now-changed numbers.
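The mechanical checks in the validation step can be sketched as below. This is a minimal illustration, not the schema of record: the `Artifact` dataclass, the `SENTINEL` marker, and the `CHAR_FLOOR` value are hypothetical stand-ins for whatever CLAUDE.md actually defines, and `evidence_refs` is assumed to be a list of literal tokens. Only the normalized-whitespace copy test is shown; the tiny-edit-distance test, the fuzzy-similarity WARN, and the formula/symbol exemption are left out.

```python
import re
from dataclasses import dataclass, field

SENTINEL = "<TODO>"  # hypothetical placeholder marker; the real one lives in CLAUDE.md
CHAR_FLOOR = 40      # hypothetical minimum rationale length in characters


@dataclass
class Artifact:
    status: str
    decided_by: str
    modeler_decision: str
    modeler_rationale: str
    ai_suggestion: str
    evidence_refs: list = field(default_factory=list)


def normalize(text: str) -> str:
    """Collapse runs of whitespace so spacing-only edits do not hide a copy."""
    return re.sub(r"\s+", " ", text).strip()


def validate(a: Artifact) -> list:
    """Return the reasons the artifact fails mechanical validation (empty list = pass)."""
    if a.status != "DECIDED":
        return ["status is not DECIDED (decision pending)"]
    reasons = []
    if a.decided_by != "human":
        reasons.append("decided_by is not human")
    for name in ("modeler_decision", "modeler_rationale"):
        value = getattr(a, name)
        if not value.strip() or SENTINEL in value:
            reasons.append(f"{name} is empty or still holds the sentinel")
    if len(a.modeler_rationale.strip()) < CHAR_FLOOR:
        reasons.append("modeler_rationale is under the character floor")
    if normalize(a.modeler_rationale) == normalize(a.ai_suggestion):
        reasons.append("modeler_rationale is a verbatim copy of ai_suggestion")
    if not any(ref in a.modeler_rationale for ref in a.evidence_refs):
        reasons.append("modeler_rationale cites no token from evidence_refs")
    return reasons
```

The function only reports reasons; it never rewrites a field, which keeps the logger on the mechanical side of the line between checking a decision and making one.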
Decision-point taxonomy (fixed)
A record's decision_point is exactly one of:
framing · method_choice · assumption_necessity · baseline · hyperparameter · result_verdict · confidence · claim_scope · figure_role
The four a judge actually grades are mandatory before "Ready for Writer": method_choice, result_verdict, confidence, assumption_necessity. The rest are optional. completeness-auditor checks the mandatory four exist per Qx.
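A small sketch of the gate input this taxonomy implies: given the decision points already logged for one Qx, report which of the mandatory four are still missing. The function name and the choice to raise on an unknown decision point are assumptions made for illustration.

```python
DECISION_POINTS = {
    "framing", "method_choice", "assumption_necessity", "baseline",
    "hyperparameter", "result_verdict", "confidence", "claim_scope", "figure_role",
}
MANDATORY = {"method_choice", "result_verdict", "confidence", "assumption_necessity"}


def missing_mandatory(logged_points):
    """Return the mandatory decision points that have no record yet for this Qx."""
    unknown = set(logged_points) - DECISION_POINTS
    if unknown:
        raise ValueError(f"not in the fixed taxonomy: {sorted(unknown)}")
    return MANDATORY - set(logged_points)
```

An empty result means the Qx has all four graded decision points on record; anything else is the list to report as missing.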
Outputs
- `methods/Qx/qx_decision_log.md`: the canonical append-only log for Qx.
- `planning/decision_log_index.md`: global roll-up across all Qx.
Output format
methods/Qx/qx_decision_log.md:
# Qx Decision Log
> Canonical, append-only record of the modeler's decisions for Qx.
> Downstream narratives transcribe from here with `decision_id` provenance. Never edited in place.
---
### Q2-D01 · method_choice · 2026-06-06T14:20+08:00 · mode: learning
- **options_considered**: M1 moving-average (baseline) / M2 ARIMA / M3 grey GM(1,1) <!-- AI-filled, neutral -->
- **evidence**: PoC feasibility — M2 RMSE 2.4 on first 50 months (methods/Q2/poc/m2_poc_result.txt) <!-- AI-filled -->
- **ai_suggestion**: M2 ARIMA — best PoC fit <!-- AI-filled, not a verdict -->
- **modeler_decision**: M2 (CHOSEN) <!-- HUMAN -->
- **modeler_rationale**: <human's ≥-floor words, cites a number from evidence> <!-- HUMAN -->
- **confidence**: medium
- **supersedes**: —
### Q2-D02 · result_verdict · 2026-06-07T09:10+08:00 · mode: speed
...
planning/decision_log_index.md:
# Decision Log Index (all subquestions)
| id | Qx | decision_point | choice | confidence | supersedes |
|----|----|----------------|--------|------------|------------|
| Q2-D01 | Q2 | method_choice | M2 | medium | — |
| Q2-D02 | Q2 | result_verdict | M2 CHOSEN | high | — |
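As a sketch of how the monotonic `decision_id` could be stamped, the snippet below parses record headers in the layout of the example log above and derives the next id for a subquestion. The regular expression and both function names are assumptions; it presumes every header follows the `### Qn-Dnn · decision_point · timestamp · mode: name` shape exactly.

```python
import re

# Matches a record header such as:
# ### Q2-D01 · method_choice · 2026-06-06T14:20+08:00 · mode: learning
HEADER = re.compile(
    r"^### (?P<id>(?P<qx>Q\d+)-D(?P<n>\d+)) · (?P<point>\w+) · (?P<ts>\S+) · mode: (?P<mode>\w+)$"
)


def parse_headers(log_text):
    """Extract one dict per record header line, in file order."""
    return [m.groupdict() for line in log_text.splitlines() if (m := HEADER.match(line))]


def next_decision_id(log_text, qx):
    """Next monotonic id for this subquestion; numbering starts at D01."""
    used = [int(h["n"]) for h in parse_headers(log_text) if h["qx"] == qx]
    return f"{qx}-D{max(used, default=0) + 1:02d}"
```

Deriving the id from the highest number already in the file, rather than from a record count, keeps ids monotonic even if the header of some record fails to parse.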
Post-contest review
Because every record carries options_considered, evidence, the human's modeler_decision + modeler_rationale, and a supersedes chain, the log doubles as a decision diary. After the contest it renders, per record, a line such as: "At Q2-D01 you chose M2 over M3 because <modeler_rationale>." The supersedes chains show where the team's thinking changed, which is the most useful thing for a student to revisit. This is the learning payoff: the artifact that gated the contest is the study guide afterward, with zero extra authoring.
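One way to surface where the team changed its mind is to walk each record's `supersedes` link back to the decision it ultimately revises. The sketch below assumes the log has already been reduced to a hypothetical mapping from `decision_id` to the id it supersedes (or `None`); both function names are illustrative.

```python
def superseded_ids(supersedes):
    """Ids that a later record has replaced; these are history, not current decisions."""
    return {prior for prior in supersedes.values() if prior is not None}


def revision_chain(supersedes, decision_id):
    """Walk back from a record to the original decision it ultimately revises."""
    chain = [decision_id]
    while (prior := supersedes.get(chain[-1])) is not None:
        if prior in chain:
            raise ValueError("cycle in supersedes chain")
        chain.append(prior)
    return chain
```

A chain longer than one entry marks a decision the team revisited, which is the kind of record worth reviewing first.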
Rules
- Never author, suggest, or auto-fill `modeler_decision` or `modeler_rationale`. Pre-fill only `decision_id` / `options_considered` / `evidence` / `ai_suggestion`.
- Never decide which method is right, what a number means, or whether a result is good.
- Append-only. Never edit or delete a past record. A changed mind is a new record with `supersedes`.
- Do not log a `PENDING` artifact. Do not log an artifact that fails mechanical validation; return it to the gate.
- Validation is mechanical only (existence / floor / sentinel / near-verbatim-copy / evidence-citation). Do not grade the modeling content.
- A downstream narrative sentence claiming "why we chose X" must trace to a `modeler_rationale` in this log. If the human hasn't logged the why, the paper may not assert the why.
- Keep the mandatory decision-point set minimal (the four graded ones) to avoid authoring burden; the rest are opt-in.
Verification
Before handing off, verify:
- Every `DECIDED` artifact for this Qx is reflected as exactly one record.
- No `PENDING` artifact was logged.
- The mandatory decision points (`method_choice`, `result_verdict`, `confidence`, `assumption_necessity`) exist for any Qx marked Ready-for-Writer.
- The global index matches the per-Qx logs.
- No record was edited in place; revisions used `supersedes`.
- No human field was authored by this skill.
Failure modes
Stop and report a blocker if:
- A required decision artifact is `PENDING` (the human hasn't decided): the gate stays blocked; this is not an error, it's the design.
- A decision artifact fails mechanical validation (empty / sentinel / near-verbatim copy / no evidence citation): return it to the modeler with the exact field.
- A downstream skill asks this logger to author a rationale: refuse.
- A `result_verdict` record is stale (its underlying number changed): flag for re-confirmation, do not silently keep it.
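The stale-verdict case could be detected along the following lines. The text above does not say how the underlying number is tracked, so this sketch assumes two hypothetical mappings keyed by `decision_id`: the value each verdict was signed against and the value the current results give. A verdict whose current value is missing is treated as stale too.

```python
import math


def stale_verdicts(signed, current, rel_tol=1e-9):
    """Return decision_ids whose signed-against value no longer matches the current result."""
    stale = []
    for decision_id, signed_value in signed.items():
        now = current.get(decision_id)
        # A missing or changed number means the human must re-confirm the verdict.
        if now is None or not math.isclose(signed_value, now, rel_tol=rel_tol):
            stale.append(decision_id)
    return stale
```

The function only flags records; re-confirming or revising the verdict stays with the modeler, and any revision is appended as a new record with `supersedes`.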
Handoff
- To `final-method-explainer` / `solution-package-builder` / `paper-section-writer`: provide the `modeler_rationale` fields with `decision_id` tags as the narrative source.
- To `workflow-orchestrator`: report which mandatory decision points are present vs missing per Qx (gate input).
- To `consistency-auditor`: the log is a cross-media source; every "why we chose X" sentence in the paper must trace to a `decision_id` here.
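A minimal sketch of the provenance handoff, assuming the `<!-- from Q1-D03 -->` comment form is the tag consumers embed. Both helper names are hypothetical: one attaches the tag to a rationale returned by the logger, the other lists ids cited in a draft that have no record in the log.

```python
import re

TAG = re.compile(r"<!-- from (Q\d+-D\d+) -->")


def tag_rationale(rationale, decision_id):
    """Return the human's rationale unchanged apart from a trailing provenance tag."""
    return f"{rationale.strip()} <!-- from {decision_id} -->"


def untraceable_ids(draft, logged_ids):
    """Ids cited in the draft that do not exist in the decision log."""
    return set(TAG.findall(draft)) - set(logged_ids)
```

This catches a tag that points at a nonexistent record; a justification sentence with no tag at all would need a separate check by the consuming skill or the auditor.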
Relationship to the conventions
This skill operationalizes the Human Decision Artifact Convention (CLAUDE.md) by giving the scattered per-skill decision artifacts one canonical, append-only home, with the same immutability discipline as the Frozen Numbers Convention: append-with-supersedes is the decision-side analogue of unfreeze → modify → refreeze.