h-reason - SKILL.md Agent Skill

name: h-reason description: | Umbrella for FPF-style structured reasoning in a haft project. Carries the full reasoning palette in one place: framing, exploration, comparison, verification, notes, plus slideument patterns (NQD, Goldilocks, BLP, Scaling-Law Lens). Make sure to use this skill whenever the operator wants structured thinking but the workflow isn't pre-named — phrases like "давай подумаем", "помоги разобраться", "let's think this through", "structured approach", "apply FPF here", "FPF reasoning", "haft this" — or whenever a request is ambiguous between framing, exploration, and comparison. Also the manual entry point: /h-reason or /h-fpf alias. For sharp signals dedicated skills still fire (h-frame, h-diagnose, h-explore, h-compare, h-verify, h-status, h-note, h-onboard, h-spec-cover); binding choices use manual /h-decide; commissioning uses manual /h-commission. when_to_use: | Operator wants haft-discipline thinking but doesn't pre-select a specific workflow, OR explicitly types /h-reason. Specialized skills auto-fire on sharper signals; this umbrella catches the rest. argument-hint: "[reasoning topic — what to think about / what to figure out]" allowed-tools: Bash Read Grep Glob Agent Write Edit mcphafthaft_problem mcphafthaft_solution mcphafthaft_decision mcphafthaft_query mcphafthaft_note mcphafthaft_refresh mcphafthaft_spec_section

h-reason — FPF reasoning umbrella

You are running the haft umbrella reasoning workflow. This skill replaces the old narrow /h-fpf (still aliased — same skill) with a complete reasoning palette: framing, exploration, comparison, verification, notes, plus key patterns from the slideument that don't have dedicated skills (Goldilocks problem selection, NQD discipline, BLP, Scaling-Law Lens).

This is the manual entry point for FPF-style work when the operator doesn't pre-pick a specific skill. Specialized skills (h-frame, h-diagnose, h-explore, h-compare, h-verify) still auto-fire on sharp signals and you SHOULD prefer them when the signal is clear — they carry deeper procedures. Use this umbrella when:

The operator's signal is ambiguous between framing / exploration / comparison
The operator explicitly types /h-reason or /h-fpf
You want the whole palette of FPF wisdom (slideument patterns, FPF glossary) accessible in one place
The work spans multiple workflow steps (frame → explore → compare in one session)

The single most important rule: Description ≠ Work

This is the failure mode this umbrella is designed to NOT fall into.

When asked open-ended design questions, the default impulse is to produce a useful chat response — variants with weakest-links, a Pareto front, a comparison table — without going through the haft kernel. Stop. The visual shape of FPF output is not a substitute for invoking the kernel.

If you deliver an analysis without calling mcp__haft__* tools, the result is ephemeral: gone by tomorrow, no ProblemCard, no SolutionPortfolio, nothing to /h-verify in two weeks. The chat answer is wishlist, not work.

Concrete failure patterns to catch in yourself:

About to do this in chat...	Stop and do this instead
Present 3+ alternative approaches	Call `haft_problem(action="frame")` then `haft_solution(action="explore")`
Compare two approaches with trade-offs	Frame first if needed, then `haft_solution(action="compare")`
State what the "real problem" is	Call `haft_problem(action="frame")` and persist it
Suggest "remember this" inline	Call `haft_note`
Verify a past decision claim	Call `haft_decision(action="evidence")` and `haft_refresh`

Friction tradeoff (honest). Yes, calling kernel tools costs more in-the-moment than answering directly. The friction is the price for durability and measurability. Your job is not "best chat answer right now" — it is "leave the project with future-verifiable memory."

Self-check before long responses

Before sending a long response, ask:

Is this response presenting 3+ alternatives, a comparison, an analysis with recommendation, or framing what the problem is?
Did I call any mcp__haft__* tool in this turn?

If (1) = yes and (2) = no — STOP. Pick the right kernel action below and call it before responding.

Maintenance check (FPF B.3.4) — before reasoning

When entering this umbrella, look at the most recent kernel response for Refresh reminder: N days since last stale scan. If N > 30 — call mcp__haft__haft_refresh(action="scan") BEFORE doing reasoning work.

Reasoning on a stale graph is the same anti-pattern as reasoning on stale code: variants you generate may rediscover what was already decided, or contradict an already-decayed claim. Burn the 1-second scan first; reason against fresh state. If the scan finds nothing new — mention briefly, proceed.

Same discipline for drift detected on files touched in this session: re-baseline via haft_decision(action="baseline", ...) or surface the drift explicitly. Do not silently proceed past a drift warning.

Surfacing the reminder is the kernel's job; acting on it is the agent's job. See CLAUDE.md Critical Reminders.

Quick triage — what is the operator actually asking?

Before doing anything, read the operator's signal carefully and classify:

Signal	Workflow	Section below
Proposing a refactor/rewrite/redesign/migration without naming the problem	Framing	Framing
Concrete failure with unclear root cause (tests fail, X doesn't work)	Diagnosis	Diagnosis
Asking for options / variants / alternatives / "how could we"	Exploration	Exploration
Comparing 2+ approaches ("A or B", "which is better", "X vs Y")	Comparison	Comparison
Wants to commit to a specific variant (a binding choice)	Decision (manual)	Decision is manual
Wants to check if a past decision still holds	Verification	Verification
Wants to record a micro-decision with rationale	Note	Note
Wants situational awareness ("where are we", "what's stale")	Status	Call `haft_query(action="status")` directly
Repository has no `.haft/` yet	Onboarding	Delegate to `/h-onboard`
General FPF question ("what is FPF", "look up pattern E.9")	FPF spec lookup	Call `haft_query(action="fpf", query=...)`

If the signal is ambiguous (spans multiple workflows) — start with framing. Without a framed problem, exploration / comparison float.

Framing

Use when: a solution is proposed before the problem is stated, or scope/acceptance is unclear.

Procedure (compressed; full version in /h-frame):

Stabilize the signal. What's actually broken or wanted? In one sentence, what condition would make the operator say "solved"?
Type the problem — pick one:
- diagnosis — something broke, root cause unknown
- optimization — known target, want better numbers
- search — looking for an unknown thing (e.g., a library, a pattern)
- synthesis — building something new
Umbrella-word repair — if the signal uses overloaded terms ("service", "quality", "scalable", "готово"), unpack to specifics. Refuse to record a ProblemCard built on umbrella words.
Acceptance criterion — one observable condition that signals "minimum viable success". Without this, the problem can't be verified later.

Record:

mcp__haft__haft_problem(
  action="frame",
  problem_type="<diagnosis|optimization|search|synthesis>",
  title="<short title>",
  signal="<what's happening / what's needed>",
  acceptance="<observable condition>",
  constraints=["<hard limit>"],
  blast_radius="<what gets affected>",
  reversibility="<low|medium|high>",
  mode="<tactical|standard|deep>"
)

Surface the ProblemCard ID to the operator. They can edit, supersede, or move on.

Common mistakes:

Framing during code-grinding sessions (h-reason isn't for every task — code work doesn't need a ProblemCard)
Letting acceptance stay vague ("works better") instead of observable
Skipping problem-type → comparisons later anchor wrong

For deeper framing (B.4.1 + B.5.2 with parallel-rival generation on diagnosis problems), use /h-frame or /h-diagnose directly.

Diagnosis (failure investigation)

Use when: concrete failure, unclear cause. Stronger than framing — runs parallel rival-hypothesis testing.

Compressed procedure (full version in /h-diagnose):

Stabilize — what's the symptom in one sentence? When did it start? Reproducible?
Frame as problem_type="diagnosis" via haft_problem(action="frame").
Generate ≥3 rival hypotheses (FPF B.5.2 four-step abductive). Optionally spawn parallel Agent subagents — each takes one hypothesis and tests it independently against the codebase. Sequential is fine for lightweight investigations.
Filter for falsifiability — drop hypotheses with no testable prediction.
Rank by evidence weight. Keep losing rivals visible (CC-B.5.2-2) — do not collapse to a single answer prematurely.
Record findings as haft_solution(action="explore", variants=[...]) where each variant = one hypothesis + its evidence + its weakest_link.
Recommend next action. Usually: test the leading hypothesis, then /h-decide if a fix path is clear.

For full parallel-subagent dance use /h-diagnose directly.

Exploration (NQD variants)

Use when: problem is framed, operator wants 3-5 distinct candidate solutions.

Compressed procedure (full version in /h-explore):

Confirm there is a framed problem. If not, frame first (above). Without a frame, variants float.
Generate 3-5 distinct variants — each must differ in KIND, not just degree (FPF EXP-08). Parallel directions to consider:
- Data-flow restructure (avoid the operation)
- Algorithmic alternative (same op, different algo)
- Infrastructure swap (different runtime / service)
- Caching / batching / queuing (smooth load)
- Architectural extraction (move to different layer)
- Workflow restructure (change when/how triggered)
- Stepping-stone (suboptimal now, opens future search space)
Each variant carries:
- title
- description (2-3 sentences)
- novelty_marker — what makes this distinct from typical AI suggestions
- weakest_link — what bounds quality if pursued (NOT title repeated)
- stepping_stone: true|false (+ basis if true)
- risks + strengths

Record:

mcp__haft__haft_solution(
  action="explore",
  problem_ref="<prob-...>",
  variants=[...],
  no_stepping_stone_rationale="<if all stepping_stone=false>"
)

Kernel returns SolutionPortfolio ID + may emit soft warnings about disguised duplicates, missing parity_rules, or weakest_links that just repeat titles. Read and self-correct.
Recommend next step (usually /h-compare if 3+ variants).

For full parallel-subagent generation use /h-explore directly — it spawns one Agent per direction in parallel.

Comparison (parity + Pareto)

Use when: SolutionPortfolio exists with 2+ variants, operator wants to evaluate.

Compressed procedure (full version in /h-compare):

Characterize first. Declare comparison dimensions BEFORE scoring (kernel rejects subjective dimensions or stale ones):

mcp__haft__haft_problem(
  action="characterize",
  problem_ref="<prob-...>",
  dimensions=[
    {"name": "latency", "scale_type": "ratio", "unit": "ms", "polarity": "lower_better", "role": "target"},
    {"name": "ops_complexity", "scale_type": "ordinal", "unit": "1-5", "polarity": "lower_better", "role": "target"},
    {"name": "compliance_OK", "scale_type": "binary", "polarity": "true_better", "role": "constraint"},
    ...
  ],
  valid_until="<ISO date>"
)

Tag each dimension's indicator role: constraint (hard limit), target (optimize), or observation (Anti-Goodhart — watch but don't optimize).
Declare parity plan + selection policy BEFORE scoring. Equal budgets/windows for all variants. State the dominance rule (Pareto front, not scalar winner).
Score dim-wise — one evaluator per dimension applies one scale to all variants. Prevents anchoring bias.
Compute non-dominated set. Return Pareto front (NOT a scalar winner).

Record:

mcp__haft__haft_solution(
  action="compare",
  portfolio_ref="<sol-...>",
  parity_plan={...},
  selection_policy={...},
  results={"dimensions": [...], "scores": {...}}
)

Decision is manual. See next section.

For full dim-wise parallel scoring use /h-compare directly — it spawns one Agent per dimension.

Decision is manual

You CANNOT auto-fire /h-decide. It is disable-model-invocation: true per Transformer Mandate.

When the operator is ready to commit to a chosen variant from a SolutionPortfolio:

Surface the Pareto front summary
Recommend the variant you would pick (with rationale)
Tell the operator: «this needs your explicit /h-decide to bind»
Stop and wait

If they say "go ahead" without typing /h-decide — still stop. Ask them to type it. The principle is: agents generate options, humans bind.

Verification

Use when: a recorded DecisionRecord needs post-implementation reality check.

Compressed procedure (full version in /h-verify):

Find the decision via haft_query(action="decisions", filter="active") or specific ref.
Read predictions from the DRR — what was supposed to be true?
Gather evidence — run tests, check code, measure metrics, search docs.

Record:

mcp__haft__haft_decision(
  action="evidence",
  decision_ref="<dec-...>",
  evidence_items=[{
    "type": "<test|measurement|incident|review>",
    "verdict": "<supports|refutes|weakens>",
    "summary": "...",
    "carrier_ref": "<file:line or URL or test name>",
    "cl": "<CL3|CL2|CL1|CL0>"  // congruence level
  }]
)

After enough evidence: refresh the decision status:
- accepted (predictions held)
- weakened (mixed)
- superseded (replace with new decision)
```
mcp__haft__haft_refresh(action="<waive|supersede|deprecate>", artifact_ref="<dec-...>")
```

For full verify loop (drift detection across code/affected_files, FSRS-style intervals) use /h-verify directly.

Note (micro-decision)

Use when: small choice with rationale that doesn't justify full DRR ceremony.

mcp__haft__haft_note(
  text="<one or two sentences: what + why>",
  rationale="<the why is required — kernel rejects content-free notes>",
  links={
    "based_on": ["<prob-... or dec-...>"],
    "tags": [...]
  }
)

Common moments to note:

After resolving a non-trivial bug — capture cause + fix lesson
After ruling out a variant during exploration — capture why
After a user correction — capture the right approach
A surprising discovery in the codebase
A constraint discovered mid-work

Slideument patterns (Levenchuk's 2026 seminar wisdom)

These don't have dedicated skills but you should apply them when relevant:

Three factories. Engineering has three production lines:

Problem factory — generates new framed problems (problematization)
Solution factory — generates solutions for framed problems
Factory of factories — improves both production lines

In haft: ProblemCard archive = problem factory output; SolutionPortfolio = solution factory output. Bring this lens when the operator confuses "what problem to solve" with "how to solve it" — they're separate factories.

Goldilocks problem selection. Pick problems just above current capability (LLM heuristic: solvable in 10-20% of runs, not less, not more). Pre-pick checks:

Measurable acceptance (not vague)
Reversibility (can we undo?)
Stepping-stone potential (does solving open new search space?)
Multi-axis trade-off (otherwise not really a Pareto problem)
Valid_until on framing (postings rot)

Apply: when triaging which problem to work on next, surface these criteria explicitly.

NQD discipline (Novelty / Quality / Diversity).

Novelty — how different from known solutions
Quality — measurable use-value in acceptance spec
Diversity — coverage of independent niches by the portfolio

Never collapse NQD into a scalar. Hold a Pareto front. Keep 1-2 stepping-stones (high N or D, lower Q now) for future search-space expansion. Apply: in exploration, score each variant on NQD axes, not just utility.

BLP — Bitter Lesson Preference. When choosing between hand-tuned specialist and scalable universal agent, prefer the latter as tie-breaker. Apply: in comparisons that have tied Q-scores, BLP can break ties toward general-purpose / learnable approaches.

Scaling-Law Lens (SLL). When claiming "this approach scales", declare:

Scale variables (compute / data / capacity / iteration budget)
ScaleWindow (range over which claim holds)
ElasticityClass (rising / knee / flat / declining)

Don't say "X scales" without naming what S is, what window, what slope. Apply: in AI/agent-related comparisons especially.

Stepping stones (Lehman & Stanley 2015). Sub-optimal-by-Q solutions that open new search areas. Keep 1-2 in portfolio explicitly. Apply: in exploration, mark them; in comparison, don't drop them based on current Q alone.

Anti-Goodhart via Indicator Roles. Tag each dimension:

constraint — hard limit, must satisfy
target — what you optimize
observation — watch but DON'T optimize

If you optimize an observation, you're Goodharting. Apply: at characterization time, before any scoring.

FPF glossary (key concepts)

R_eff (Effective Reliability): computed trust score in [0,1]. R_eff = min(evidence_scores) with CL penalties. Never average — weakest-link principle.

CL (Congruence Level): how well evidence transfers across contexts.

CL3: same context (internal test) — no penalty
CL2: similar context (related project) — 0.1 penalty
CL1: different context (external docs) — 0.4 penalty
CL0: opposed context — 0.9 penalty

Evidence Decay: evidence has valid_until. Expired evidence scores 0.1 (weak, not absent). Graduated epistemic debt sorted by severity.

DRR (Decision Record): FPF E.9 four-component structure: Problem Frame, Decision/Contract, Rationale, Consequences. Created ONLY via manual /h-decide.

Indicator Roles: see Anti-Goodhart above.

Transformer Mandate: systems cannot transform themselves. Humans bind; agents document. Autonomous binding = protocol violation.

Strict Distinction (A.7): Object ≠ Description ≠ Carrier. Plan ≠ Reality. Design-time ≠ Run-time. Promise/commitment ≠ actual delivery.

Three systems always present:

Target system — what delivers value to users
Enabling system — what creates / governs the target (team + tooling + process)
Method — the way of creating it

Confusing these is the most common FPF anti-pattern.

When to delegate to a specialized skill

This umbrella covers compressed versions of the workflows. For deeper procedures (parallel subagent dancing, full parity discipline, complete drift detection), recommend the specialized skill:

Heavy version	When to delegate
`/h-frame`	Standard/deep mode with full B.4.1 stabilize-route + problem-typing
`/h-diagnose`	Need parallel rival hypothesis testing across the codebase
`/h-explore`	Need parallel Agent-per-direction NQD variant generation
`/h-compare`	Need parallel dim-wise scoring (one evaluator per dimension)
`/h-verify`	Full drift detection + Evidence Decay scan + FSRS-style scheduling
`/h-decide` (manual)	Always — binding action
`/h-commission` (manual)	Always — execution authority grant
`/h-onboard`	No `.haft/` directory yet
`/h-status`	Cheap dashboard, can call directly
`/h-spec-cover`	Module-level coverage analysis

For routing on natural language signal, the specialized skill descriptions ALSO auto-trigger — so if the signal is sharp, you may never enter this umbrella. That's fine: both routes converge on the same kernel.

What NOT to do in this umbrella

DO NOT auto-fire /h-decide or /h-commission. Transformer Mandate. Even if the user says "decide for me" — stop and ask for explicit slash command.
DO NOT skip haft_problem(action="frame") before exploration. Without framing, variants are wishlist.
DO NOT collapse NQD into a scalar. Always present Pareto front.
DO NOT use FPF jargon in your responses to the operator unless they used it first. Translate R_eff to "trust score", CL2 to "different project context", Evidence Decay to "stale evidence".
DO NOT replicate full procedures from specialized skills here. This umbrella's job is compressed coverage. For thoroughness — delegate.
DO NOT skip the kernel — even compressed procedures must persist via mcp__haft__* calls.

FPF spec lookup

For deep references on any concept:

mcp__haft__haft_query(action="fpf", query="<topic or pattern-id>")

Add full=true for complete pattern text, explain=true for guided explanation.

Common pattern IDs:

A.1 — Holonic Foundation (target vs enabling system)
A.7 — Strict Distinction (Clarity Lattice)
A.17 / A.18 — Characteristic Norm (dimension declaration)
B.3 — Trust & Assurance (R_eff, CL)
B.3.4 — Evidence Decay
B.4.1 — Observe → Notice → Stabilize → Route
B.5.2 — Abductive Loop (four-step rival generation)
B.5.2.1 — NQD-style Creative Abduction
C.18 — NQD-CAL (Open-Ended Search Calculus)
C.18.1 — Scaling-Law Lens
C.19 — Explore/Exploit Governor + BLP
E.9 — Design-Rationale Record method (DRR)
E.16 — Autonomy Budget
F.18 — Local-First Unification Naming