name: short-circuit-activation-discipline
description: The rule for safely activating any short-circuit (regex / static-map / Haiku gate) that bypasses Sonnet. Before activating, list the entities the LLM extracts today and verify the deterministic path produces them at the SAME fidelity, or plan a hybrid Haiku-extraction step. Intent precision is not extraction precision. Use before activating any Tier 0/E3/E4/E5 gate or any LLM bypass. Triggers - activate short-circuit, bypass Sonnet, static map, regex gate, Tier 0, E3, E4, E5 activation.
Short-Circuit Activation Discipline
Short-circuits save cost by skipping Sonnet — but the LLM is often doing MORE than classifying intent. Before activating any bypass, you must account for every sub-task the LLM currently handles, not just the label it emits.
Source of truth: docs/plans/2026-05-09-recovery-hardening-plan.md + the CLAUDE.md "Short-circuit activation discipline" rule. Pairs with model-routing-for-ambiguity and prompt-eval-gate.
The Rule
For EVERY short-circuit candidate (regex fast-path, static-map reply, Haiku gate), list the entities the LLM extracts today and verify the deterministic path can produce them at the SAME fidelity — OR plan to call Haiku/Sonnet for just-that-extraction as a hybrid step.
Intent-classification precision is not extraction precision. Don't trust a "100% precision" claim without scoping which sub-task it covers.
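One way to make the rule mechanical is to write the entity audit down as data and derive the plan from it. The sketch below is a minimal illustration under stated assumptions: the names EntityAudit, Fidelity, Plan, and planShortCircuit are hypothetical and do not come from the codebase, and the audit rows are illustrative values, not measurements.

```typescript
// Hypothetical audit record; none of these names exist in the codebase.
type Fidelity = "same" | "degraded" | "missing";

interface EntityAudit {
  entity: string; // e.g. "intent", "body", "send_at"
  deterministic: Fidelity; // how well the regex/static path reproduces it
}

type Plan =
  | { kind: "deterministic" }
  | { kind: "hybrid"; extractWithLlm: string[] };

// Any entity the deterministic path cannot reproduce at the same fidelity
// is routed to a just-that-extraction LLM step (the hybrid option).
function planShortCircuit(audit: EntityAudit[]): Plan {
  const gaps = audit
    .filter((row) => row.deterministic !== "same")
    .map((row) => row.entity);
  return gaps.length === 0
    ? { kind: "deterministic" }
    : { kind: "hybrid", extractWithLlm: gaps };
}

// Illustrative audit for a reminder-like intent (assumed values).
const exampleAudit: EntityAudit[] = [
  { entity: "intent", deterministic: "same" },
  { entity: "body", deterministic: "same" },
  { entity: "send_at", deterministic: "degraded" },
];

const examplePlan = planShortCircuit(exampleAudit);
```

Here examplePlan comes out as a hybrid plan naming send_at as the only gap, which is the "plan a hybrid step" branch of the rule.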
Canonical Case — E4 add_reminder (2026-05-17)
The regex ^תזכיר[יה]\s+(?:לי\s+)?(.{2,300}) had 100% precision in a 20-row eyeball — every match was a real reminder request.
But Sonnet wasn't only classifying. It was ALSO doing Hebrew natural-language time expansion (מחר בעשר, "tomorrow at ten" → ISO send_at) that parseReminderTime (index.ts) cannot do; it handles only structured forms such as 17:00 or ב-5 ("at 5").
Naively activating the short-circuit would either:
- Silently mis-fire times (ghost-reminder trust catastrophe), or
- Fall back to Sonnet anyway (defeating the cost saving).
Decision: E4 activation DEFERRED to E5 (Haiku-as-extractor handles intent + time expansion in one cheap call).
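The gap can be reproduced in a few lines. In the sketch below, the regex is the one quoted above; everything else is an assumption made for illustration. In particular, parseStructuredTime is a hypothetical stand-in that only recognises HH:MM and ב-H, and it is not the real parseReminderTime. The type GateResult and the function gateReminder are hypothetical as well.

```typescript
// The regex from the canonical case: it decides "this is a reminder request".
const REMINDER_RE = /^תזכיר[יה]\s+(?:לי\s+)?(.{2,300})/;

interface ClockTime {
  hour: number;
  minute: number;
}

// Hypothetical stand-in for a structured-only parser (NOT parseReminderTime).
// It recognises "17:00" style and "ב-5" style times and nothing else.
function parseStructuredTime(text: string): ClockTime | null {
  const hhmm = text.match(/(\d{1,2}):(\d{2})/);
  if (hhmm) return { hour: Number(hhmm[1]), minute: Number(hhmm[2]) };
  const bare = text.match(/ב-(\d{1,2})/);
  if (bare) return { hour: Number(bare[1]), minute: 0 };
  return null;
}

type GateResult =
  | { route: "deterministic"; body: string; time: ClockTime }
  | { route: "needs_llm_extraction"; body: string }
  | { route: "no_match" };

// Intent match alone is not enough: the time entity must also be recoverable.
function gateReminder(message: string): GateResult {
  const m = REMINDER_RE.exec(message);
  if (!m) return { route: "no_match" };
  const body = m[1] ?? "";
  const time = parseStructuredTime(body);
  return time
    ? { route: "deterministic", body, time }
    : { route: "needs_llm_extraction", body };
}

const structured = gateReminder("תזכירי לי לקנות חלב ב-17:00");
const natural = gateReminder("תזכירי לי לקנות חלב מחר בעשר");
```

Both messages match the regex, so intent precision looks perfect, yet only the first one yields a usable time. The second is exactly the case that would either mis-fire or fall back to an LLM.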
Why This Matters Most
A short-circuit that saves $0.01 but mis-fires a reminder time loses far more than it saves: a wrong time is a phantom-reminder trust failure. The whole tuple (intent + all extracted entities) must be correct, not just the label.
Activation Bars (Gate G1, per-intent, not global)
- ≤ 0.3% false-negative rate
- ≤ 1% false-positive rate
- ≥ 50 firings per intent
- ZERO canonical wrong-intent cases (the תזכירי לי מה ...? shape is pinned in tests/e5_corpus_pinning_test.ts)
E5 covers ~91% of Solo messages, so a 1% FN rate is a trust catastrophe at scale; the 0.3% bar holds the absolute false-negative count flat.
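The four bars above can be expressed as a single per-intent check. This is a minimal sketch, assuming the rates have already been computed upstream: the source does not define the FN/FP denominators, so the function takes the rates as inputs instead of guessing them. The names IntentStats, BARS, and failedBars are hypothetical.

```typescript
// Hypothetical per-intent measurement record; rates are fractions (0.003 = 0.3%).
interface IntentStats {
  intent: string;
  firings: number;
  fnRate: number;
  fpRate: number;
  canonicalWrongIntentCases: number;
}

// Thresholds taken from the Gate G1 list above.
const BARS = { maxFnRate: 0.003, maxFpRate: 0.01, minFirings: 50 };

// Returns the names of the bars that failed; an empty array means the intent passes.
function failedBars(stats: IntentStats): string[] {
  const failures: string[] = [];
  if (stats.fnRate > BARS.maxFnRate) failures.push("fn_rate");
  if (stats.fpRate > BARS.maxFpRate) failures.push("fp_rate");
  if (stats.firings < BARS.minFirings) failures.push("min_firings");
  if (stats.canonicalWrongIntentCases > 0) failures.push("canonical_wrong_intent");
  return failures;
}
```

Returning the list of failed bars, instead of a single boolean, keeps the reason for a blocked activation visible per intent.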
Checklist Before Activating Any Short-Circuit
- List every entity the current LLM path extracts for this intent (intent, body, time/ISO, target, amount, etc.).
- Prove the deterministic path produces each entity at the same fidelity — OR insert a Haiku/Sonnet extraction step for the gaps (hybrid).
- Measure per-intent FN/FP via offline replay (prompt-eval-gate) over ≥ 50 firings.
- Confirm zero canonical wrong-intent cases in the pinned corpus.
- Only then flip the activation flag; keep it reversible (readBotFlag, 10s TTL).
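The reversibility in the last step depends on the flag being re-read on a short TTL, so that flipping it off takes effect within seconds. The source names readBotFlag but does not show it, so the sketch below is a hypothetical makeCachedFlagReader that only illustrates the TTL idea. It is synchronous and takes an injectable clock purely to keep the example small and testable.

```typescript
// Hypothetical loader signature; a real one would query the settings store.
type FlagLoader = (name: string) => boolean;

interface CachedFlag {
  value: boolean;
  expiresAt: number;
}

// Caches each flag for ttlMs milliseconds, then reloads it on the next read.
function makeCachedFlagReader(
  load: FlagLoader,
  ttlMs: number,
  now: () => number = Date.now,
): (name: string) => boolean {
  const cache = new Map<string, CachedFlag>();
  return (name: string): boolean => {
    const current = now();
    const hit = cache.get(name);
    if (hit && current < hit.expiresAt) return hit.value;
    const value = load(name);
    cache.set(name, { value, expiresAt: current + ttlMs });
    return value;
  };
}
```

With a TTL of 10000 ms, a flag turned off in the store stops being served from cache at most 10 seconds later, which bounds how long a bad activation can keep firing.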
Existing Gates That Passed This Bar
The Tier 0 gates (pure-ack, bedtime, quick-undo) were safe to activate because their deterministic output fully matches what the LLM would produce for those narrow shapes:
- Pure-ack (≤12 chars, no ?, no digits, no Hebrew action verb): the LLM's only contribution is a brief acknowledgment, and the static map replicates it exactly.
- Bedtime (לילה טוב, "good night", × 2 in window): the LLM emits a standard goodnight reply, and the static reply is identical in effect.
- Quick-undo (תמחקי, "delete", or בטלי, "cancel", within 60s): the LLM would call remove_last_action; the deterministic path does the same via getLastBotAction.
add_reminder did not pass because time expansion is a non-trivial extraction gap that the deterministic path cannot close.
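To show how narrow such a shape is, the pure-ack conditions listed above can be written as a predicate. This is a sketch under assumptions, not the production gate: isPureAck and ACTION_VERBS are hypothetical names, and the verb list is illustrative only, since the real list is not given in the source.

```typescript
// Illustrative verb list only; the real set of Hebrew action verbs is not shown in the source.
const ACTION_VERBS = ["תזכירי", "תמחקי", "בטלי"];

// True when the message fits the pure-ack shape: short, no question mark,
// no digits, and no action verb from the (assumed) list.
function isPureAck(message: string): boolean {
  const text = message.trim();
  const length = Array.from(text).length; // count code points, not UTF-16 units
  if (length === 0 || length > 12) return false;
  if (text.includes("?")) return false;
  if (/\d/.test(text)) return false;
  return !ACTION_VERBS.some((verb) => text.includes(verb));
}
```

Every condition only narrows the shape further, so a message that passes carries nothing for an LLM to extract. That is the property add_reminder lacks.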
Shadow First, Then Activate
Run in shadow mode (fire-and-forget annotation, Sonnet still replies) long enough to measure real-world FN/FP against Sonnet's actual executor output. The offline replay harness (scripts/e5_offline_replay.py) is the reference tool. Only after the bars above are met should you flip the activation flag in bot_settings.
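A shadow comparison can be sketched as a tally over paired outputs, where a gate firing counts as correct only if the whole tuple matches what Sonnet actually did. The code below is a minimal illustration and not the replay harness: Action, ShadowSample, sameTuple, and tallyShadow are hypothetical names, and treating an entity mismatch as a false positive is an assumption about how errors are counted.

```typescript
interface Action {
  intent: string;
  entities: Record<string, string>;
}

interface ShadowSample {
  gate: Action | null; // null when the short-circuit did not fire
  sonnet: Action; // what Sonnet's executor actually produced
}

// Intent AND every extracted entity must agree.
function sameTuple(a: Action, b: Action): boolean {
  const keys = Object.keys(a.entities);
  return (
    a.intent === b.intent &&
    keys.length === Object.keys(b.entities).length &&
    keys.every((key) => a.entities[key] === b.entities[key])
  );
}

// Per-intent counts from shadow traffic (assumed error definitions, see lead-in).
function tallyShadow(samples: ShadowSample[], intent: string) {
  let firings = 0;
  let falsePositives = 0;
  let falseNegatives = 0;
  let sonnetPositives = 0;
  for (const sample of samples) {
    const gate = sample.gate;
    const expected = sample.sonnet.intent === intent;
    if (expected) sonnetPositives += 1;
    if (gate !== null && gate.intent === intent) {
      firings += 1;
      if (!sameTuple(gate, sample.sonnet)) falsePositives += 1;
    } else if (expected) {
      falseNegatives += 1;
    }
  }
  return { firings, falsePositives, falseNegatives, sonnetPositives };
}
```

Counting a right-intent, wrong-time firing as a failure is the point of the exercise: it is the error that intent-only precision hides.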