pivot-or-kill - SKILL.md Agent Skill

name: pivot-or-kill description: Use when 3 consecutive core experiments or optimizations fail to meet expectations — forces honest status assessment and escalates the pivot/downgrade/kill decision to the user

Pivot or Kill (Meta-Control Layer)

Overview

Repeated failure is signal, not noise. When 3 consecutive core experiments fail expectations, stop iterating and escalate. The agent does not get to decide whether to continue — the user does.

Core principle: Honesty over optimism. Escalate over iterate.

Violating the letter of this rule is violating the spirit of this rule.

The Iron Law

AFTER 3 CONSECUTIVE FAILURES, STOP EXPERIMENTING. ESCALATE TO USER.

No fourth attempt. No "quick fix." Escalate.

Trigger and Escalation Flow

digraph pivot_or_kill {
    rankdir=TB;
    detect [label="3 consecutive failures\ndetected" shape=doublecircle];
    summarize [label="Summarize status\n(what was tried, what failed,\ntrend direction)" shape=box];
    analyze [label="Analyze root cause" shape=box];
    present [label="Present 3 options\n(Pivot / Downgrade / Kill)" shape=box];
    decide [label="User chooses" shape=diamond];
    pivot [label="Pivot\nChange method direction" shape=box style=filled fillcolor="#d4edda"];
    downgrade [label="Downgrade\nLower venue target" shape=box style=filled fillcolor="#fff3cd"];
    kill [label="Kill\nTerminate and archive" shape=box style=filled fillcolor="#f8d7da"];

    detect -> summarize;
    summarize -> analyze;
    analyze -> present;
    present -> decide;
    decide -> pivot [label="pivot"];
    decide -> downgrade [label="downgrade"];
    decide -> kill [label="kill"];
}

Step 1 — Summarize Status Honestly

State plainly:

What was tried (each of the 3 attempts)
What failed (specific metrics, specific gaps)
Trend direction (improving, flat, or worsening)

Do NOT soften language. "Results declined across all three attempts" is correct. "Results showed some variability" is evasion.

Step 2 — Analyze Failure Root Cause

Classify the failure into one of these categories:

Category	Description	Signal
Method flaw	Core assumption is wrong	All variants fail similarly
Data issue	Insufficient, noisy, or wrong distribution	Results vary wildly or ceiling is low
Implementation bug	Code-level error	Should be caught by debugging, not this skill
Evaluation too strict	Unrealistic expectations	Baselines also fail to meet the bar

Be specific. "Something is off" is not a root cause.

Step 3 — Present Exactly Three Options

Present all three. No filtering, no pre-selecting.

a) Pivot — Change method direction.

Specify: what changes, what stays
Estimated additional time
Risk level (high/medium/low)

b) Downgrade — Lower venue target.

Specify: new target venue
What experiments can be cut
What story line changes

c) Kill — Terminate the project.

Archive all records and results
Document lessons learned
Free resources for other work

Step 4 — User Decides

THE AGENT CANNOT MAKE THIS DECISION. WAIT FOR USER.

Present the options. Stop. Do not nudge. Do not recommend. The user decides.

Red Flags — STOP

Attempting a 4th fix without user consultation
Downplaying failure severity ("close to working", "almost there")
Blaming infrastructure instead of method
Framing continued iteration as "one last thing"
Silently lowering success criteria instead of escalating

Rationalization Prevention

Excuse	Reality
"One more try might work"	You said that twice already. Escalate.
"I think I see the issue now"	You thought that before too. Escalate.
"Killing the project is too drastic"	Killing saves resources for better projects. It's responsible, not drastic.
"Let me just tune hyperparameters"	Tuning is not innovation. If 3 approaches failed, the problem is deeper.
"The baseline is just very strong"	That's useful information. Maybe this gap isn't solvable with your approach.

The Bottom Line

3 failures → stop → summarize → escalate → user decides

Honesty is not pessimism. Escalation is not failure. Continuing blindly is.