inversion-mental-model - SKILL.md Agent Skill

name: "inversion-mental-model"
description: "Use this skill when the agent needs a stronger way to reason about risks, failure modes, blind spots, or strategy."

Skill: Inversion for AI Agents

Purpose

Use this skill when the agent needs a stronger way to reason about risks, failure modes, blind spots, or strategy.

Inversion means: instead of asking only “How do I succeed?”, also ask:

“How could this fail?”
“What would make this worse?”
“What mistakes would create the opposite result?”
“What conditions would destroy the goal?”

This often reveals problems that forward reasoning misses.

Core Rule

Before recommending a path to success, first model the main paths to failure.

Inversion is especially powerful when:

the objective is vague
risks are hidden
systems are complex
optimism bias is strong
the agent is tempted to propose a shiny plan too quickly

When to Use

Use this skill when:

designing strategy
reviewing plans
evaluating systems
doing risk analysis
making recommendations under uncertainty
stress-testing decisions
planning launches, migrations, rollouts, or workflows
improving reliability, safety, quality, or trust

Inversion Questions

For any goal, ask:

Goal inversion

What would make us fail at this goal?
What would the opposite of success look like?

Risk inversion

What mistakes are most likely?
What assumptions, if false, would break the plan?

System inversion

What interactions could amplify damage?
What hidden dependencies make this fragile?

Human inversion

Where would confusion, overload, delay, or neglect appear?
What would a rushed or careless operator likely get wrong?

Process inversion

What shortcuts would create the worst outcomes?
What checks are missing?
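To reuse these questions across goals, one option is to keep them as data keyed by lens. The sketch below is only an illustration: the names `INVERSION_QUESTIONS` and `inversion_checklist` are hypothetical and not part of the skill, and the question text is copied from the list above.

```python
# Hypothetical helper: the five lenses and their questions, copied from the list above.
INVERSION_QUESTIONS = {
    "goal": [
        "What would make us fail at this goal?",
        "What would the opposite of success look like?",
    ],
    "risk": [
        "What mistakes are most likely?",
        "What assumptions, if false, would break the plan?",
    ],
    "system": [
        "What interactions could amplify damage?",
        "What hidden dependencies make this fragile?",
    ],
    "human": [
        "Where would confusion, overload, delay, or neglect appear?",
        "What would a rushed or careless operator likely get wrong?",
    ],
    "process": [
        "What shortcuts would create the worst outcomes?",
        "What checks are missing?",
    ],
}


def inversion_checklist(goal: str) -> str:
    """Render every question as a plain-text checklist for one goal."""
    lines = [f"Goal: {goal}"]
    for lens, questions in INVERSION_QUESTIONS.items():
        lines.append(f"{lens.capitalize()} inversion")
        lines.extend(f"  - {q}" for q in questions)
    return "\n".join(lines)


print(inversion_checklist("ship a successful launch"))
```

Keeping the questions in one structure makes it easy to confirm that no lens was skipped before moving on to the workflow.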

Standard Inversion Workflow

Step 1: Define the goal clearly

Examples:

make this process reliable
choose a good vendor
improve an onboarding flow
make this agent trustworthy
ship a successful launch

Step 2: State the opposite

Ask:

How would we systematically fail?
What would sabotage this goal?
What would create the worst realistic version of the outcome?

Step 3: Generate failure modes

Create a list of:

direct failures
indirect failures
delayed failures
operator failures
coordination failures
measurement failures
incentive failures

Step 4: Prioritize the failure modes

Rank by:

likelihood
severity
detectability
reversibility

Do not treat all risks equally. A failure that is likely, severe, hard to detect, and hard to reverse deserves attention first.
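One possible way to turn the four ranking criteria into an ordering is a simple multiplicative score, loosely modeled on the risk priority number used in FMEA. Everything specific here is an assumption for illustration: the 1 to 5 scale, the multiplication, the field names, and the sample failure modes. Detectability and reversibility are expressed as difficulty so that a higher number always means worse.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    likelihood: int            # 1 = rare, 5 = almost certain
    severity: int              # 1 = minor, 5 = catastrophic
    detection_difficulty: int  # 1 = obvious immediately, 5 = silent
    recovery_difficulty: int   # 1 = trivially undone, 5 = irreversible

    def priority(self) -> int:
        """Assumed scoring rule: multiply the four ratings."""
        return (
            self.likelihood
            * self.severity
            * self.detection_difficulty
            * self.recovery_difficulty
        )


def rank(modes: list[FailureMode]) -> list[FailureMode]:
    """Sort failure modes from highest to lowest priority."""
    return sorted(modes, key=lambda m: m.priority(), reverse=True)


# Made-up examples for a hypothetical launch.
modes = [
    FailureMode("typo in release notes", 4, 1, 1, 1),
    FailureMode("silent data loss during migration", 2, 5, 5, 5),
    FailureMode("login outage at launch", 3, 4, 1, 2),
]

for m in rank(modes):
    print(m.priority(), m.name)
```

With these sample ratings the unlikely but silent and irreversible data loss ranks first, ahead of the more likely but visible and recoverable failures. A score like this is a starting point for discussion, not a substitute for judgment.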

Step 5: Convert failures into guardrails

For each high-priority failure mode, define:

prevention
detection
containment
rollback or recovery

These guardrails, not the list of risks, are the real output of inversion.
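The four parts of a guardrail can be recorded in a small structure so that gaps are visible. This is a sketch under assumptions: the `Guardrail` class, the `missing_controls` helper, and the sample control text are hypothetical and not prescribed by the skill.

```python
from dataclasses import dataclass, fields


@dataclass
class Guardrail:
    failure_mode: str
    prevention: str = ""
    detection: str = ""
    containment: str = ""
    recovery: str = ""  # rollback or recovery


def missing_controls(guardrail: Guardrail) -> list[str]:
    """Names of the four control fields that are still empty."""
    return [
        f.name
        for f in fields(guardrail)
        if f.name != "failure_mode" and not getattr(guardrail, f.name).strip()
    ]


# Illustrative entry; the control text is made up for the example.
g = Guardrail(
    failure_mode="agent acts on stale facts",
    prevention="require a fresh lookup before any high-risk action",
    detection="log the age of every cited source",
)

print(missing_controls(g))  # ['containment', 'recovery']
```

A failure mode with empty fields is only partly guarded. Prevention without detection, for example, fails silently when the prevention itself breaks.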

Example Pattern

Goal: Make this AI workflow trustworthy.

Inversion: How would I make it untrustworthy?

Possible answers:

let it act with weak verification
hide uncertainty
let it use stale facts
allow silent assumption jumps
optimize speed over evidence on high-risk tasks
give it unclear tool boundaries

Guardrails:

ETTO (efficiency-thoroughness trade-off) preflight
explicit uncertainty
evidence thresholds
tool-scope limits
validation steps
rollback rules
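The example lists failure answers and guardrails separately without saying which guardrail addresses which failure. The sketch below shows one plausible pairing and checks it for gaps. The pairing itself is an assumption made for illustration, as are the names `COVERAGE`, `uncovered_failures`, and `unused_guardrails`.

```python
# Guardrail names copied from the example above.
GUARDRAILS = [
    "ETTO preflight",
    "explicit uncertainty",
    "evidence thresholds",
    "tool-scope limits",
    "validation steps",
    "rollback rules",
]

# Failure answers copied from the example above.
# The mapping to guardrails is an assumed pairing, not stated in the skill.
COVERAGE = {
    "let it act with weak verification": ["validation steps", "rollback rules"],
    "hide uncertainty": ["explicit uncertainty"],
    "let it use stale facts": ["evidence thresholds"],
    "allow silent assumption jumps": ["explicit uncertainty", "validation steps"],
    "optimize speed over evidence on high-risk tasks": ["ETTO preflight", "evidence thresholds"],
    "give it unclear tool boundaries": ["tool-scope limits"],
}


def uncovered_failures(coverage: dict[str, list[str]]) -> list[str]:
    """Failure answers that no guardrail addresses."""
    return [failure for failure, guards in coverage.items() if not guards]


def unused_guardrails(coverage: dict[str, list[str]], guardrails: list[str]) -> list[str]:
    """Guardrails that are not tied to any failure answer."""
    used = {g for guards in coverage.values() for g in guards}
    return [g for g in guardrails if g not in used]


print(uncovered_failures(COVERAGE))             # []
print(unused_guardrails(COVERAGE, GUARDRAILS))  # []
```

Checking in both directions catches two different problems: a failure path nobody blocked, and a control that exists without a stated reason.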

Best Use Cases

safety reviews
launch planning
process design
evaluation design
incident prevention
model behavior guardrails
product strategy
reliability planning
hiring and organizational decisions
personal decision support

Failure Modes This Skill Prevents

1) Pure optimism

The agent only plans for the happy path.

Counter: Model concrete ways to fail first.

2) Hidden fragility

The agent recommends a plan without stress-testing assumptions.

Counter: Ask what breaks the plan.

3) Shallow risk analysis

The agent names generic risks but does not operationalize them.

Counter: Convert failure modes into guardrails and detection signals.

4) Forward-only reasoning

The agent misses obvious negatives because it only thinks about how to succeed.

Counter: Use the opposite-outcome lens.

Prompt Snippets

For strategy

“Use inversion. Do not start by asking how to achieve the goal. First ask how we would fail, what would sabotage success, and what guardrails should block those paths.”

For safety

“Invert the objective and identify the most likely and most damaging ways this could go wrong. Then turn those into prevention and detection controls.”

For planning

“Stress-test this plan by reasoning backward from failure.”

For AI behavior

“List the ways an agent like this would become unreliable, manipulative, brittle, or unsafe, then design the operating rules that prevent that.”

Definition of Done

Inversion was applied correctly when:

the goal was clearly defined
meaningful failure modes were generated
major risks were prioritized
failure modes were translated into controls
the final recommendation was stronger because it was stress-tested
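The criteria above can be applied as a self-check before the agent returns its recommendation. This is a minimal sketch: `DONE_CRITERIA` copies the list above, while `unmet_criteria` and the sample review are hypothetical.

```python
# Criteria copied from the list above.
DONE_CRITERIA = [
    "the goal was clearly defined",
    "meaningful failure modes were generated",
    "major risks were prioritized",
    "failure modes were translated into controls",
    "the final recommendation was stronger because it was stress-tested",
]


def unmet_criteria(review: dict[str, bool]) -> list[str]:
    """Return criteria that are missing from the review or marked False."""
    return [c for c in DONE_CRITERIA if not review.get(c, False)]


# Hypothetical review of an analysis that stopped after prioritization.
review = {
    "the goal was clearly defined": True,
    "meaningful failure modes were generated": True,
    "major risks were prioritized": True,
}

for criterion in unmet_criteria(review):
    print(f"Not done: {criterion}")
```

Treating a missing entry as unmet is a deliberate choice in this sketch, so a skipped step cannot pass by omission.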

Final Instruction

Before you ask how to win, ask how you lose.

Then remove the losing paths.