mental-models - SKILL.md Agent Skill

name: mental-models
description: >
  Applies mental models to decisions, tradeoffs, failures, biases, complexity, epistemology. Use when: think through, trade-offs/pros/cons, should I, analyze decision, missing angles, steel man, devil's advocate, debug this, root cause, five whys, post/pre-mortem, overthinking, analysis paralysis, what went wrong, why did this fail, what could go wrong. Triggers: too complex, over-engineering, keep it simple/KISS, grug brain, ponytail, premature abstraction, clever code, one/two-way door, opportunity cost, exploit vs explore, decision paralysis, bottleneck, feedback loop, leverage point, emergent, compounding, technical debt, bias, sunk cost, anchoring, confirmation bias, Dunning-Kruger, impostor syndrome, incentives, Goodhart, Chesterton's fence, antifragile, margin of safety, hidden assumptions, framing, bad faith, first principles, defense in depth, wrong framework, circle of competence, scout mindset, second order effects, unintended consequences, gestalt principles. Do NOT use for implementation tasks.

Select and apply the most relevant mental models from this toolkit. For models with full protocols, read the linked file before applying.

Decision Making

Exploit-Explore Tradeoff — Every decision allocates between exploit (refine what works) and explore (try something new). Pure exploit converges to a local maximum; pure explore never compounds. Calibrate the ratio: short time horizon or high stakes → exploit; diminishing returns or changing environment → explore. Maps to grind vs paradigm in recursive self-improvement. See exploit-explore.md.
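One common way to make the ratio explicit is an epsilon-greedy rule: with a small probability try a random option, otherwise take the best-known one. The sketch below is an illustration only; `choose_option`, `explore_rate`, and the example values are assumptions and are not taken from exploit-explore.md.

```python
import random


def choose_option(estimated_values, explore_rate, rng=random):
    """Epsilon-greedy choice: explore with probability explore_rate, else exploit."""
    if rng.random() < explore_rate:
        # Explore: pick any option uniformly at random.
        return rng.randrange(len(estimated_values))
    # Exploit: pick the option with the best estimate so far.
    return max(range(len(estimated_values)), key=lambda i: estimated_values[i])


# Hypothetical payoff estimates for three approaches.
estimates = [0.2, 0.9, 0.5]
pure_exploit = choose_option(estimates, explore_rate=0.0)
```

Raising `explore_rate` suits a changing environment or diminishing returns; lowering it toward 0 suits a short horizon or high stakes.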

Inversion — Instead of asking "how do I succeed?", ask "what would guarantee failure?" and avoid those. Unblocks stuck thinking by reversing the question. See inversion.md.

Second-Order Thinking — Ask "and then what?" at least three times. First-order effects are obvious; second and third-order effects are where plans fail. See second-order-thinking.md.

Reversibility — Is this a two-way door (easily reversible) or a one-way door (permanent)? Two-way: move fast. One-way: invest in getting it right. Most decisions are two-way doors treated as one-way. See reversibility.md.

Opportunity Cost — Every yes is a no to something else. What are you NOT doing by choosing this? The hidden cost of any choice is the best alternative forgone. See opportunity-cost.md.

Pre-Mortem — Before starting, imagine the project failed. What went wrong? Fix those things now. More useful than post-mortems because you can still act. See pre-mortem.md.

Dimensionalize — Transform complex decisions into 3-7 measurable dials scored on Fidelity (is it real?), Leverage (can you twist it, does twisting matter?), and Complexity (can you hold it in your head?). Drop anything vague, uncontrollable, or redundant. Read dimensionalize.md for full protocol — this has a specific scoring process.

Bayesian Thinking — Update beliefs proportionally to evidence strength. Start with a prior, observe data, update. Strong priors need strong evidence to move. Weak evidence shouldn't flip strong beliefs. See bayesian-thinking.md.
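The update step can be sketched with Bayes' rule for a single yes/no belief. The function name `update` and the probabilities below are illustrative assumptions, not values from bayesian-thinking.md.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a belief after seeing the evidence (Bayes' rule)."""
    true_and_seen = prior * p_evidence_if_true
    false_and_seen = (1 - prior) * p_evidence_if_false
    return true_and_seen / (true_and_seen + false_and_seen)


# Weak contrary evidence barely moves a strong prior.
after_weak_evidence = update(0.95, 0.4, 0.6)
# Strong contrary evidence moves the same prior a long way.
after_strong_evidence = update(0.95, 0.01, 0.99)
```

With these assumed numbers the belief stays above 0.9 after the weak evidence and falls below 0.2 after the strong evidence.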

Systems Thinking

Bottleneck Analysis — The system is only as fast as its slowest component. Find the constraint before optimizing anything else. Optimizing non-bottlenecks is waste. See bottleneck-analysis.md.
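For a strictly serial pipeline in steady state, end-to-end throughput is capped by the slowest stage. A minimal sketch, assuming hypothetical stage names and rates in items per second:

```python
def find_bottleneck(stage_rates):
    """Return (stage, rate) for the slowest stage; that rate caps the pipeline."""
    stage = min(stage_rates, key=stage_rates.get)
    return stage, stage_rates[stage]


# Hypothetical pipeline, in items per second.
pipeline = {"parse": 400, "enrich": 50, "write": 300}
stage, throughput = find_bottleneck(pipeline)

# Doubling a non-bottleneck stage leaves end-to-end throughput unchanged.
faster_parse = dict(pipeline, parse=800)
stage_after, throughput_after = find_bottleneck(faster_parse)
```

Here the constraint is `enrich` both before and after the `parse` speed-up, which is why effort spent on `parse` is wasted.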

Feedback Loops — Identify reinforcing loops (growth/collapse spirals) and balancing loops (stability). Most bugs in complex systems are feedback loops you didn't see. See feedback-loops.md.

Map vs. Territory — Your model of the system is not the system. When the model disagrees with reality, update the model, not your perception of reality. See map-vs-territory.md.

Emergence — Simple rules produce complex behavior. Look for the simple rules underneath complex systems before adding complexity to your solution. See emergence.md.

Leverage Points — Places in a system where small changes produce big effects. Meadows' hierarchy: parameters < buffers < feedback loops < information flows < rules < goals < paradigms. Intervene at the highest leverage point you can reach. See leverage-points.md.
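The selection rule can be sketched as a lookup over the ordering listed above. `highest_reachable` is a hypothetical helper, and the list is the abbreviated hierarchy from this entry, not Meadows' full list.

```python
# Ordering as listed above, from lowest to highest leverage.
LEVERAGE_ORDER = [
    "parameters", "buffers", "feedback loops",
    "information flows", "rules", "goals", "paradigms",
]


def highest_reachable(reachable):
    """Return the highest-leverage point among those you can actually change."""
    candidates = [point for point in LEVERAGE_ORDER if point in reachable]
    if not candidates:
        raise ValueError("no known leverage point is reachable")
    return candidates[-1]


# Hypothetical situation: you can tune parameters and buffers, and change rules.
best = highest_reachable({"parameters", "buffers", "rules"})
```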

Recursive Self-Improvement — When a system improves itself, distinguish grind (80%, systematic iteration within current paradigm) from paradigm shift (20%, entirely new approaches). Small improvements compound multiplicatively (multiplier effect), but hit diminishing returns ("adding nines"). Agents excel at grind; paradigm decisions need human judgment. See recursive-self-improvement.md.

Compounding Change Over Time — Small persistent changes accumulate nonlinearly. A 1% daily improvement compounds to roughly 37.8x over a year (1.01^365); a 1% daily decline shrinks to under 3% of the starting value (0.99^365). Applies beyond finance: technology adoption, skill development, institutional drift, technical debt. People systematically underestimate slow change because each increment is invisible. Cross-references the multiplier math in recursive-self-improvement.md. Source: Tyler Cowen / Marginal Revolution. See compounding-change.md.
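The arithmetic behind the daily-change figures can be checked directly. This is a small sketch; the `compound` helper is an illustrative name and assumes the same fractional change is applied every day for 365 days.

```python
def compound(daily_rate, days=365):
    """Multiplier after applying the same fractional change every day."""
    return (1 + daily_rate) ** days


growth = compound(0.01)    # 1% better each day
decline = compound(-0.01)  # 1% worse each day
print(f"growth: {growth:.1f}x, decline: {decline:.3f}x")
```

The improvement case comes out just under 38x and the decline case at about 2.6% of the starting value.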

Problem Solving

First Principles — Decompose to fundamental truths. Reason up from there instead of reasoning by analogy. Breaks through "we've always done it this way." See first-principles.md.

Occam's Razor — The simplest explanation that fits the facts is usually correct. Don't add complexity until simple explanations fail. In debugging: the boring explanation is almost always right. See occams-razor.md.

Pareto Principle — Roughly 80% of the effect comes from roughly 20% of the causes. Find that 20% before optimizing the rest. Applies to bugs, features, customers, and effort. See pareto-principle.md.
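Finding the vital few can be sketched as ranking causes by effect and accumulating until a target share is covered. The `vital_few` helper and the bug counts are hypothetical. The 80/20 split is a rule of thumb, so check the actual distribution rather than assuming it.

```python
def vital_few(effect_by_cause, share=0.8):
    """Smallest set of top causes whose combined effect reaches `share` of the total."""
    total = sum(effect_by_cause.values())
    ranked = sorted(effect_by_cause.items(), key=lambda kv: kv[1], reverse=True)
    chosen, running = [], 0
    for cause, effect in ranked:
        chosen.append(cause)
        running += effect
        if running >= share * total:
            break
    return chosen


# Hypothetical bug counts per module.
bugs = {"parser": 55, "auth": 30, "ui": 8, "docs": 4, "misc": 3}
focus = vital_few(bugs)
```

In this made-up data, two of the five modules account for 85 of 100 bugs, so they are where to look first.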

Five Whys — When something breaks, ask "why?" five times. Each answer becomes the subject of the next question. Stops you from fixing symptoms. See five-whys.md.

Post-Mortem — After a failure, run a structured diagnosis before retrying. State what happened, what you expected, trace root cause, classify the failure (wrong assumption, wrong approach, scope error, flaky), then plan a fix based on the classification. Prevents blind retry loops. See post-mortem.md.

OODA Loop — Observe → Orient → Decide → Act, then repeat faster than the environment changes. Speed of the loop matters more than perfection of any step. See ooda-loop.md.

Critical Analysis

These are heavy analytical protocols. Read the linked files before applying — each has a specific multi-step process with output schemas that Claude should follow.

Antithesize — Generate standalone opposition to any proposition. Not refutation — a complete alternative worldview that stands on its own, accepts the same facts, and reaches opposite conclusions. Menu of 14 antithesis types (refutation, rival thesis, objective flip, axis shift, causal inversion, etc.) selected by purpose: falsify, replace, robustify, clarify values, or reframe. Read antithesize.md for full protocol.

Excavate — Assumption archaeology. Recursively ask "what must be true for this to make sense?" and tag each assumption: empirical, normative, structural, psychological, or definitional. Surfaces the cruxes where disagreement actually lives. Read excavate.md for full protocol.

Negspace — Detect what's conspicuously absent. Given the statistical structure of a text, what argument should be there but isn't? Classify each omission: vulnerability (ego protection), upside (ambition withheld), bedrock (unstated axioms), blind spot (invisible to author), or optionality (strategic non-commitment). Read negspace.md for full protocol.

Rhetoricize — Separate substance from spin. Extract facts into a ledger, then perturb framing (swap connotations, shift voice, toggle modality) while holding facts constant. The fulcrum is the word or grammatical move that most changes how the argument lands. Surprise = affect_shift × meaning_overlap × fluency. Read rhetoricize.md for full protocol.

Handlize — Extract executable residue from dense arguments. For each concept, test: would it change a decision (actionability)? Can it be measured (operationalizability)? Is it genuinely new (novelty)? Classify as live / burned / dead. Null output is honest — not everything contains handles. Read handlize.md for full protocol.

Inductify — Extract non-obvious structural commonalities across multiple examples (n≥2). Decompose each case into mechanisms, assumptions, values, constraints. Cross-reference for structural isomorphisms. Each pattern must specify mechanism, predictive claim, and breaking condition. Read inductify.md for full protocol.

Synthesis & Mapping

These protocols integrate or transfer knowledge across domains. Read the linked files — each has specific processes.

Synthesize — Compress conflicting positions into a decision-sufficient framework. Not "both sides have a point" — a NEW structure that explains why both positions seemed true from their angles, makes novel predictions neither makes alone, and tracks what was simplified (drop-log). Produces tiered outputs: quick (50w), medium (150w), deep (300w+). Read synthesize.md for full protocol.

Rhyme — Fast structural similarity detection. Maps novel inputs onto known patterns through echo recognition. The pre-analytical step before deep mapping. Generate 3-5 candidates, score on parallel density, source maturity, transfer leverage. Quality threshold: ≥0.6 on key dimensions. Read rhyme.md for full protocol.

Metaphorize — Build explicit, high-coverage mapping from source domain to target domain. Heavier than rhyme, lighter than formal proof. When source has math, carry the math with units and dimensional analysis. Produces mapping table, formula shelf, invariant assertions, and metric plan. Read metaphorize.md for full protocol.

Action & Rigor

Swiss Cheese Model (James Reason) — Every defense layer has holes. Disasters happen when holes in multiple layers align, allowing a hazard to pass through every defense. Safety comes from layering imperfect defenses so their holes don't overlap. When you find a hole, add another slice of cheese — don't just label the hole. See swiss-cheese-model.md.
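Why layering works can be shown with a toy calculation. The sketch assumes each layer misses a hazard independently of the others, which is the "holes don't overlap" condition; the miss rates are made up for illustration.

```python
import math


def breach_probability(layer_miss_rates):
    """Chance a hazard slips through every layer, assuming independent failures."""
    return math.prod(layer_miss_rates)


one_layer = breach_probability([0.1])
three_layers = breach_probability([0.1, 0.1, 0.1])
```

Three independent layers that each miss 10% of hazards let through about 0.1% overall. If the layers share the same hole, the independence assumption fails and the benefit shrinks toward that of a single layer.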

Countermeasure, Not Caveat — When you identify a weakness in your approach, convert it into an action, not a footnote. A caveat in the report acknowledges a gap exists. A countermeasure in the methodology closes the gap. These feel the same — both demonstrate understanding — but only one changes the outcome. Extends the Swiss Cheese Model: finding a hole is the observation; adding a layer is the countermeasure; documenting the hole without adding a layer is the caveat trap. Read countermeasure-not-caveat.md for full protocol — this has a specific checkpoint process.

Cognitive Biases & Traps

These require recognition, not protocol. Knowing the name is usually enough to catch yourself.

Confirmation Bias — Tendency to seek, interpret, and remember information confirming existing beliefs. Counter: actively seek disconfirming evidence. Ask "what would change my mind?" before analyzing. See confirmation-bias.md.

Sunk Cost Fallacy — Continuing an endeavor because of past investment (time, money, effort) rather than future value. The money is already spent regardless. Decision: "knowing what I know now, would I start this?" See sunk-cost-fallacy.md.

Anchoring — Over-relying on the first piece of information encountered. First numbers, first impressions, first estimates disproportionately influence subsequent judgments. Counter: generate your own estimate before looking at others'. See anchoring.md.

Dunning-Kruger Effect — Unskilled individuals overestimate their ability; skilled individuals underestimate theirs. The less you know, the less you know about how much you don't know. See dunning-kruger.md.

Goodhart's Law — When a measure becomes a target, it ceases to be a good measure. People optimize the metric, not the underlying goal. Every KPI eventually gets gamed. Counter: use counter-metrics and measure what you refuse to sacrifice. See goodharts-law.md.

Never Reason from a Price Change — A price change is an outcome, not a cause. "Oil prices rose, so consumers will spend less" skips the crucial question: why did the price change? A supply shock and a demand surge produce the same price movement but opposite downstream effects. Always identify the cause first, then reason from the cause. Source: Scott Sumner, The Money Illusion. See never-reason-from-price-change.md.
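A toy linear market makes the point concrete. The sketch assumes demand Q = a - b*P and supply Q = c + d*P; the function name and all coefficients are illustrative assumptions, not taken from the source.

```python
def equilibrium(demand_intercept, demand_slope, supply_intercept, supply_slope):
    """Solve a - b*P = c + d*P for price P and quantity Q in a linear toy market."""
    price = (demand_intercept - supply_intercept) / (demand_slope + supply_slope)
    quantity = supply_intercept + supply_slope * price
    return price, quantity


baseline = equilibrium(100, 2, 10, 1)
demand_surge = equilibrium(130, 2, 10, 1)   # buyers want more at every price
supply_shock = equilibrium(100, 2, -20, 1)  # sellers offer less at every price
```

Both scenarios raise the price from 30 to 40, but quantity rises to 50 under the demand surge and falls to 20 under the supply shock. The price alone does not say which world you are in.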

Universal Love, Said the Cactus Person (Scott Alexander) — Some problems can't be solved from within the framework that generated them. When you keep pressing different dashboard buttons and nothing works, the answer may be "get out of the car." The demand for instrumental proof, the need to optimize and measure — sometimes these ARE the obstacle, not the path to the solution. The Car Test: if every new attempt is a variation on the same method, you may need to step outside the framework entirely. See cactus-person.md.

Strategic Thinking

Chesterton's Fence — Before removing something, understand why it was put there. If you don't understand its purpose, you don't understand the consequences of removing it. Applies to code, processes, and institutions. See chestertons-fence.md.

Circle of Competence — Know the boundary of what you actually understand vs. what you think you understand. Operating inside your circle: genuine knowledge. Outside: overconfidence. The skill is knowing where the edge is. See circle-of-competence.md.

Antifragility — Some systems don't just survive stress — they get stronger from it. Fragile breaks under volatility; robust survives; antifragile improves. Design for antifragility: small reversible bets, option-rich positions, barbell strategy. See antifragility.md.

Margin of Safety — Build in buffer for the unknown. Engineers over-spec bridges; investors buy below intrinsic value. The margin between your estimate and disaster is your insurance against being wrong. See margin-of-safety.md.

Lindy Effect — The longer a non-perishable thing has survived, the longer its expected remaining lifespan. A book in print for 50 years will likely be in print for 50 more. Prefer battle-tested over novel when the cost of failure is high. See lindy-effect.md.

Coordination & Incentives

Tragedy of the Commons — Individually rational behavior depletes shared resources. Each actor takes more because the cost is distributed. Solutions: privatize, regulate, or create social norms that make defection costly. See tragedy-of-the-commons.md.

Principal-Agent Problem — Arises when one party (agent) acts on behalf of another (principal) but has different incentives. The agent optimizes for themselves, not the principal. Solutions: align incentives, monitor, or reduce information asymmetry. See principal-agent.md.

Hanlon's Razor — Never attribute to malice what can be adequately explained by ignorance, incompetence, or misaligned incentives. Most failures are systemic, not conspiratorial. See hanlons-razor.md.

Forcing Function — A constraint that forces confrontation with an issue. Deadlines, budgets, public commitments, and ship dates are forcing functions. Without them, decisions defer indefinitely. See forcing-function.md.

Design & Communication

Pattern Language (Christopher Alexander) — Solutions to recurring problems exist as interconnected patterns. A pattern has: context, problem, forces in tension, solution, consequences. Good architecture is a network of patterns that reinforce each other. See pattern-language.md.

Small Multiples (Edward Tufte) — Show the same structure repeated with one variable changed. Enables comparison without cognitive load. Works for charts, UI states, code examples, test cases. See small-multiples.md.

Gestalt Principles (Wertheimer / Koffka) — The mind perceives structure before it reads content. Proximity, similarity, common region, continuity, closure, common fate, and figure/ground decide what looks grouped — independent of the words. Arrange elements so layout carries the structure (whitespace groups more cheaply than lines) and the reader gets it for free; fight these laws and no prose rescues it. Overarching law: Prägnanz — people see the simplest available interpretation. See gestalt-principles.md.

Scout Mindset (Julia Galef) — Your goal is to see what's actually there, not to build a case for what you want to see. Treat being wrong as an update, not a failure. The measure of good reasoning is calibration, not confidence. See scout-mindset.md.

Hoare's Dictum — Two ways to build software: so simple there are obviously no deficiencies, or so complicated there are no obvious deficiencies. The first is far harder — and far better. Prefer designs you can prove correct by inspection over designs too complex to find bugs in. See hoares-dictum.md.

The Grug Brained Developer (Carson Gross) — Complexity is the eternal enemy — a "spirit demon" that invades codebases through well-meaning developers. Fight it by saying no, delaying abstractions until natural cut points emerge, preferring integration tests, keeping refactors small, logging generously, and openly admitting when something is too complex (FOLD). Apply 80/20 to features and architecture. When "big brain" solutions are proposed, ask whether the simpler version would actually work. See grug-brain.md.

Ponytail (The Lazy Senior Dev) (Dietrich Gebert) — The laziest solution that actually works. Before writing code, climb a ladder and stop at the first rung that holds: does it need to exist (YAGNI)? already in the codebase? stdlib? native platform feature? installed dependency? one line? — only then write the minimum that works. The ladder runs after understanding the problem, never instead of it; a bug fix targets the shared root cause, not the symptom the ticket names. Never lazy about comprehension, validation, security, accessibility, or leaving one runnable check. Levels: lite / full / ultra. See ponytail.md.
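The ladder is an ordered list of checks with early exit, which can be sketched as below. The rung labels paraphrase this entry; `first_rung_that_holds` and the example answers are hypothetical, and ponytail.md remains the authority on the actual protocol.

```python
# Rungs in the order given above; stop at the first one that holds.
LADDER = [
    "does not need to exist (YAGNI)",
    "already in the codebase",
    "stdlib",
    "native platform feature",
    "installed dependency",
    "one line",
]
FALLBACK = "write the minimum that works"


def first_rung_that_holds(holds):
    """Return the first rung marked True in `holds`, else the fallback."""
    for rung in LADDER:
        if holds.get(rung, False):
            return rung
    return FALLBACK


# Hypothetical task: both the stdlib and an installed dependency would work.
answer = first_rung_that_holds({"stdlib": True, "installed dependency": True})
```

Because the rungs are checked in order, the stdlib wins over the dependency here. The sketch only covers the ladder itself and presumes the problem is already understood.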

Kirby Frame (Anthony Moser) — Instead of negating a bad-faith argument (which reinforces it), absorb it into a larger frame that explains what the speaker is actually doing. Named after Kirby, who inhales enemies. Don't rebut the claim — expose the rhetorical move. See kirby-frame.md.

Slop Grenade — Pasting a wall of AI-generated text where a human would write one sentence. Even when correct, the format is hostile: it steals the reader's time, kills the dialogue (nothing to reply to), and degrades the medium itself. Match the medium, lead with the answer, and use AI to make things clearer — not longer. See slop-grenade.md.

Quick Lookup

A model almost always fits non-trivial problems — scan for one. Multiple models often apply simultaneously; combine as needed.

Stuck? → Inversion, First Principles
Meta-work? → Recursive Self-Improvement, Exploit-Explore, Leverage Points
Debugging? → Five Whys, Post-Mortem, Bottleneck Analysis
Something failed? → Post-Mortem (diagnose before retrying)
Deciding? → Second-Order Thinking, Reversibility, Opportunity Cost
Slow trend? → Compounding Change, Recursive Self-Improvement
Price moved? → Never Reason from a Price Change
Known vs new approach? → Exploit-Explore, Reversibility
Writing a limitation? → Countermeasure, Not Caveat; Swiss Cheese Model
Checking robustness? → Swiss Cheese Model
Analyzing an argument? → Excavate, Negspace, Rhetoricize
Comparing options? → Dimensionalize
Conflicting views? → Synthesize, Antithesize
Bad-faith argument? → Kirby Frame, Rhetoricize
Dumping AI text at someone? → Slop Grenade
Too complex? → Ponytail, Grug Brain, Hoare's Dictum, Occam's Razor
Over-engineering? → Ponytail, Grug Brain, First Principles, Pareto
Laziest solution that works? → Ponytail (the ladder), Grug Brain (80/20)
Organizing a page, UI, or diagram? → Gestalt Principles, Small Multiples
Analysis paralysis? → Cactus Person, Reversibility, Exploit-Explore
Wrong framework? → Cactus Person, Map vs. Territory, Circle of Competence
Impostor syndrome? → Grug Brain (FOLD), Dunning-Kruger, Circle of Competence
New domain? → Rhyme, Metaphorize

When applying: name the model, read the linked file if it has a protocol, walk through the reasoning, and note where the model might mislead.