
name: narrative-building
description: Draft or audit scientific introductions: argument logic, framing, multi-experiment coherence.
argument-hint: "[describe your paper topic or introduction challenge]"

Scientific Narrative Builder

Related skills. This skill composes with hypothesis-building (where the "Why" funnel lands as a falsifiable "If-Then" with a named estimand), methods-reporting (where the design the narrative promises gets documented to APSA/JARS/DA-RT standards), and pre-registration-writing (where the narrative locks in before data collection).

Workflow

Author or audit an introduction by walking these five steps in order. Each step has a specific output; do not proceed until the prior step's output exists in the draft. See reference/example-funnel.md for a worked example.

Step 1 — Why: establish the substantive motivation

Output: one or two paragraphs naming the real-world social/political tension the study addresses, the specific "invisible" the design will reveal, and the analytical joint where existing theory is silent or conflicted. Do not frame the motivation as a gap-in-literature; frame it as a stake in the world.

Identify the "Invisible": Treat surveys and experiments not merely as data collection but as a way to uncover "invisible factors": the perceptions, knowledge, beliefs, and reasoning that administrative data cannot capture.
Carve the Analytical Joints: Do not settle for a broad topic. Identify the specific analytical joint or tension where existing theory is silent or conflicted -- "creativity in design is grounded in a capacity to carve a problem at its analytical joints" (Sniderman 2018). The introduction must articulate the specific "why" of the phenomenon before proposing the "how" of the experiment.
The ASK Framework: Druckman (2022) identifies three pathways to research questions: Assessing (observing the world and noticing puzzles), Socializing (conversations with colleagues, students, and practitioners), and Kaput (learning from failed experiments -- "the first thing one should do when an experiment fails is to ask why it failed"). The narrative should be able to trace the question's origin to one of these pathways.
Resist the "Methods-Driven" Temptation: Ensure the research question dictates the method, not vice versa. Explicitly defend why an experiment is the necessary tool for this specific question. "It is crucial to not jump to designs for their novelty, but only to turn to them when they offer an advantage over what could otherwise have been done" (Druckman 2022). Note that "writing your survey questions is already part of the analysis stage" (Stantcheva 2023) -- question design choices are analytical decisions, not logistical ones, and the narrative should convey this.
Contextualization: Situate the study within a "broader account of politics" or social life. The narrative must move from the general social importance to the specific point where existing theory is silent or conflicted. For policy-relevant studies, explicitly connect the research question to the target policy environment and the populations affected.
Multi-Level Theoretical Framing: When a study includes experiments at different levels of analysis (e.g., individual-level and institutional-level), the introduction must establish the theoretical bridge between levels. State explicitly how reasoning at the micro level (e.g., fairness judgments about individual immigrants) connects to reasoning at the macro level (e.g., legitimacy judgments about governance decisions). The bridge should be conceptual, not just methodological -- explain why the same theoretical mechanism should operate at both levels.

Step 2 — Evidence audit: what the literature actually establishes (and where it fails)

Output: a literature review section that states, for each prior finding cited, (a) what observable implication it establishes, (b) what design features limit its generalizability, and (c) what remains unresolved. The audit must not collapse into a chronological "studies find X" recital.

Systematic Evidence Assembly: Move beyond "selective storytelling." Favor the assembly of estimates from meta-analyses or registries. If these do not exist, explicitly document the search and inclusion criteria so that the review does not reproduce publication bias. Study registries (EGAP, AEA, OSF) make the existence of studies visible even when their results go unpublished -- reference this universe of studies when available (Christensen, Freese, and Miguel 2019).
Publication Bias Acknowledgment: When reviewing prior findings, explicitly note where the published literature may overstate effect sizes because of the "file drawer problem" -- the systematic non-publication of null results. If a registered universe of studies exists, the narrative should reference it. Because prior effect sizes may be inflated, frame the current study's power analysis conservatively (Druckman 2022; Lakens 2025).
Adjudicating Theories: Clarify if the literature review is setting up a "Fact Searching" mission (estimating a causal effect) or a "Theory Testing" mission (adjudicating between competing psychological or social mechanisms). The stronger narrative frame is "explication, not demonstration" -- the goal is to explain why effects occur, not merely to show they exist (Sniderman 2018).
The Three Modesties and Sample Modesty: Acknowledge Sniderman's (2018) three limitations of survey experiments -- "modesty of treatment, modesty of scale, and modesty of measurement" (single dependent variable, single indicator, short duration) -- and separately acknowledge Mutz's (2011) point on sample modesty: an accessible population may not match the target population ("college students may not be people"). The narrative should also acknowledge one further modesty: the modesty of the current design relative to the full theoretical question. Frame the study as one link in a progression of trials -- what Sniderman (2018) calls a "chain of discovery" extending and cross-validating lines of research -- rather than a one-off decisive demonstration.
Replication-Extension Strategy: When building on prior work, frame the relationship as a "replication-extension" -- the current study replicates key features of an established design while extending it to address new questions or boundary conditions. Prior studies are not just evidence to cite but designs to build upon (Druckman 2022).
Convenience Sample Limitations: When reviewing findings from prior convenience-sample experiments (e.g., student subject pools), explicitly acknowledge that treatment effects may be heterogeneous across the population. Effects observed in narrow samples may not replicate in broader populations, and the direction or magnitude may change (Mutz 2011, citing Rashotte and Webster).
Distortions in the Evidence Base: Rather than attributing motive ("bad faith"), target observable behaviors that bias the literature. Note where cited findings may be distorted by specification searching, selective reporting, researcher degrees of freedom, or publication incentives (Christensen, Freese, and Miguel 2019; Simmons, Nelson, and Simonsohn 2011; Wicherts et al. 2016). Where an original study's analytical choices are not pre-registered, treat its reported effect size as an upper bound, and calibrate the current study's power and severity framing accordingly.
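
The conservative power framing described above can be made concrete with a small calculation. The following is a minimal sketch, not a prescribed procedure: it assumes a two-arm design compared by a two-sided difference in means, uses the standard normal approximation for sample size, and treats both `published_d` and `shrinkage` as hypothetical placeholder values. The right discount for a given literature is a judgment the author has to defend.

```python
import math
from statistics import NormalDist


def n_per_arm(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per arm for a two-sided, two-arm difference in means.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    if d <= 0:
        raise ValueError("d must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Hypothetical values, for illustration only.
published_d = 0.50  # effect size reported in a prior, non-pre-registered study
shrinkage = 0.50    # assumed discount for suspected inflation
planning_d = published_d * shrinkage

n_naive = n_per_arm(published_d)        # powered for the published estimate
n_conservative = n_per_arm(planning_d)  # powered for the discounted estimate
print(n_naive, n_conservative)
```

Under these assumptions, halving the planning effect size roughly quadruples the required sample, which is why the choice of planning effect belongs in the narrative and not only in the analysis plan.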

Step 3 — Funnel: narrow Why to If-Then

Output: a transition section that moves from the general theoretical claim (Step 1) through the specific counterfactual the design creates to the statistical hypothesis the data will test. The funnel must name the target population, the identifying variation, and the estimand before stating the hypothesis.

The Funnel Structure: Organize the narrative into a "Why" funnel. Start with the general theory (the explanation of why things happen in the world) and narrow down to the specific hypothesis (the "If-Then" statement of what will be observed in this data).
Defining the Target Population: Do not wait for the methods section to define the population. The introduction must specify the scope of the theory. Who exactly does this explanation apply to (e.g., the general citizenry vs. a specific elite subgroup)? If using a representative sample, frame this as enabling direct estimation of the Population Average Treatment Effect (PATE) rather than requiring extrapolation from a convenience sample (Mutz 2011).
Identifying Variation: Narratively describe the "identifying variation" you intend to create. What specific "controlled variation" is being introduced to unveil the invisible factor?
Name the Estimand in the Funnel: Before stating the hypothesis, name the theoretical estimand and its empirical counterpart so that what the data will estimate is visible in the narrative, not buried in the analysis section (Lundberg, Johnson, and Stewart 2021). The narrative-level estimand statement is what hypothesis-building later operationalizes as a three-level specification.
Severity Framing: Frame the study not just in terms of whether the hypothesis is confirmed, but in terms of how severe the test is -- how capable the design is of falsifying the prediction if it is wrong. A study with high severity provides more informative evidence regardless of the outcome (Lakens 2025).
Exploratory-Confirmatory Positioning: Honestly position the study on the exploratory-confirmatory spectrum (Lakens 2025, citing Waldron and Allen 2022). Not every study needs to test pre-registered hypotheses -- pilot studies, exploratory studies, and "tightening phase" studies that build toward confirmatory tests are legitimate and valuable. The introduction should state clearly where the study falls on this spectrum. Where the study is confirmatory, the narrative should signal that the funnel was locked prior to data collection (Nosek et al. 2018).
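
One way to audit the ordering requirement of this step is to record the funnel elements in a small structure and check that everything that must precede the hypothesis has been named. The sketch below is illustrative only: the `Funnel` class, its field names, and the example draft are hypothetical and are not part of this skill or of any library.

```python
from dataclasses import dataclass, fields


@dataclass
class Funnel:
    """Narrative elements, listed in the order the funnel presents them."""
    general_theory: str
    target_population: str
    identifying_variation: str
    estimand: str
    hypothesis: str

    def missing(self) -> list[str]:
        """Return the names of elements left blank, in funnel order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def ready_for_hypothesis(self) -> bool:
        """True only if every element that must precede the hypothesis is named."""
        prerequisites = {
            "general_theory",
            "target_population",
            "identifying_variation",
            "estimand",
        }
        return prerequisites.isdisjoint(self.missing())


# Hypothetical draft in which the estimand has not yet been named.
draft = Funnel(
    general_theory="Fairness reasoning shapes judgments about individual immigrants.",
    target_population="Adult citizens of the country under study.",
    identifying_variation="Randomly assigned applicant attributes.",
    estimand="",
    hypothesis="If the applicant is described as X, then support will be higher.",
)
print(draft.missing())
print(draft.ready_for_hypothesis())
```

A draft that fails this check states its hypothesis before the reader knows what quantity the data will estimate.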

Step 4 — Multi-experiment bridges: logical dependency between experiments

Output (only if the paper contains more than one experiment): transitional paragraphs that state what Experiment k established, what Experiment k+1 adds that k could not resolve, and why the addition is logically required rather than a menu choice. Each experiment should read as a step in an argument, not an item in a list.

Experiment-Level Objectives: Each experiment in a multi-experiment study should have a clear, distinct objective that the introduction previews. Avoid describing experiments as mere replications of each other -- articulate what each contributes uniquely (e.g., Experiment 1 tests micro-level fairness; Experiment 2 tests macro-level governance legitimacy).
Cumulative Research Program: Frame the multi-experiment study as part of a cumulative research program where each experiment is "self-contained" while building on the previous one. Resources are directed toward "subsequent studies that build on what was learned in the first" rather than toward exhaustive omnibus surveys (Mutz 2011).
Sequential Factorial Narrative: When using a "sequential factorial" design (Sniderman 2018), where Experiment 2 "splices in" additional factors to probe mechanisms discovered in Experiment 1, the narrative should preview this logic: "Experiment 1 establishes [the basic effect]; Experiment 2 introduces [the new factor] to test whether [the mechanism] holds under [the new condition]."
Bridging Across Experiments: Write transitional language that connects experiments. The objective of Experiment 2 should explicitly reference what Experiment 1 established and explain what Experiment 2 adds. Use language like "Building on Experiment 1's focus on..., Experiment 2 shifts to..." or "Experiment 1 isolates the micro-foundations; Experiment 2 tests whether these extend to..."

Step 5 — Coherence check: every section earns its place

Output: a final pass in which each paragraph of the introduction is tagged with the thesis claim it supports, and any paragraph that does not support a claim is either revised or cut. The enumerated contributions list at the end of the introduction must map one-to-one onto design features actually present in the paper.

Theoretical Contributions as a List: Near the end of the introduction, enumerate the study's contributions explicitly (e.g., "This study offers three key contributions: (1)... (2)... (3)..."). Each contribution should map to a specific experiment or design feature.
Calibrate Contribution Claims: Inflate neither the novelty nor the generalizability of what the design can deliver. A single study with undisclosed analytical flexibility can produce almost any headline finding (Simmons, Nelson, and Simonsohn 2011); the contribution list should therefore be scoped to what a reader could reproduce from the pre-registered design, not to everything the analysis could, in principle, be pushed to say.
Responding to Critique in the Narrative: When a design has been revised in response to peer or workshop feedback, the narrative should incorporate the methodological improvement as a theoretical strength, not merely an appendix note. For example, if the redesign introduces a group-threat manipulation that was absent from the original, frame this as addressing a gap in the theoretical adjudication -- "the design enables a direct test of whether fairness reasoning withstands group-threat activation" -- rather than as a correction of a flaw.
Narrative-Design Coherence: After completing all design components, verify that the narrative still matches the design. If conditions were trimmed for power reasons or the design was simplified during development, the narrative must be adjusted so it does not promise more than the design can deliver (Druckman 2022).
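
The one-to-one mapping between enumerated contributions and design features can be checked mechanically once both are written down. The following is a rough sketch under stated assumptions: the function name, the contribution labels, and the feature identifiers are all hypothetical, and the example imagines a design in which one arm was trimmed for power after the introduction was drafted.

```python
from collections import Counter


def audit_contributions(
    contributions: dict[str, str], design_features: set[str]
) -> dict[str, list[str]]:
    """Compare claimed contributions against the features in the final design.

    `contributions` maps each contribution label to the design feature it
    relies on; `design_features` lists what the final design contains.
    """
    claimed = Counter(contributions.values())
    return {
        # Contributions that rely on a feature the design does not contain.
        "unsupported": sorted(
            c for c, f in contributions.items() if f not in design_features
        ),
        # Design features that no contribution refers to.
        "unclaimed": sorted(design_features - set(claimed)),
        # Features claimed by more than one contribution.
        "double_claimed": sorted(f for f, n in claimed.items() if n > 1),
    }


# Hypothetical example: the group-threat arm was cut, but C3 still claims it.
contributions = {
    "C1": "exp1_fairness_conjoint",
    "C2": "exp2_governance_vignette",
    "C3": "group_threat_arm",
}
design_features = {"exp1_fairness_conjoint", "exp2_governance_vignette"}

report = audit_contributions(contributions, design_features)
print(report)
```

Any entry under `unsupported` marks a promise the narrative makes that the final design cannot keep; the contribution should be revised or cut.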

Quality Checks

Substantive Foundation: Is the question grounded in a real-world social or political tension?
Invisible Factors: Does the intro name the specific belief or perception it aims to reveal?
Analytical Joints: Is the "joint" where the theory breaks or conflicts clearly identified?
Counterfactual Logic: Does the narrative establish what the "world without the treatment" looks like?
Population Scope: Is the target population explicitly named in the text?
Estimand Named in Funnel: Does the funnel name the theoretical estimand before stating the hypothesis?
Objective Review: Does the literature review avoid "cherry-picking" results that only support the hypothesis? Is publication bias acknowledged as a threat to the cited evidence base?
Modesties: Does the narrative acknowledge Sniderman's three modesties (treatment, scale, measurement), Mutz's sample modesty, and the modesty of the current design relative to the full theoretical question?
Multi-Level Bridge: If the study spans levels of analysis, is the conceptual bridge between levels explicitly articulated?
Experiment Differentiation: Does each experiment have a clearly stated, distinct objective?
Contribution Enumeration: Are the study's contributions listed explicitly and mapped to design features?
Contribution Calibration: Are contribution claims scoped to what the pre-registered design can deliver, not to post-hoc possibilities?
Design Improvement as Strength: If the design was revised, is the improvement framed as a theoretical advance rather than a correction?
Exploratory-Confirmatory Position: Does the introduction state where the study falls on the exploratory-confirmatory spectrum?
Narrative-Design Coherence: Does the introduction promise only what the final design can deliver?