p2v-phase-1-script

Generate and validate video_script.jsonl from a specific paper URL or pasted paper content. Use this when running phase 1 of the paper-to-video pipeline.

By JoaquinCampo, updated 3/8/2026

name: p2v-phase-1-script
description: Generate and validate video_script.jsonl from a specific paper URL or pasted paper content. Use this when running phase 1 of the paper-to-video pipeline.
metadata:
  short-description: P2V phase 1 script generation

P2V Phase 1: Script

When to use

Use this skill when the user wants phase 1 of paper-to-video: turning a paper input into a validated video_script.jsonl.

Inputs

  • paper URL or full paper text
  • output run directory (default: outputs/<video_id>-<timestamp>)
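The default run directory can be built with a small helper. This is a minimal sketch: the function name `default_run_dir` and the UTC timestamp format are assumptions, since the source only specifies the pattern outputs/<video_id>-<timestamp>.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def default_run_dir(video_id: str, now: Optional[datetime] = None,
                    root: Path = Path("outputs")) -> Path:
    """Return <root>/<video_id>-<timestamp> without creating it."""
    # Assumption: a compact UTC timestamp. The source does not fix a format.
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return root / f"{video_id}-{stamp}"


if __name__ == "__main__":
    # Hypothetical video_id, used for illustration only.
    run_dir = default_run_dir("example-paper")
    run_dir.mkdir(parents=True, exist_ok=True)
    print(run_dir / "video_script.jsonl")
```

Passing `now` explicitly keeps the helper deterministic, which is convenient when re-running a phase against an existing folder.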

Workflow

  1. Run a mandatory preparation pass before drafting any script lines.
  2. Follow these local guides:
    • docs/educational-video-pedagogy-framework.md
    • docs/00-system-contract.md
  3. Draft one coherent educational script from the preparation results (not directly from raw paper text).
  4. Enforce the contract fields in video_script.jsonl.
  5. Save as video_script.jsonl in the run folder.
  6. Validate:
uv run python -c "from pathlib import Path; from paper2video.contracts.io import validate_artifact; validate_artifact(Path('<video_script.jsonl>'), artifact_type='video_script'); print('video_script contract ok')"

Required output

  • <run_dir>/video_script.jsonl

Mandatory Preparation Pass (Internal, Phase-1 only)

Before writing the first record, the agent must do this internally:

  1. Scientific extraction
    • core claims
    • mechanism details
    • strongest evidence/ablations
    • assumptions, caveats, and failure modes
  2. Pedagogical recomposition
    • learner-first sequence (not paper section order)
    • narrative arc: hook -> setup -> mechanism -> evidence -> limits -> synthesis
    • prerequisite and misconception map
  3. Script planning
    • chapter plan with explicit didactic objective per chapter
    • segment purpose statements that justify each segment
    • duration estimate based on paper complexity, evidence density, and mechanism depth
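The arc ordering from step 2 can be checked mechanically during script planning. The sketch below assumes each planned chapter carries a hypothetical stage label; that label is an internal planning aid, not a field of the video_script contract.

```python
# Narrative arc from the preparation pass, in the required order.
ARC = ["hook", "setup", "mechanism", "evidence", "limits", "synthesis"]


def follows_arc(stages):
    """Return True if the chapter stages never move backwards in the arc.

    Repeated stages are allowed (e.g. two mechanism chapters in a row).
    Raises ValueError if a stage label is not part of the arc.
    """
    positions = [ARC.index(stage) for stage in stages]
    return positions == sorted(positions)


if __name__ == "__main__":
    plan = ["hook", "setup", "mechanism", "mechanism", "evidence", "limits", "synthesis"]
    print(follows_arc(plan))
```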

Do not ask the user for these artifacts. Build them internally, then emit only video_script.jsonl.

Complexity-To-Depth Policy (Required)

Before drafting, assign a complexity tier using paper content:

  • tier_1 (simple conceptual paper): one main claim, light empirical evidence
  • tier_2 (moderate): multiple claims, some formal or empirical detail
  • tier_3 (dense empirical/mechanistic): many experiments/ablations and non-trivial mechanism
  • tier_4 (very dense): tier_3 plus multiple interacting mechanisms or heavy formal load

Use this mapping for script depth:

  • tier_1: 700-1100 words (~5-8 min)
  • tier_2: 1100-1700 words (~8-13 min)
  • tier_3: 1700-2600 words (~13-20 min)
  • tier_4: 2400-3600 words (~18-28 min)

For ML empirical papers with broad ablations and mechanism discussion (like grokking-style papers), default to tier_3 unless there is strong evidence for tier_4.

If the draft word count is below the tier minimum, expand with:

  • deeper protocol/mechanism walkthrough
  • evidence decomposition (main curve + secondary curve + failure/negative case)
  • ablation interpretation and caveats
  • replication-oriented synthesis
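The tier-to-depth mapping and the minimum-word check can be expressed as a small lookup. This is a sketch only: the word ranges come from the mapping above, while the function names and the status strings are assumptions made for illustration.

```python
# Word ranges per complexity tier, copied from the mapping above.
TIER_WORD_RANGES = {
    "tier_1": (700, 1100),
    "tier_2": (1100, 1700),
    "tier_3": (1700, 2600),
    "tier_4": (2400, 3600),
}


def count_words(narrations):
    """Count whitespace-separated words across all narration strings."""
    return sum(len(text.split()) for text in narrations)


def depth_status(tier, word_count):
    """Compare a draft word count against the range for its tier."""
    low, high = TIER_WORD_RANGES[tier]
    if word_count < low:
        return "expand"       # below the tier minimum: add depth before finalizing
    if word_count > high:
        return "above_range"  # informational; the policy only mandates the minimum
    return "ok"


if __name__ == "__main__":
    draft = ["Neural networks can memorize first and generalize much later."]
    print(depth_status("tier_3", count_words(draft)))
```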

Depth And Specificity Rules

The script must reflect expert-level understanding:

  1. Include concrete paper details where possible:
    • dataset/task setup
    • model or method specifics
    • key experimental findings
    • important limitations
  2. Avoid generic summaries that could apply to any paper.
  3. Tie claims to evidence in the narration flow.
  4. Use explicit transitions that preserve technical continuity.
  5. Duration is paper-dependent:
    • do not force a fixed runtime target
    • simple papers can be shorter
    • complex papers should expand enough to cover mechanism and evidence thoroughly
  6. Do not collapse dense papers into a short executive summary.
    • if the paper has multiple non-trivial empirical findings, include enough segments to teach each finding causally.

If the current draft feels generic, refine before finalizing.

Narration Voice Rules (Required)

narration_text must sound like an educational video, not a lecture outline:

  1. Never use meta-outline phrasing inside narration text:
    • avoid: "Chapter 1", "Chapter 2", "Section", "Lecture", "In this chapter"
  2. Keep chapter metadata in fields (record_type=chapter, chapter_id) but keep spoken text natural.
  3. Prefer direct viewer-facing transitions:
    • examples: "Now let’s test this on...", "Next we inspect...", "Here’s the key result..."
  4. Avoid production/meta instructions in narration:
    • no references to script-writing process, tiers, or internal planning artifacts.
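A quick lint pass can catch the meta-outline phrases listed in rule 1 before validation. The sketch below is a heuristic under stated assumptions: the pattern list and the function name are illustrative, and a bare word such as "section" can produce false positives (for example "cross section"), so hits should be reviewed rather than rejected automatically.

```python
import re

# Heuristic patterns for the meta-outline phrasing the rules ask to avoid.
META_PATTERNS = [
    re.compile(r"\bchapter\s+\d+\b", re.IGNORECASE),
    re.compile(r"\bin this chapter\b", re.IGNORECASE),
    re.compile(r"\b(?:section|lecture)\b", re.IGNORECASE),
]


def find_meta_phrases(narration_text):
    """Return every matched meta-outline phrase found in the narration."""
    hits = []
    for pattern in META_PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(narration_text))
    return hits


if __name__ == "__main__":
    print(find_meta_phrases("In this chapter we look at loss curves."))
    print(find_meta_phrases("Now let's test this on modular addition."))
```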

Didactic Density Rules (Required)

Keep the script teachable for video viewers (not only technically correct):

  1. One core idea per narration unit.
    • each segment should deliver one primary teaching point plus at most one supporting point.
  2. Control spoken numeric load.
    • prefer "about", "roughly", "on the order of" in speech.
    • keep exact values for only the most important numbers in a segment.
    • move secondary precision to visuals/overlays, not spoken prose.
  3. Split dense units.
    • if a segment contains more than two major claims, split it into two sequential segments.
  4. Keep recaps short and retrieval-oriented.
    • recap units should be concise and phrased as punchline reinforcement.
  5. Appendix-grade statistics are optional in narration.
    • exact p-values, detailed correlation coefficients, and low-priority appendix numbers should be omitted from spoken text unless essential to the main claim.
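The numeric-load and appendix-statistics rules can be approximated with a per-segment check. This is a rough sketch: the threshold of two spoken numbers, the warning labels, and the function names are assumptions, since the rules above do not set an exact limit.

```python
import re

# A number token: digits with optional thousands/decimal groups and a trailing %.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")
# A spoken p-value such as "p < 0.01" or "p=.05".
P_VALUE = re.compile(r"\bp\s*[<=>]\s*0?\.\d+", re.IGNORECASE)


def numeric_load(narration_text):
    """Count number tokens in one narration segment."""
    return len(NUMBER.findall(narration_text))


def density_warnings(narration_text, max_numbers=2):
    """Return warning labels for a segment; an empty list means no findings."""
    warnings = []
    # Assumption: more than max_numbers spoken numbers is too dense for one segment.
    if numeric_load(narration_text) > max_numbers:
        warnings.append("too_many_numbers")
    if P_VALUE.search(narration_text):
        warnings.append("p_value_in_speech")
    return warnings


if __name__ == "__main__":
    print(density_warnings("Accuracy jumps from 12% to 99% after 10,000 steps."))
```

Segments that trigger a warning are candidates for splitting or for moving the secondary figures into visuals and overlays.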
Install via CLI
npx skills add https://github.com/JoaquinCampo/paper2video --skill p2v-phase-1-script