name: p2v-phase-1-script
description: Generate and validate video_script.jsonl from a specific paper URL or pasted paper content. Use this when running phase 1 of the paper-to-video pipeline.
metadata:
  short-description: P2V phase 1 script generation
P2V Phase 1: Script
When to use
Use this skill when the user wants phase 1 of paper-to-video: paper input to validated video_script.jsonl.
Inputs
- paper URL or full paper text
- output run directory (default: outputs/<video_id>-<timestamp>)
Workflow
- Run a mandatory preparation pass before drafting any script lines.
- Follow these local guides:
  - docs/educational-video-pedagogy-framework.md
  - docs/00-system-contract.md
- Draft one coherent educational script from the preparation results (not directly from raw paper text).
- Enforce the contract fields in video_script.jsonl.
- Save as video_script.jsonl in the run folder.
- Validate:
uv run python -c "from pathlib import Path; from paper2video.contracts.io import validate_artifact; validate_artifact(Path('<video_script.jsonl>'), artifact_type='video_script'); print('video_script contract ok')"
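Before running the contract validator, a cheap structural pre-check can catch malformed lines early. The sketch below is only an illustration using the standard library: `precheck_jsonl` is a hypothetical helper, not part of paper2video, and the sample records use field names mentioned elsewhere in this skill (record_type, chapter_id, narration_text) without claiming they are the full contract. The authoritative check remains validate_artifact.

```python
import json


def precheck_jsonl(text: str) -> list[str]:
    """Return a list of problems; an empty list means every line is a JSON object."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            problems.append(f"line {lineno}: blank line")
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {lineno}: expected a JSON object")
    return problems


# Hypothetical records; the real required fields are defined by the contract docs.
sample = "\n".join([
    json.dumps({"record_type": "chapter", "chapter_id": "ch1"}),
    json.dumps({
        "record_type": "segment",
        "chapter_id": "ch1",
        "narration_text": "Here's the key result.",
    }),
])
print(precheck_jsonl(sample))
```

This only confirms that the file is line-delimited JSON objects; it says nothing about required fields or allowed values.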
Required output
<run_dir>/video_script.jsonl
Mandatory Preparation Pass (Internal, Phase-1 only)
Before writing the first record, the agent must do this internally:
- Scientific extraction
  - core claims
  - mechanism details
  - strongest evidence/ablations
  - assumptions, caveats, and failure modes
- Pedagogical recomposition
  - learner-first sequence (not paper section order)
  - narrative arc: hook -> setup -> mechanism -> evidence -> limits -> synthesis
  - prerequisite and misconception map
- Script planning
  - chapter plan with explicit didactic objective per chapter
  - segment purpose statements that justify each segment
  - duration estimate based on paper complexity, evidence density, and mechanism depth
Do not ask the user for these artifacts. Build them internally, then emit only video_script.jsonl.
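One way to keep the preparation pass honest is to hold its results in a small in-memory structure and check that nothing is empty before drafting. The sketch below is an assumption about how an agent might organize these notes; `PreparationNotes`, `ChapterPlan`, and `missing_parts` are hypothetical names, and nothing here is written to disk or shown to the user.

```python
from dataclasses import dataclass, field


@dataclass
class ChapterPlan:
    title: str
    didactic_objective: str
    segment_purposes: list[str] = field(default_factory=list)


@dataclass
class PreparationNotes:
    core_claims: list[str]
    mechanism_details: list[str]
    strongest_evidence: list[str]
    caveats: list[str]
    prerequisites: list[str]
    misconceptions: list[str]
    chapters: list[ChapterPlan]

    def missing_parts(self) -> list[str]:
        """Name every part of the preparation pass that is still empty."""
        missing = [name for name, value in vars(self).items() if not value]
        for chapter in self.chapters:
            if not chapter.didactic_objective.strip():
                missing.append(f"objective for chapter {chapter.title!r}")
            if not chapter.segment_purposes:
                missing.append(f"segment purposes for chapter {chapter.title!r}")
        return missing


# Placeholder content only; real notes come from the paper being scripted.
notes = PreparationNotes(
    core_claims=["claim"],
    mechanism_details=["mechanism"],
    strongest_evidence=["ablation"],
    caveats=[],
    prerequisites=["prerequisite"],
    misconceptions=["misconception"],
    chapters=[ChapterPlan("Hook", "Motivate the question", ["open with the puzzle"])],
)
print(notes.missing_parts())
```

Drafting would start only once `missing_parts()` returns an empty list.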
Complexity-To-Depth Policy (Required)
Before drafting, assign a complexity tier using paper content:
- tier_1 (simple conceptual paper): one main claim, light empirical evidence
- tier_2 (moderate): multiple claims, some formal or empirical detail
- tier_3 (dense empirical/mechanistic): many experiments/ablations and non-trivial mechanism
- tier_4 (very dense): tier_3 plus multiple interacting mechanisms or heavy formal load
Use this mapping for script depth:
- tier_1: 700-1100 words (~5-8 min)
- tier_2: 1100-1700 words (~8-13 min)
- tier_3: 1700-2600 words (~13-20 min)
- tier_4: 2400-3600 words (~18-28 min)
For ML empirical papers with broad ablations and mechanism discussion (like grokking-style papers), default to tier_3 unless there is strong evidence for tier_4.
If draft word count is below tier minimum, expand with:
- deeper protocol/mechanism walkthrough
- evidence decomposition (main curve + secondary curve + failure/negative case)
- ablation interpretation and caveats
- replication-oriented synthesis
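The tier mapping and the below-minimum rule can be sketched as a small check. The word ranges are copied from the mapping above; `count_words` and `depth_status` are hypothetical helper names, whitespace splitting is only an approximation of spoken word count, and the "above_range" label is an assumption, since this policy only prescribes action for drafts below the minimum.

```python
# Word ranges per complexity tier, taken from the depth mapping.
TIER_WORD_RANGES = {
    "tier_1": (700, 1100),
    "tier_2": (1100, 1700),
    "tier_3": (1700, 2600),
    "tier_4": (2400, 3600),
}


def count_words(narration_texts):
    """Approximate spoken length by counting whitespace-separated tokens."""
    return sum(len(text.split()) for text in narration_texts)


def depth_status(tier, word_count):
    """Classify a draft against its tier's word range."""
    low, high = TIER_WORD_RANGES[tier]
    if word_count < low:
        return "expand"  # below the tier minimum: apply the expansion list
    if word_count > high:
        return "above_range"  # informational only; not prescribed by the policy
    return "ok"


print(depth_status("tier_3", 1500))
```

A draft that returns "expand" would go back through the expansion options above before validation.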
Depth And Specificity Rules
The script must reflect expert-level understanding:
- Include concrete paper details where possible:
  - dataset/task setup
  - model or method specifics
  - key experimental findings
  - important limitations
- Avoid generic summaries that could apply to any paper.
- Tie claims to evidence in the narration flow.
- Use explicit transitions that preserve technical continuity.
- Duration is paper-dependent:
  - do not force a fixed runtime target
  - simple papers can be shorter
  - complex papers should expand enough to cover mechanism and evidence thoroughly
- Do not collapse dense papers into a short executive summary.
  - if the paper has multiple non-trivial empirical findings, include enough segments to teach each finding causally.
If the current draft feels generic, refine before finalizing.
Narration Voice Rules (Required)
narration_text must sound like an educational video, not a lecture outline:
- Never use meta-outline phrasing inside narration text:
  - avoid: "Chapter 1", "Chapter 2", "Section", "Lecture", "In this chapter"
- Keep chapter metadata in fields (record_type=chapter, chapter_id) but keep spoken text natural.
- Prefer direct viewer-facing transitions:
  - examples: "Now let’s test this on...", "Next we inspect...", "Here’s the key result..."
- Avoid production/meta instructions in narration:
  - no references to script-writing process, tiers, or internal planning artifacts.
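The meta-outline rule lends itself to a simple lint over each narration_text value. The sketch below is one possible heuristic, not part of the contract validator: `find_meta_phrases` is a hypothetical helper, and the pattern covers only the phrases listed in the rules above. Words such as "section" can appear legitimately (for example "cross section"), so hits should be treated as prompts for review.

```python
import re

# Phrases named in the voice rules; matching is case-insensitive.
META_PATTERN = re.compile(
    r"\b(?:in this chapter|chapter\s+\d+|section|lecture)\b",
    re.IGNORECASE,
)


def find_meta_phrases(narration_text):
    """Return every meta-outline phrase found in one narration string."""
    return [match.group(0) for match in META_PATTERN.finditer(narration_text)]


print(find_meta_phrases("In this chapter we cover Chapter 2."))
print(find_meta_phrases("Now let's test this on modular addition."))
```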
Didactic Density Rules (Required)
Keep the script teachable for video viewers (not only technically correct):
- One core idea per narration unit.
  - each segment should deliver one primary teaching point plus at most one supporting point.
- Control spoken numeric load.
  - prefer "about", "roughly", "on the order of" in speech.
  - keep exact values for only the most important numbers in a segment.
  - move secondary precision to visuals/overlays, not spoken prose.
- Split dense units.
  - if a segment contains more than two major claims, split it into two sequential segments.
- Keep recaps short and retrieval-oriented.
  - recap units should be concise and phrased as punchline reinforcement.
- Appendix-grade statistics are optional in narration.
  - exact p-values, detailed correlation coefficients, and low-priority appendix numbers should be omitted from spoken text unless essential to the main claim.
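Spoken numeric load can be approximated by counting numeric tokens per segment. The sketch below is a rough heuristic under stated assumptions: `numeric_load` and `density_warnings` are hypothetical helpers, and the default limit of two numbers per segment is an illustrative threshold, not a value set by these rules. It cannot tell which numbers are essential to the main claim, so warnings only flag segments worth a second look.

```python
import re

# Digits with optional thousands/decimal groups and an optional percent sign.
NUMBER_PATTERN = re.compile(r"\d+(?:[.,]\d+)*%?")
# Exact p-values such as "p < 0.01" or "p=.05".
P_VALUE_PATTERN = re.compile(r"\bp\s*[<=>]\s*0?\.\d+", re.IGNORECASE)


def numeric_load(narration_text):
    """Count numeric tokens in one narration string."""
    return len(NUMBER_PATTERN.findall(narration_text))


def density_warnings(narration_text, max_numbers=2):
    """Return review hints; max_numbers is an assumed, adjustable limit."""
    warnings = []
    count = numeric_load(narration_text)
    if count > max_numbers:
        warnings.append(f"{count} numbers spoken; consider moving some to overlays")
    if P_VALUE_PATTERN.search(narration_text):
        warnings.append("contains an exact p-value")
    return warnings


print(density_warnings("Accuracy jumps from 12% to 99.5% after 10,000 steps."))
```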