name: spark-video-episode
description: One-shot autopilot orchestrator that runs the full spark-video pipeline (screenwriter ↔ director per-scene parallel → render chain-DAG parallel + per-clip review → stitch). User confirms at 4 gates (+ 1 mode gate at start + 1 BGM gate when bgm/ folder detected). Use when the user wants "make me an episode" in one command.
Producer Skill — spark-video One-Shot Production
You are the producer of the spark-video pipeline. You orchestrate
the other 5 sub-skills (spark-video-screenwriter, spark-video-director,
spark-video-vfx-review, spark-video-clip-review, spark-video-cast)
and the deterministic scripts under scripts/. Users invoke you when
they want to produce one episode end-to-end with minimal hand-holding.
Set env vars at the top of every run:
export SPARK_VIDEO_PROJECT=<project_id>
export SPARK_VIDEO_EPISODE=<NN>
export SPARK_VIDEO_PHASE=producer
# SPARK_VIDEO_PROVIDER defaults to "bl"; only set if user opted for wan27
Inputs from the user
When invoked, the user gives you:
- project_id (e.g. hf, demo)
- episode (e.g. 001)
- premise: one paragraph story idea
- (optional flags) --vfx to opt into pre-render VFX review, --mode=drama|narration to skip GATE 0, --provider=bl|wan27 to skip provider selection.
The 4+2 user-confirmation gates
You MUST stop and ask the user at each gate. The user owns the creative decisions and the budget, so skip a gate only when the corresponding flag was passed in the invocation.
| Gate | When | What you show | What you ask |
|---|---|---|---|
| GATE 0 | Before any work, unless --mode was set | One-paragraph explainer of drama vs narration mode | "Drama (short drama, default) or Narration (voiceover recap)?" |
| GATE 0.5 | After GATE 0, only if projects/<p>/bgm/ or projects/<p>/<ep>/bgm/ exists with audio files | List of available BGM tracks | "How should I use BGM? (a) off: model decides; (b) global: one track for the whole video; (c) scene: director picks per scene. Also: forbid the video model from generating its own BGM? (default: yes)" |
| GATE 1 | After screenwriter finishes all scenes/scene-NN.md and you've compiled into script.md | viewer.html (auto-opened) showing premise + script + cast/sets/props | "Script OK? Approve to proceed to storyboarding, or describe changes." |
| GATE 2 | After director finishes all scenes/scene-NN.json and you've compiled+validated into storyboard.json. If --vfx, run spark-video-vfx-review first and show its report. | viewer.html (auto-opened) showing storyboard summary + scenes + shots | "Storyboard OK? Approve to render, or describe changes." |
| GATE 3 | After all shots rendered + reviewed (winner_version set for each, escalations resolved) | viewer.html (auto-opened) showing all clips + reviews + winner highlights | "Renders OK? Approve to stitch final, or specify shots to re-render." |
| GATE 4 | After stitch completes | viewer.html (auto-opened) showing final mp4 + full production archive | "OK to finalize? Want to re-render any shots or adjust BGM mix?" |
At any gate, if the user says "no", listen to their feedback, make the edits, re-show, and ask again.
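The gate rules above can be condensed into a small planning function. This is a minimal sketch for illustration only, not part of the pipeline: `plan_gates` and its parameter names are hypothetical, and it assumes that only GATE 0 (via --mode) and GATE 0.5 (no BGM audio found) can ever be skipped.

```python
def plan_gates(mode_flag=None, bgm_tracks=()):
    """Return the ordered list of gates the producer must stop at.

    mode_flag  -- value of --mode if the user passed it, else None
    bgm_tracks -- audio files found in the project/episode bgm/ folders
    """
    gates = []
    if mode_flag is None:      # --mode on the command line skips GATE 0
        gates.append("GATE 0")
    if bgm_tracks:             # GATE 0.5 only fires when BGM audio exists
        gates.append("GATE 0.5")
    # GATE 1 to GATE 4 have no skip flag, so they always apply.
    gates += ["GATE 1", "GATE 2", "GATE 3", "GATE 4"]
    return gates
```

For example, a run started with --mode=drama and one track in bgm/ stops at GATE 0.5 and then at GATE 1 through GATE 4.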
Verify before you ask (gate.py)
Before you present GATE 1–4 to the user, run the deterministic verifier so you never ask for approval on top of a half-finished stage (e.g. a storyboard with unscored clips, or a stitch with no viewer):
uv run scripts/gate.py check script # before GATE 1
uv run scripts/gate.py check storyboard # before GATE 2
uv run scripts/gate.py check render # before GATE 3
uv run scripts/gate.py check final # before GATE 4
# uv run scripts/gate.py check all --json # machine-readable, for a dashboard
gate.py exits non-zero and prints a checklist of what's missing
(unscored winners, missing winner clips, stale/absent viewer.html,
unresolved escalations, …). It does not make creative decisions and
does not replace the user's confirmation — it just guarantees the
no-judgment artifacts exist. If a check fails, fix the gap (usually:
re-run render_shot.py for the offending shot, or stitch.py) before
showing the gate. Full schema validation still comes from
storyboard.py validate.
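If your harness drives the pipeline from Python, the "verify before you ask" rule could be wrapped roughly as follows. This is a sketch under assumptions: `gate_command` and `gate_ready` are hypothetical helper names, and the only gate.py behaviour relied on is what is stated above (non-zero exit plus a printed checklist when something is missing).

```python
import subprocess

# Gate number -> gate.py check name, as listed in the commands above.
GATE_CHECKS = {1: "script", 2: "storyboard", 3: "render", 4: "final"}


def gate_command(gate):
    """Build the verifier command for GATE 1-4."""
    return ["uv", "run", "scripts/gate.py", "check", GATE_CHECKS[gate]]


def gate_ready(gate, run=subprocess.run):
    """Return True only when gate.py exits 0 for this gate.

    `run` is injectable so the function can be exercised without uv installed.
    """
    result = run(gate_command(gate), capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stdout)  # checklist of what is still missing
    return result.returncode == 0
```

Only present the gate to the user when `gate_ready(n)` is true, or after you have shown them the printed checklist.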
Pipeline flow (with parallelism markers)
╔══════════════════════════════════════════╗
║ YOU (spark-video-episode / producer) ║
╚══════════════════════════════════════════╝
│
[GATE 0: mode]
│
[GATE 0.5: BGM, if applicable]
│
┌──────────────────────────┴───────────────────────────┐
│ Zone 1 — per-scene parallel │
│ ┌────────────────────┐ ┌─────────────────────┐ │
│ │ spark-video- │═══▶│ spark-video- │ │
│ │ screenwriter │ │ director │ │
│ │ scene-NN.md │ │ scene-NN.json │ │
│ └────────────────────┘ └─────────────────────┘ │
│ Producer fans out N copies in parallel per ready │
│ scene (cap: SPARK_VIDEO_MAX_CONCURRENCY) │
└──────────────────────────┬───────────────────────────┘
│
uv run scripts/storyboard.py compile
│
[GATE 1: script.md]
│
optional: spark-video-vfx-review (when --vfx)
│
[GATE 2: storyboard.json]
│
┌──────────────────────────┴───────────────────────────┐
│ Zone 2 — render chain groups in parallel │
│ uv run scripts/storyboard.py graph │
│ → [[S01-001,S01-002], [S02-001], ...] │
│ Fan out one spark-video-clip-review per chain group; │
│ inside each group, sequential. │
│ │
│ Zone 3 — per-clip review + retry (inside clip-review)│
│ render → bl omni → ACCEPT or auto-rewrite & retry │
│ exhausted retries → escalate to spark-video-director│
└──────────────────────────┬───────────────────────────┘
│
[GATE 3: clips]
│
uv run scripts/stitch.py
│
[GATE 4: final mp4]
Step-by-step procedure
Step 0 — preflight
./scripts/doctor.sh # bl + ffmpeg + uv present
uv run scripts/scaffold.py episode --init # mkdir scaffold if not exists
# Check lore.md exists; if not:
test -f projects/$SPARK_VIDEO_PROJECT/lore.md || \
uv run scripts/scaffold.py lore --title "<premise's first noun phrase>"
# Tell user lore.md was scaffolded with mood_anchor=TBD; ask to fill it
# OR auto-fill it from the premise using bl text chat
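The presence check that the doctor.sh comment describes (bl, ffmpeg and uv on PATH) can be approximated in a few lines. This is only a sketch of that one check; the real script may verify more, and `missing_tools` is a hypothetical name.

```python
import shutil

# Tools named in the preflight comment above.
REQUIRED_TOOLS = ("bl", "ffmpeg", "uv")


def missing_tools(which=shutil.which):
    """Return the required tools that cannot be found on PATH."""
    return [tool for tool in REQUIRED_TOOLS if which(tool) is None]
```

An empty list means the preflight tools are present; otherwise report the missing names to the user before doing any work.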
Step 1 — GATE 0: mode
Unless --mode was passed, present the two modes:
- drama (short drama, default) — every shot is a long self-contained clip driven by dialog + action. Use for 2–5 min original shorts.
- narration (voiceover recap) — narration beats become short TTS-driven shots; dialog beats stay drama. Maximises parallelism. Use for 10-min recap style content.
Record the answer; pass to screenwriter + director as --mode <choice>.
Step 2 — GATE 0.5: BGM (only if folder exists)
# Run GATE 0.5 only when a project-level or episode-level bgm/ folder exists.
if test -d projects/$SPARK_VIDEO_PROJECT/bgm || \
   test -d projects/$SPARK_VIDEO_PROJECT/episode-$SPARK_VIDEO_EPISODE/bgm; then
  ls projects/$SPARK_VIDEO_PROJECT{,/episode-$SPARK_VIDEO_EPISODE}/bgm/*.{mp3,wav,m4a,flac,ogg,aac} 2>/dev/null
fi
Present tracks, ask user for mode + forbid-model-bgm. Record into
projects/<p>/<ep>/bgm-config.json (the compile step reads this and
writes Storyboard.bgm).
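The same track discovery can be done from Python when you need the list as data rather than as ls output. A minimal sketch, assuming the folder layout used in the shell commands of this step; `find_bgm_tracks` is a hypothetical helper and the schema of bgm-config.json is deliberately not shown because it is not specified here.

```python
from pathlib import Path

# Same extensions as the ls glob in this step.
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"}


def find_bgm_tracks(root, project, episode):
    """List audio files in the project-level, then episode-level bgm/ folder."""
    project_dir = Path(root) / "projects" / project
    folders = [project_dir / "bgm", project_dir / f"episode-{episode}" / "bgm"]
    tracks = []
    for folder in folders:
        if folder.is_dir():
            tracks += sorted(
                p for p in folder.iterdir()
                if p.is_file() and p.suffix in AUDIO_EXTS
            )
    return tracks
```

An empty result means GATE 0.5 is skipped; otherwise present the returned names to the user.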
Step 3 — cast init
uv run scripts/scaffold.py cast-init # build cast.json
uv run scripts/scaffold.py set-init # build movie_set.json
uv run scripts/scaffold.py prop-init # build props.json
If the user's premise mentions new characters/locations not present,
invoke spark-video-cast first to scaffold + generate portraits BEFORE
launching the screenwriter.
Step 4 — Zone 1: per-scene screenwriter ↔ director parallel
Fan out the screenwriter on scenes 1..N (number from premise length —
see screenwriter pacing table). As each scene-NN.md becomes ready
(touched scene-NN.ready sentinel), fan out the director on it in
parallel with screenwriter drafting scene N+1.
Implementation in your harness:
- If harness supports parallel subagent invocation, use it: spawn one screenwriter subagent per scene, plus one director subagent waiting on each ready sentinel.
- If sequential, loop scenes in order. Still cheaper than rendering.
Cap: SPARK_VIDEO_MAX_CONCURRENCY=4 parallel subagents at once.
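One way to picture the Zone 1 fan-out is an asyncio pipeline in which each scene is drafted and then storyboarded, with a shared semaphore enforcing the cap. This is a sketch under assumptions, not the harness's actual mechanism: `run_zone1`, `write_scene` and `storyboard_scene` are hypothetical stand-ins for however your harness invokes the screenwriter and director subagents.

```python
import asyncio

MAX_CONCURRENCY = 4  # mirrors the SPARK_VIDEO_MAX_CONCURRENCY default


async def run_zone1(scene_ids, write_scene, storyboard_scene, cap=MAX_CONCURRENCY):
    """Draft every scene, storyboarding each one as soon as its draft is ready."""
    slots = asyncio.Semaphore(cap)

    async def limited(step, scene_id):
        async with slots:  # at most `cap` subagents run at any moment
            return await step(scene_id)

    async def one_scene(scene_id):
        await limited(write_scene, scene_id)              # scene-NN.md is ready
        return await limited(storyboard_scene, scene_id)  # scene-NN.json

    # Scenes progress independently, so the director can work on scene N
    # while the screenwriter is still drafting scene N+1.
    return await asyncio.gather(*(one_scene(s) for s in scene_ids))
```

The semaphore is shared by both roles, so screenwriter and director subagents together never exceed the cap.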
When all scenes drafted + storyboarded:
uv run scripts/storyboard.py compile --mode <drama|narration>
uv run scripts/storyboard.py validate
uv run scripts/storyboard.py graph
uv run scripts/storyboard.py estimate
Step 5 — GATE 1: script.md
uv run scripts/gate.py check script
uv run scripts/build_viewer.py # opens viewer.html in browser for review
Show the user the merged script.md — point them to the viewer.html
that just opened (it shows premise, lore, direction, script, cast,
sets, props at this stage). Wait for approval.
If they want changes, identify which scene(s), invoke screenwriter on those, re-compile.
Step 6 — GATE 2: storyboard.json
Print the storyboard summary:
- Total shots, breakdown by kind (t2v / i2v / r2v)
- Parallel chain group count (from storyboard.py graph)
- Estimated total duration of final video
- Estimated render cost (from storyboard.py estimate)
  - If estimate exits 2 (over SPARK_VIDEO_LONG_CONFIRM_S), surface the warning explicitly.
Cost estimation: use the bailian-docs-llm-wiki skill to look up pricing:
- Read the estimate JSON output: provider, resolution, duration_by_kind, and (if narration mode) tts.
- For each kind in duration_by_kind, find the concrete model in bailian-docs-llm-wiki/models/models.jsonl by matching provider family and shot kind (e.g. provider bl + kind r2v → model happyhorse-1.0-r2v).
- Match the resolution (e.g. 720P → price type video_ratio_720p) to get the per-second unit price.
- Calculate: sum(kind_seconds × unit_price). Add TTS cost if present (match tts.model in models.jsonl for per-character pricing).
- If a model has no pricing data in the skill, say so explicitly. Never guess, and never substitute another provider's price.
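The arithmetic in the steps above is small enough to sketch. The function below is illustrative only: `estimate_cost` and its parameters are hypothetical names, it assumes you have already looked up a per-second unit price for each kind, and it reports unpriced items instead of guessing.

```python
def estimate_cost(duration_by_kind, unit_price_by_kind,
                  tts_chars=0, tts_price_per_char=None):
    """Return (priced_total, missing).

    priced_total -- cost of everything that has a known unit price
    missing      -- kinds (or "tts") with no pricing data; report these as-is
    """
    missing = []
    total = 0.0
    for kind, seconds in duration_by_kind.items():
        price = unit_price_by_kind.get(kind)
        if price is None:
            missing.append(kind)       # surface to the user, never guess
        else:
            total += seconds * price   # kind_seconds x unit_price
    if tts_chars:
        if tts_price_per_char is None:
            missing.append("tts")
        else:
            total += tts_chars * tts_price_per_char
    return total, missing


# Placeholder numbers, NOT real prices: 10 s of t2v at 0.5/s, r2v unpriced.
total, missing = estimate_cost({"t2v": 10, "r2v": 20}, {"t2v": 0.5})
```

When `missing` is non-empty, tell the user which models could not be priced and present the total as a partial figure.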
If --vfx, run spark-video-vfx-review and show its report alongside.
uv run scripts/gate.py check storyboard # structural completeness
uv run scripts/storyboard.py validate # full schema lint
uv run scripts/build_viewer.py # opens viewer.html — now includes scenes + shots
Wait for approval (viewer.html shows the full storyboard breakdown).
If they want changes, route feedback to director
(invoke spark-video-director skill with the specific scenes), re-compile.
Step 7 — Zone 2 + 3: render all shots
Use render_all.py for batch rendering — it handles chain-group
parallelism, media resolution, first-frame chaining, and per-clip
auto-review internally. Never manually fan out render_shot.py
calls or write ad-hoc batch scripts.
# Full reset — re-render everything from scratch:
uv run scripts/render_all.py --reset --ratio 9:16
# After prompt changes — only re-render shots that were REJECT:
uv run scripts/render_all.py --rejected-only
# Re-render specific shots:
uv run scripts/render_all.py --shot S01-002 --shot S03-004
# Only re-render FAILED or winner-less shots:
uv run scripts/render_all.py --failed-only
render_all.py handles:
- Chain-group-aware parallelism (respects use_prev_last_frame_as_first)
- Automatic media resolution from cast.json / movie_set.json / props.json
- Per-clip auto-review via render_shot.py (includes single-axis veto)
- Winner promotion on ACCEPT
- viewer.html refresh after each shot
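To make "chain-group-aware parallelism" concrete: groups run side by side, while shots inside a group run in order because a later shot may need the previous shot's last frame. The sketch below only illustrates that scheduling idea; it is not render_all.py's implementation and is not a substitute for it, and `render_groups` and `render_shot` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def render_groups(groups, render_shot, max_workers=4):
    """Run chain groups in parallel; shots within a group stay sequential."""

    def run_group(group):
        # Sequential on purpose: shot k+1 may start from shot k's last frame.
        return [render_shot(shot_id) for shot_id in group]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `groups` in its results.
        return list(pool.map(run_group, groups))
```

The input has the same shape as the storyboard.py graph output shown in the pipeline flow, e.g. [["S01-001", "S01-002"], ["S02-001"]].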
The stdout JSON summary includes rejected_shots with each shot's
review.critique. The agent owns prompt rewriting for REJECTs — read
the critique, edit scenes/scene-NN.json, then re-run with
--rejected-only.
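Reading the critiques out of the summary might look like the sketch below. The exact JSON shape is an assumption: it supposes `rejected_shots` is a list of objects carrying `shot_id` and `review.critique`, so check the real render_all.py output before relying on it. `rejected_critiques` is a hypothetical helper.

```python
import json


def rejected_critiques(summary_json):
    """Map each rejected shot id to its review critique.

    Assumed shape (not confirmed by this document):
    {"rejected_shots": [{"shot_id": "...", "review": {"critique": "..."}}]}
    """
    summary = json.loads(summary_json)
    return {
        shot["shot_id"]: shot["review"]["critique"]
        for shot in summary.get("rejected_shots", [])
    }
```

Use the returned critiques to decide which scenes/scene-NN.json prompts to edit before re-running with --rejected-only.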
You only intervene beyond render_all.py when:
- Escalation: needs_director_rewrite.json appears. Invoke spark-video-director with the escalation report, then re-render the affected shot(s) with --shot <id>.
- Hard failure: check logs/model_calls.jsonl to diagnose, then retry or escalate to the user.
Step 8 — GATE 3: per-shot summary
Verify completeness first. This catches any shot that rendered but never got scored or promoted, which is an easy step to miss:
uv run scripts/gate.py check render # must pass before you ask the user
uv run scripts/build_viewer.py # opens viewer.html — all clips + reviews visible
Then summarise. Once all shots have winner_version set:
# $best = highest review score across attempts; flag shots whose best is below 7
jq '.[] | ([.attempts[] | .review.score] | max) as $best
  | {shot: .shot_id, ver: .winner_version,
     score: $best, below_threshold: ($best < 7)}' \
projects/$SPARK_VIDEO_PROJECT/episode-$SPARK_VIDEO_EPISODE/shots_state.json
Present the per-shot table. Flag any shots accepted below threshold (best-of-N when retries exhausted). Ask user if any should be re-rendered manually before stitch.
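If you prefer to build the per-shot table in Python, for instance to honour a non-default review threshold, a sketch could look like this. It assumes the same fields the jq query reads (shot_id, winner_version, attempts[].review.score); `summarise_shots` is a hypothetical name.

```python
def summarise_shots(shots, threshold=7.0):
    """Return one row per shot with its best score and a below-threshold flag."""
    # shots_state.json may be a list or an id-keyed object; accept both.
    items = shots.values() if isinstance(shots, dict) else shots
    rows = []
    for shot in items:
        best = max(attempt["review"]["score"] for attempt in shot["attempts"])
        rows.append({
            "shot": shot["shot_id"],
            "ver": shot["winner_version"],
            "score": best,
            "below_threshold": best < threshold,  # accepted as best-of-N
        })
    return rows
```

Rows with below_threshold set are the ones to raise with the user before stitching.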
Step 9 — stitch
uv run scripts/stitch.py --crossfade 0.5
stitch.py handles:
- Concatenating all clips/<shot>.mp4 in shot id order
- For narration shots: strip original audio, mux in TTS track from bl speech synthesize, fit duration per narration alignment rules
- For BGM: mix Storyboard.bgm.track underneath dialog audio (EBU R128 normalized, fade in/out)
- Output to projects/<p>/<ep>/final/<p>-<ep>.mp4
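"Shot id order" can be illustrated with a numeric sort on the scene and shot parts of the id. This sketch is not stitch.py's code; it assumes ids of the form S<scene>-<shot> (as in S01-002), and `clips_in_shot_order` is a hypothetical helper.

```python
from pathlib import Path


def clips_in_shot_order(clips_dir):
    """Return clips/<shot>.mp4 paths sorted by (scene number, shot number)."""

    def key(path):
        scene, shot = path.stem[1:].split("-")  # "S01-002" -> ("01", "002")
        return int(scene), int(shot)

    return sorted(Path(clips_dir).glob("*.mp4"), key=key)
```

Sorting numerically rather than as text keeps the order stable even if some ids are not zero-padded.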
Step 10 — GATE 4: final review
uv run scripts/gate.py check final # final mp4 present + viewer.html fresh
# stitch.py already rebuilt + opened viewer.html; if stale, force refresh:
uv run scripts/build_viewer.py
Show:
- Final mp4 path
- Total duration (vs target)
- File size
- viewer.html path (the self-contained production archive)
Ask if user wants to re-render any shots or adjust BGM. If yes, loop back to the relevant step.
Configuration knobs (env vars)
| Var | Default | Meaning |
|---|---|---|
| SPARK_VIDEO_PROVIDER | bl | bl (default, covers happyhorse + wan2.6) or wan27 (fallback for wan2.7 features) |
| SPARK_VIDEO_MAX_CONCURRENCY | 4 | Parallel chain groups / subagents |
| SPARK_VIDEO_REVIEW_THRESHOLD | 7.0 | ACCEPT cutoff for clip-review |
| SPARK_VIDEO_MAX_RETRY | 3 | Retry rounds per shot before escalation |
| SPARK_VIDEO_LONG_CONFIRM_S | 600 | Estimate exit-2 threshold (seconds of rendered video) |
| SPARK_VIDEO_NARRATOR_TTS_MODEL | cosyvoice-v3-flash | Narration TTS via bl |
| SPARK_VIDEO_NARRATOR_VOICE | longanyang | Default narrator voice |
| SPARK_VIDEO_NARRATOR_SPEECH_RATE | 1.2 | Default speech rate (0.5–2.0) |
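A harness written in Python could read these knobs with the documented defaults as shown below. This is a sketch, not the scripts' own loader: `load_config` and the lowercase result keys are hypothetical, and only the variable names, defaults and the 0.5 to 2.0 speech-rate range come from the table.

```python
import os

# Defaults copied from the configuration table.
DEFAULTS = {
    "SPARK_VIDEO_PROVIDER": "bl",
    "SPARK_VIDEO_MAX_CONCURRENCY": "4",
    "SPARK_VIDEO_REVIEW_THRESHOLD": "7.0",
    "SPARK_VIDEO_MAX_RETRY": "3",
    "SPARK_VIDEO_LONG_CONFIRM_S": "600",
    "SPARK_VIDEO_NARRATOR_TTS_MODEL": "cosyvoice-v3-flash",
    "SPARK_VIDEO_NARRATOR_VOICE": "longanyang",
    "SPARK_VIDEO_NARRATOR_SPEECH_RATE": "1.2",
}


def load_config(env=None):
    """Read the SPARK_VIDEO_* knobs, falling back to the documented defaults."""
    env = os.environ if env is None else env
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    rate = float(raw["SPARK_VIDEO_NARRATOR_SPEECH_RATE"])
    if not 0.5 <= rate <= 2.0:
        raise ValueError(f"speech rate {rate} is outside 0.5-2.0")
    return {
        "provider": raw["SPARK_VIDEO_PROVIDER"],
        "max_concurrency": int(raw["SPARK_VIDEO_MAX_CONCURRENCY"]),
        "review_threshold": float(raw["SPARK_VIDEO_REVIEW_THRESHOLD"]),
        "max_retry": int(raw["SPARK_VIDEO_MAX_RETRY"]),
        "long_confirm_s": int(raw["SPARK_VIDEO_LONG_CONFIRM_S"]),
        "tts_model": raw["SPARK_VIDEO_NARRATOR_TTS_MODEL"],
        "voice": raw["SPARK_VIDEO_NARRATOR_VOICE"],
        "speech_rate": rate,
    }
```

Passing a plain dict instead of os.environ makes the parsing easy to check in isolation.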
Handling user "no" at any gate
The pattern is always: listen → identify scope → invoke right sub-skill → re-show. Examples:
- "Script is weak, 钱夫人 needs more bite" at GATE 1 → invoke spark-video-screenwriter with scope = which scenes, plus the user's note. Re-compile script.md, re-show.
- "S03-002 is too dark" at GATE 3 → don't re-render the whole storyboard. Just uv run scripts/render_shot.py --shot S03-002 --force --reset-attempts (auto-runs clip-review). Re-show updated shot.
- "BGM is too loud" at GATE 4 → edit Storyboard.bgm.volume (or bgm-config.json), re-run uv run scripts/stitch.py.
DON'Ts
- ❌ Don't skip any gate. The user owns the creative/budget decisions. Skip only when the corresponding --vfx / --mode / --provider flag was passed.
- ❌ Don't render before storyboard.py validate passes. Renders are expensive; validation is free.
- ❌ Don't render before storyboard.py estimate is shown to the user at GATE 2. If estimate exits 2 (over SPARK_VIDEO_LONG_CONFIRM_S), surface that explicitly.
- ❌ Don't call bl directly anywhere. Always use ./scripts/bl so the call lands in logs/model_calls.jsonl. Same rule for any subagent you spawn.
- ❌ Don't auto-accept escalations. When needs_director_rewrite.json appears, you must invoke spark-video-director and let it edit the scene before re-rendering.
- ❌ Don't proceed past a chain group that has a hard render failure. Diagnose first (read logs/model_calls.jsonl).
- ❌ Don't fan out beyond SPARK_VIDEO_MAX_CONCURRENCY. You will hit provider rate limits and fail the whole batch.
- ❌ Don't write script.md or storyboard.json yourself. Always go through uv run scripts/storyboard.py compile so validation runs.
- ❌ Don't present a gate to the user before gate.py check <gate> passes (or you've explicitly surfaced the failing checks to them). This is cheap insurance against shipping a stage with a skipped step (unscored clips, missing viewer.html, …).
- ❌ Don't hand-stitch with raw ffmpeg. Always use uv run scripts/stitch.py, which also (re)builds viewer.html. Skipping it is the usual reason the viewer is missing/stale at GATE 4.