name: script-compose
description: >-
  Handles explicit screenplay/story work on the filmmaking canvas. Triages screenplay (use verbatim), story/concept (iterate then rewrite), or neither (defer). Captures the final script note/title; on explicit command, splits into <=15s shot notes and extracts characters, variants, locations, and speaking/VO needs. Use for writing, adapting, rewriting, splitting, analyzing, or breaking down scripts/stories. Preserves dialogue verbatim and returns multi-stage planning to story-to-video-workflow before media generation. Does not split/analyze on file drop or without explicit intent.
Run only on explicit user intent, never on file drop. Dropped text/PDF already exists as a note (data.body) and mirror (./assets/notes/<note_id>.md).
Defaults: a 30s beat is one moment; match the input language; with characters, prefer meaningful dialogue and let narration support rather than carry the scene, but let wordless action carry a beat instead of packing every one with dialogue.
Stop at script capture, shot notes, and anchor extraction. Route multi-stage work back to story-to-video-workflow.
Capture target duration when observable:
- Explicit user duration wins ("30 seconds", "2-minute short").
- Timestamp blocks come next; sum them.
- Otherwise estimate roughly and mark it as an estimate.
Store target_duration_sec and duration_basis when known. If implied runtime is >~3 minutes, call out scope before shot/video planning.
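The priority order above can be sketched as a small resolver. This is a minimal illustration, not the skill's actual code: the function name, the (start, end) tuple shape for timestamp blocks, the basis labels other than "estimated from script length", and reading ">~3 minutes" as 180 seconds are all assumptions.

```python
from typing import Optional, Sequence, Tuple

# Assumption: ">~3 minutes" is read as a 180-second threshold.
SCOPE_CALLOUT_SEC = 180


def resolve_duration(
    explicit_sec: Optional[float],
    timestamp_blocks: Sequence[Tuple[float, float]],
    estimate_sec: Optional[float],
) -> Optional[dict]:
    """Pick a target duration: explicit user value, then timestamps, then estimate."""
    if explicit_sec is not None:
        seconds, basis = explicit_sec, "explicit user duration"
    elif timestamp_blocks:
        # Sum the length of each (start, end) block, in seconds.
        seconds = sum(end - start for start, end in timestamp_blocks)
        basis = "summed timestamp blocks"
    elif estimate_sec is not None:
        seconds, basis = estimate_sec, "estimated from script length"
    else:
        return None  # no defensible signal: store nothing
    return {
        "target_duration_sec": seconds,
        "duration_basis": basis,
        # Flag scope before shot/video planning when the runtime is long.
        "scope_callout": seconds > SCOPE_CALLOUT_SEC,
    }
```

An explicit "30 seconds" wins even when timestamp blocks sum to something else; the timestamp sum is used only when no explicit duration was given.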
1. Triage → Capture
Classify the input, then capture as in §2. Never skip straight to §3.
- Screenplay (INT./EXT. + ALL-CAPS cues + dialogue) → use verbatim. For dropped text/PDF, read workflow.json data.body or ./assets/notes/<note_id>.md. Pick a 2–5 word title; identify duration basis; do not rewrite to fit.
- Story / concept (prose, pitch, logline) → sketch ONE paragraph back (setting, characters, conflict, target duration) and ask if it's the shape. Iterate. On "yes/go", rewrite using the rules below, then capture.
- Neither → don't run; defer to image-compose/video-compose.
Torn between screenplay and story? Prefer screenplay — safer than rewriting.
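As a rough first signal for the screenplay branch, the surface markers named above (slugline, ALL-CAPS cue, dialogue) can be checked mechanically. The sketch below is only a heuristic under assumed patterns; the regexes and the function name are hypothetical, and the final call remains a judgment, with ties going to screenplay.

```python
import re

# Assumed patterns: a slugline starts with INT. or EXT.; a cue is a short
# ALL-CAPS line, optionally followed by (V.O.), (O.S.), or (CONT'D).
SLUGLINE = re.compile(r"^(INT|EXT)\.")
CUE = re.compile(r"^[A-Z][A-Z0-9 .'\-]{0,40}(\s*\((V\.O\.|O\.S\.|CONT'D)\))?$")


def looks_like_screenplay(text: str) -> bool:
    """True when the text has a slugline and a cue followed by a dialogue line."""
    lines = [line.strip() for line in text.splitlines()]
    has_slugline = any(SLUGLINE.match(line) for line in lines)
    # A cue counts only if it is not itself a slugline and the next line is non-empty.
    has_cue = any(
        CUE.match(cue) and not SLUGLINE.match(cue) and following
        for cue, following in zip(lines, lines[1:])
    )
    return has_slugline and has_cue
```

A False result does not separate story/concept from neither; that distinction still depends on reading the input.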
Rewrite rules (story → screenplay):
- Format: INT./EXT. LOCATION - TIME slug, present-tense action, ALL-CAPS cue + dialogue. No scene numbering. No camera directions (that's video-compose).
- Preserve user-quoted dialogue verbatim.
- With characters, include dialogue that reveals motive/conflict/relationship; avoid narration-only exposition unless VO-driven.
- Pace speech at ~2.2-2.5 words/sec plus reaction/action room.
- Duration: match if stated; default 30–45s. Don't overshoot.
- Short input, longer target? Keep verbatim and ask "reads as ~Ns; extend?" — don't silently pad.
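The pacing rule lends itself to a quick arithmetic check before committing to a rewrite. The following sketch assumes a flat 1-second reaction/action allowance per line, which is an illustrative default and not a value from this skill; the function names are hypothetical.

```python
from typing import List, Tuple

SLOW_WPS = 2.2  # words per second, lower bound from the pacing rule
FAST_WPS = 2.5  # words per second, upper bound from the pacing rule


def speech_seconds(dialogue: str, room_sec: float = 1.0) -> Tuple[float, float]:
    """Return (fastest, slowest) seconds for a line, plus reaction/action room."""
    words = len(dialogue.split())
    return words / FAST_WPS + room_sec, words / SLOW_WPS + room_sec


def overshoots(lines: List[str], target_sec: float, room_sec: float = 1.0) -> bool:
    """True when even the fastest read of all lines exceeds the target."""
    fastest_total = sum(speech_seconds(line, room_sec)[0] for line in lines)
    return fastest_total > target_sec
```

For example, an 11-word line takes roughly 4.4 to 5.0 seconds of speech before any reaction room is added, so a 30–45s default holds only a handful of such lines.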
2. Capture — canvas note + title
ONE note. No split. Canvas writes go through the mutator, never direct workflow.json.
- read ./workflow.json (read-only inspection: see if title is already set).
- Append the script note via the mutator with subtype: "script":

      node "$PAI_REPO_ROOT/server/cli/canvas_mutate.js" \
        --op addNode \
        --payload-json '{"node":{"type":"note","data":{"subtype":"script","label":"Script: <title>","body":"<full screenplay verbatim>","metadata":{"author":"agent","timestamp":"<ISO>","target_duration_sec":45,"duration_basis":"estimated from script length"}}}}'

  Omit target_duration_sec/duration_basis only when there is no defensible signal. Stdout returns assigned.node_id; keep it for §3 (shots derive from this id).
- Set the workflow title if empty:

      node "$PAI_REPO_ROOT/server/cli/canvas_mutate.js" --op setTitle --payload-json '{"title":"<title>"}'

- Confirm with "Captured.", then offer the next step as a choice rendered per the project PROJECT_AGENT.md § "Recommendation and choice shape". Recommended option: "Split it into <=15s shots and extract characters/locations/voices." Plus an escape to do something else.
STOP. Do NOT proceed to §3 without an explicit user command.
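Because the script body is passed verbatim inside a JSON string on a shell command line, quoting is the easiest place for capture to go wrong (apostrophes, newlines). One way to avoid hand-escaping is to build the payload as data and let a library do the quoting. The sketch below uses only the field names that appear in the capture step; the helper names are hypothetical, and it renders the command text without running it.

```python
import json
import shlex
from typing import Optional


def build_script_payload(
    title: str,
    body: str,
    timestamp: str,
    target_duration_sec: Optional[float] = None,
    duration_basis: Optional[str] = None,
) -> dict:
    """Build the addNode payload for one script note; body is kept verbatim."""
    metadata = {"author": "agent", "timestamp": timestamp}
    # Duration fields are left out only when there is no defensible signal.
    if target_duration_sec is not None and duration_basis:
        metadata["target_duration_sec"] = target_duration_sec
        metadata["duration_basis"] = duration_basis
    return {
        "node": {
            "type": "note",
            "data": {
                "subtype": "script",
                "label": f"Script: {title}",
                "body": body,
                "metadata": metadata,
            },
        }
    }


def add_node_command(payload: dict) -> str:
    """Render the addNode call as one shell line with the JSON safely quoted."""
    script = '"$PAI_REPO_ROOT/server/cli/canvas_mutate.js"'
    quoted = shlex.quote(json.dumps(payload))
    return f"node {script} --op addNode --payload-json {quoted}"
```

Splitting the rendered line back with a POSIX shell parser should yield the original JSON unchanged, which is a cheap check that dialogue such as "You're late." survived the quoting.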
3. Analyze — on explicit user command
Triggers (judge intent): "split into shots / clips", "break this up", "pull the characters / locations", "who's in this", "analyze this script", "design the characters from this script". Not triggers: "what's in this", "summarize", "tell me about it" — those are read-and-reply.
When triggered:
- Slug: kebab-case of the working title. Collision → suffix -2, -3.
- Shot splits (≤15s each): use metadata.target_duration_sec or estimate. Split on natural beats (slug/dialogue/location/time/appearance changes). For >15s material, keep resulting shots as close to 15s as natural (default ≈ ceil(total_seconds / 15) shots); split shorter only for hard cuts, dialogue turns, continuity shifts, or strong beats. Don't over-fragment just because the script's own time markers say so. Pace speech at ~2.2-2.5 words/sec plus reaction/action room; silent action ~3–5s. If dialogue cannot fit naturally, split it; reduce only when the user asked for compression. Never rewrite: shot bodies are verbatim slices. Each shot note has subtype: "shot". Build one addBatch with N shot notes + N derived edges:

      node "$PAI_REPO_ROOT/server/cli/canvas_mutate.js" \
        --op addBatch \
        --payload-json '{
          "nodes": [
            {"type":"note","data":{"subtype":"shot","label":"Shot 1 (0–15s)","body":"<slice>","metadata":{"author":"agent","timestamp":"<ISO>"}}},
            {"type":"note","data":{"subtype":"shot","label":"Shot 2 (15–30s)","body":"<slice>","metadata":{"author":"agent","timestamp":"<ISO>"}}}
          ],
          "edges": [
            {"from":"<script_note_id>","to":"$0","kind":"derived"},
            {"from":"<script_note_id>","to":"$1","kind":"derived"}
          ]
        }'

  $N placeholders are 0-indexed positions in nodes; the mutator resolves them to the assigned ids after running. Reply's assigned.node_ids is the array of shot ids in the same order.
- Anchor extraction: from the shot bodies, extract only downstream needs:
  - Characters: recurring/visually important people/entities. Include one-line base visuals only when given.
  - Variants: same character with materially different on-screen look by scene/shot: age jump, costume change, injury, disguise, transformation, wet/dirty/bloodied state if it must persist across shots. Do not create variants for transient expressions or tiny props.
  - Locations: distinct settings plus same-setting variants when framing/scale, time, weather, lighting, dressing, story state, or close/detail coverage matters.
  - Voices: every speaking character and narration/V.O.; preserve speaker labels and dialogue language.
  - Missing anchors: first character, variant, location, or voice that blocks rendering Shot 1.
- Parse offer: ONE compact planning line plus a soft next step:

      Plan check: ~<seconds>s, <shots> shots, <N> character(s), <V> variant(s), <M> location(s), <S> voice need(s). Missing: <first blocker>.

  If N>0, V>0, M>0, or S>0, offer next step with project choice shape. Recommended: "Design the character/location anchors, then voices." Anchors include base/variant character sheets, detailed location/location variants, and speaker/VO voice anchors. On approval, route to image-compose first (base character sheets, needed character variants, and location stills) with --source-node-id <script_note_id> so the new nodes wire back to the script. After image anchors land, route speaking/narration needs to voice-compose. Don't generate inside script-compose. Skip the offer if every count is 0.
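For the shot-split step above, the default shot count and the addBatch payload with its $N placeholders can be assembled programmatically once the verbatim slices are chosen. This is a sketch under assumptions: the helper names are hypothetical, each slice arrives with a whole-second duration already decided by the split rules, and the payload shape simply mirrors the example in that step.

```python
import math
from typing import List, Tuple

MAX_SHOT_SEC = 15  # each shot note covers at most 15 seconds


def default_shot_count(total_seconds: float) -> int:
    """Default number of shots: ceil(total_seconds / 15), at least one."""
    return max(1, math.ceil(total_seconds / MAX_SHOT_SEC))


def build_shot_batch(
    script_note_id: str, slices: List[Tuple[str, int]], timestamp: str
) -> dict:
    """Build N shot notes plus N derived edges from (verbatim_body, seconds) pairs."""
    nodes, edges = [], []
    start = 0
    for index, (body, seconds) in enumerate(slices):
        if seconds > MAX_SHOT_SEC:
            raise ValueError(f"shot {index + 1} is {seconds}s; split it further")
        end = start + seconds
        nodes.append({
            "type": "note",
            "data": {
                "subtype": "shot",
                "label": f"Shot {index + 1} ({start}–{end}s)",
                "body": body,  # verbatim slice, never rewritten
                "metadata": {"author": "agent", "timestamp": timestamp},
            },
        })
        # "$<index>" is the 0-indexed position of the shot in "nodes".
        edges.append({"from": script_note_id, "to": f"${index}", "kind": "derived"})
        start = end
    return {"nodes": nodes, "edges": edges}
```

A 40-second script defaults to three shots; whether they land as 15/15/10 or follow a different natural beat is still decided by the split rules, not by this helper.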
If the user's command was narrower ("just the shots", "only characters"), do only that sub-step and skip the offer.
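The planning line and the rule for skipping the offer are easy to keep consistent if they come from one place. A minimal sketch, assuming a hypothetical `plan_check` helper; dropping the "Missing:" clause when nothing blocks Shot 1 is an assumption, since the template always shows a blocker.

```python
from typing import Optional, Tuple


def plan_check(
    seconds: int,
    shots: int,
    characters: int,
    variants: int,
    locations: int,
    voices: int,
    missing: Optional[str] = None,
) -> Tuple[str, bool]:
    """Return the one-line plan check and whether to offer the next step."""
    line = (
        f"Plan check: ~{seconds}s, {shots} shots, {characters} character(s), "
        f"{variants} variant(s), {locations} location(s), {voices} voice need(s)."
    )
    if missing:
        line += f" Missing: {missing}."
    # Offer anchor design only when at least one count is non-zero.
    offer = any(count > 0 for count in (characters, variants, locations, voices))
    return line, offer
```

When the user asked for a narrower sub-step, the caller would simply not call this helper at all.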
4. Revisions
Surgical (title still fits): update script-note body + affected shot bodies in place. Use the mutator's updateNode op (one call per node, or batched via updateBatch).
Structural (title no longer fits): new script note (addNode); old→new edge addEdge with kind:"derived"; new shot family via addBatch against the new script note. Leave old shots; delete only if asked (deleteNode cascades edges for you).