name: banana-grid-shot-prompt description: Use when converting shot-design results into Banana 3x3 storyboard grid JSON prompts for the current flashxiaolagu mainline. This skill is assembly-only and should write short natural-language cinematic beats, not rigid keyword chains.
Banana Grid Shot Prompt
Overview
This skill converts shot-design.json into machine-readable 3x3 storyboard grid prompt batches.
It is not a creative writing stage.
Its job is only:
- read shot-design
- compress each panel into a short cinematic beat
- group panels into 3x3 batches
- preserve reference image mapping and global constraints
Authority
Follow:
${OPENCLAW_HOME:-~/.openclaw}/agents/flashxiaolagu制作/production-optimization-spec.md${OPENCLAW_HOME:-~/.openclaw}/agents/drama-producer/knowledge/03-translation/banana-pro-translation.md
If there is a conflict, the current flash spec wins.
Input
analysis/film-type-analysis.jsonanalysis/shot-design.json- generated asset paths for characters / scenes / props
This stage must inherit:
selected_option.ratio_designselected_option.style_design
If shot-design.json already carries explicit per-shot aspect_ratio, that shot-level value is the authority for that shot.
If upstream ratio data is missing or contradictory, stop and flag the mismatch instead of guessing.
Output
prompts/banana-grid-shot-prompts.mdprompts/banana-grid-shot-prompts.json
Core Rules
1. Assembly only
Do not invent:
- new shots
- new staging logic
- new emotions not present in shot-design
- new camera grammar not present in shot-design
2. 3x3 is the default batch
Each batch should target:
grid_layout: 3x3grid_aspect_ratio: inherited project ratio
Pack slots into exact 3x3 batches whenever possible.
Default rule:
- the project ratio comes from the selected type-analysis option
grid_aspect_ratiomust match that inherited ratio- do not hard-code
9:16
If the scene contains start_end_to_video shots, their extra end-frame slots are part of the same sequence and should be included before considering any filler shot.
3. Each panel stays short, but natural
Each prompt_text should usually stay within about 20-35 English words.
It should read like a tiny cinematic action beat, not like a rigid tag chain.
4. Reference images must be explicit in JSON, not spammed in every prompt
The JSON output must include structured reference image paths.
But inside each prompt_text, do not mechanically repeat 图1 / 图2 / 图3 unless it is truly necessary to disambiguate.
This model has semantic understanding.
Reference image paths must match the actual asset files on disk.
Do not blindly switch .jpg to .png or the reverse.
5. Global constraint is mandatory
Every batch must include a strong shared constraint string covering:
- borderless seamless grid
- no black border
- no black frame
- no panel line
- no subtitles
- no watermark
- no timecode
- stable light / color / contrast from
StyleDesign - preserve axis and continuity from the established scene references
6. Video routing must flow through this stage
If a shot in shot-design.json is marked:
video_generation_mode: image_to_video
then this stage should emit one slot:
slot_type: first_frame
If a shot is marked:
video_generation_mode: start_end_to_video
then this stage should emit two slots in story order:
slot_type: first_frameslot_type: last_frame
The end frame is not optional in that case, because downstream video generation needs a real image anchor.
7. Frame registry is required
The JSON output must also include a machine-readable frame_registry section.
For each shot_id, it should record:
video_generation_modefirst_frame.batch_idfirst_frame.panel_indexfirst_frame.expected_frame_imagelast_frame.batch_idlast_frame.panel_indexlast_frame.expected_frame_image
if a last frame exists.
This registry is the handoff contract to veo-video-prompt.
The top-level JSON should also explicitly restate:
ratio_designstyle_design
so downstream stages do not need to guess which project branch was selected.
8. Final batch should avoid empty panels
If all required first_frame and last_frame slots are packed and the final 3x3 batch is still short, you may add a continuity_hold slot as a last resort.
Rules:
- it must come from an existing shot already present in
shot-design.json - it must be derived from that shot's existing first-frame or last-frame state
- it must not invent a new beat, new action, or new shot
- it must not change
video_generation_mode - it exists only to avoid an empty ninth panel in the final grid page
Prompt Compression Rules
Each prompt_text should contain only what is necessary:
- shot size
- the visual action beat
- who or what dominates the frame
- optional camera height or angle if important
- short atmosphere phrase if helpful
Avoid:
- rigid
图1 图2 图3repetition in every slot - pseudo-Qwen coordinate language
- over-specified mechanical geometry when the model can infer the shot from story semantics
- long explanation paragraphs
Validation Checklist
- batch layout is always
3x3 - every batch has
reference_images - every batch has
global_constraint - every batch has inherited
grid_aspect_ratio start_end_to_videoshots have both first-frame and last-frame slotsframe_registryexists and matches the slot packing- any
continuity_holdslot is derived from an existing shot state and does not alter video routing - every slot has
slot_index - every slot has
source_shot_id - every slot has
slot_type - every slot has short English
prompt_text - no batch exceeds 9 slots
- top-level
ratio_designandstyle_designmatch the selected type-analysis option