visual-storyboard

By koi-language · Updated 6/10/2026

name: visual-panels
description: >
  Render a 4K VISUAL PANEL SHEET (a single composite IMAGE: a clean full-bleed grid of numbered panels, each cell is the panel image edge-to-edge plus a plain corner number) by composing the image-generation prompt (a single multi-section block tuned for Nano Banana Pro / GPT Image 2) and calling generate_image. The deliverable is the rendered image, never a raw prompt for the user to copy.
  This skill does NOT author or plan a story: the editable SOURCE storyboard is the interactive-storyboard JSON, and this skill renders that finished JSON into a visual sheet. THE CYCLE is always: interactive storyboard → visual panels → video. Do NOT skip the interactive storyboard. If the user wants panels (or a video) for a multi-shot story but there is NO storyboard yet, author the interactive-storyboard FIRST; render straight from a bare idea (no storyboard) ONLY when the user EXPLICITLY asks for just the image/sheet and not a storyboard.
  Use ONLY when the user explicitly wants the VISUAL / IMAGE output (a "panel sheet", "visual panels", "panel layout", "render the storyboard as an image", "break this story into panels", panels they can SEE), or when they pair "storyboard" with an image / video generation tool (Nano Banana Pro, GPT Image, Midjourney, DALL-E, Seedance, Kling, Sora, Veo, Runway, Luma, Hailuo, Wan, Higgsfield, Flux), or upload character references for a visual. Do NOT use this skill for a bare "make me a storyboard" / "shot list" / "scene plan" with no mention of images: that is the interactive JSON (interactive-storyboard), not this skill.
  Works for any visual style: 3D animation, live-action, anime, 2D animation, stop-motion, editorial, comic book, or any other aesthetic. To animate the approved sheet into the final video (per-clip prompt + rendering + timeline assembly), see visual-panels-to-video.

Visual Panels

Turn a story idea + character references into a 4K visual panel sheet: a single composite image, a clean full-bleed grid of numbered panels (each cell is the panel image edge-to-edge plus a plain number in the top-left corner).

This skill is end-to-end — it composes the image prompt AND calls generate_image to render the sheet. The deliverable is the rendered image, NOT a prompt string for the user to copy elsewhere.
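The routing rule from this skill's description (interactive storyboard first, then visual panels, then video) can be sketched as a small decision function. This is a minimal TypeScript sketch, not part of the skill runtime: `PanelRequest`, `Route`, and `chooseRoute` are hypothetical names, and the three booleans stand in for judgments the agent makes from the conversation.

```typescript
// Hypothetical model of the routing rule; none of these names exist in the runtime.
type Route =
  | "interactive-storyboard"   // author or keep working on the editable JSON
  | "visual-panels"            // render the finished storyboard JSON as a sheet
  | "visual-panels-from-idea"; // render straight from a bare idea

interface PanelRequest {
  wantsVisualOutput: boolean;   // asked for an image/sheet, named an image/video tool, or uploaded refs for a visual
  hasStoryboard: boolean;       // a finished interactive-storyboard JSON already exists
  explicitlyImageOnly: boolean; // explicitly asked for just the sheet, not a storyboard
}

function chooseRoute(req: PanelRequest): Route {
  // A bare "make me a storyboard" with no mention of images is not this skill.
  if (!req.wantsVisualOutput) return "interactive-storyboard";
  // The normal cycle: render the storyboard that already exists.
  if (req.hasStoryboard) return "visual-panels";
  // No storyboard yet: author it first unless the user explicitly wants only the image.
  return req.explicitlyImageOnly ? "visual-panels-from-idea" : "interactive-storyboard";
}
```

The only path that skips the interactive storyboard is the explicit image-only request, which matches the "ONLY when the user EXPLICITLY asks" wording above.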

⛔ Style: USE what the user already gave you — ask ONLY when it's genuinely missing

The visual style is a USER decision: NEVER infer it from the brand, topic, or vibe (a Chanel storyboard is not automatically "live-action luxury"; a Pokémon one is not "anime"; a kids' product is not "3D family-film"). But "don't infer" does NOT mean "always ask". First GATHER the style the user already provided; ask ONLY if none exists. Re-asking for a style the user already gave is its own reported bug ("why is it asking me for the style again if I already told it / if it's already in the storyboard?").

GATHER the style from ALL of these before deciding — use the first that is explicit:

  1. Anything the user said in THIS conversation, not just their first/latest message. "in anime style", "photorealistic", "like a movie", "the same style as before" all count, wherever they said it.
  2. The source interactive storyboard's stylePrompt — when NON-EMPTY it IS an explicit choice the user made upstream (the visor's style field). USE it; do NOT re-ask. (An EMPTY stylePrompt is the only "no choice" case → then style is genuinely missing.)
  3. The brief / # WORKING AREA context the user pointed you at.

If an explicit style exists in ANY of the above → proceed with it, NO form. Raise the prompt_form style picker (Step 1 of STORYBOARD_ANATOMY.md: the 3 presets + custom + optional character-ref pickers + notes) ONLY when NONE of the sources carry an explicit style.

Same rule for character references — pull them from the storyboard's references / the user's attachments; do NOT re-ask for a photo you already have (see INPUTS). General principle: STOP asking for anything you can already read from the conversation, the storyboard JSON, or the working area — ask only the genuine unknowns.
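The gather-then-ask order for style can be sketched as a first-explicit-wins lookup. This is an illustrative TypeScript sketch under stated assumptions: `StyleSources`, `StyleDecision`, and `resolveStyle` are hypothetical names, and treating a whitespace-only value as empty is an assumption, not something the spec states.

```typescript
// Hypothetical container for the three places a style can come from.
interface StyleSources {
  conversationStyle?: string;     // 1. anything the user said in this conversation
  storyboardStylePrompt?: string; // 2. stylePrompt of the source interactive storyboard
  briefStyle?: string;            // 3. the brief / working-area context
}

type StyleDecision =
  | { kind: "use"; style: string; source: keyof StyleSources }
  | { kind: "ask" }; // nothing explicit anywhere: raise the prompt_form style picker

function resolveStyle(sources: StyleSources): StyleDecision {
  // Priority order mirrors the numbered list: conversation, storyboard, brief.
  const order: Array<keyof StyleSources> = [
    "conversationStyle",
    "storyboardStylePrompt",
    "briefStyle",
  ];
  for (const source of order) {
    // Assumption: a blank or whitespace-only value counts as "no choice".
    const value = sources[source]?.trim();
    if (value) return { kind: "use", style: value, source };
  }
  return { kind: "ask" };
}
```

For example, `resolveStyle({ storyboardStylePrompt: "", briefStyle: "claymation" })` falls through the empty stylePrompt and uses the brief, and only an input with no explicit style in any source yields `{ kind: "ask" }`.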

This skill's reference files (read them, don't paraphrase from memory)

The authoritative specs live in this skill's own references/ directory. When the skill is activated, the runtime returns the skill's absolute directory plus a resources list — read each file from <that directory>/references/<file> (or list_skills → this skill's directory). NEVER hardcode ~/.koi/skills/...; in a dev checkout the skill resolves to the plugin repo path, so always use the activation-returned directory.

  • references/STORYBOARD_ANATOMY.md — the authoritative Phase 1 spec. The 6 steps (gather inputs → analyse references → break the story into beats → compose the prompt with sections A–H → call generate_image at 4K → companion note), the grid chooser, the per-section prompt template, the length targets, and the handling-variations table. Read this first before writing any prompt; the SKILL.md you're reading right now is just the entrypoint.
  • references/STYLE_PRESETS.md — the 3 official visual style presets (Premium 3D / Claymation / Realistic UGC) with ready-to-paste phrasing blocks for section B of the prompt, plus the custom-style flow for anything else (anime, live-action, watercolor, cyberpunk, …). POV is not a style here — it's a per-shot camera angle that combines with any of the styles.
  • references/VIDEO_TYPE_<TYPE>.md — five per-type spec files (ad / explainer / tutorial / demo / social-post). Read ONLY the one matching the user-named type for its brief-context note (internal — informs the panels, NOT rendered), the caption style, the shot mix and the audio cue. Never read all five. Skip entirely when the user didn't name a video type.
  • Sheet FORMAT lives in the prose of section E. The deliverable is a clean full-bleed grid: each panel's image fills its cell edge-to-edge, panels separated ONLY by thin black gutter lines, a plain number in the top-left corner of each panel, and the grid runs to all four margins. NOTHING else: no title banner, no cards, no drop shadows, no caption bars, no timecodes, no number badges, no footer/legend. See STORYBOARD_ANATOMY.md → section E for the exact wording. (references/LAYOUT_TEMPLATE.png is dead: the format used to be copied from that attached skeleton. Do NOT attach it or any other format reference.)
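The path rule and the "read only the matching VIDEO_TYPE file" rule above can be sketched as two helpers. This is a rough TypeScript sketch with hypothetical names (`referencePath`, `pickVideoTypeFile`, `normalizeType`). It assumes the activation result exposes the resources as an array of path strings and that paths use forward slashes; the real shape of that list is not specified here. Matching against the resources list avoids guessing how `<TYPE>` is spelled in the actual file names.

```typescript
// Builds <activation directory>/references/<file>; never a hardcoded home path.
function referencePath(activationDir: string, file: string): string {
  const base = activationDir.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/references/${file}`;
}

// Lowercases and strips separators so "social-post" and "SOCIAL_POST" compare equal.
function normalizeType(s: string): string {
  return s.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Returns the single VIDEO_TYPE_* resource matching the user-named type, if any.
function pickVideoTypeFile(resources: string[], videoType?: string): string | undefined {
  if (!videoType) return undefined; // user named no video type: skip entirely
  const wanted = normalizeType(videoType);
  for (const resource of resources) {
    const match = /VIDEO_TYPE_(.+)\.md$/i.exec(resource);
    if (match !== null && normalizeType(match[1] ?? "") === wanted) return resource;
  }
  return undefined; // no matching file listed: read none rather than all five
}
```

Returning at most one file keeps the "never read all five" constraint structural rather than relying on the caller to remember it.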

⚠ STAMP THE SOURCE-STORYBOARD METADATA — non-negotiable

Every generate_image call this skill makes MUST carry metadata that declares where the sheet came from. This is what lets the downstream visual-panels-to-video skill recover per-shot durations / dialogue / SFX exactly as the user set them in the visor. Without this link the next step has to guess from pixels and may confabulate (the reported bug: "it suddenly changes topic; it was an old man with a watch and it said it was SOC 2").

Two cases, ONE field. Pick the right one and ALWAYS pass it:

  • Source = interactive storyboard JSON (most common — the user has a storyboard open in the visor and asked to render it visually):

    metadata: {
      sourceStoryboard: "/Users/.../.koi/storyboards/<id>.json",  // ← absolute path
      storyboardPart: K,           // 1-based SHEET index (1 if single-sheet)
      storyboardParts: K_total,    // total SHEETS (1 if single-sheet)
      storyboardShotIds: ["sh1","sh2", …],  // ALL shot ids on THIS sheet (union across its clips)
      // panel→clip map — a sheet can hold several clips; this is what lets
      // visual-panels-to-video render one generate_video per clip from the
      // right panels. clipIndex is GLOBAL (1-based, timeline order across sheets).
      // See references/STORYBOARD_ANATOMY.md → Chunking → Step B.
      clips: [
        { clipIndex: 1, shotIds: ["sh1","sh2"], panels: [1,2,3], durationSec: 12 },
        { clipIndex: 2, shotIds: ["sh3"],        panels: [4,5],   durationSec: 8  }
      ]
    }
    

    You already called read_file on the JSON to compose the prompt; its absolute path is what you pass.

  • Source = idea + refs only (no JSON: the user's first message was "make me a storyboard of X" with refs / notes, never an interactive JSON):

    metadata: {
      storyboardOrigin: "idea"
    }
    

    This explicitly declares that there is no JSON to link to, which keeps the audit trail clean.
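The two metadata shapes above can be modelled as a union type with a small builder. This is a TypeScript sketch under stated assumptions: the field names come from the blocks above, while `ClipEntry`, `SheetMetadata`, `buildSheetMetadata`, and the shape of its input are hypothetical. The absolute-path check is POSIX-style only, which is a simplification.

```typescript
interface ClipEntry {
  clipIndex: number;   // GLOBAL, 1-based, timeline order across sheets
  shotIds: string[];   // shots covered by this clip
  panels: number[];    // panel numbers on THIS sheet
  durationSec: number;
}

type SheetMetadata =
  | {
      sourceStoryboard: string;    // absolute path to the storyboard JSON
      storyboardPart: number;      // 1-based sheet index
      storyboardParts: number;     // total sheets
      storyboardShotIds: string[]; // union of shot ids across this sheet's clips
      clips: ClipEntry[];
    }
  | { storyboardOrigin: "idea" };

// Pass null when the sheet comes from an idea + refs with no storyboard JSON.
function buildSheetMetadata(
  source: { path: string; part: number; parts: number; clips: ClipEntry[] } | null
): SheetMetadata {
  if (source === null) return { storyboardOrigin: "idea" };
  // Assumption: POSIX-style absolute paths, as in the example above.
  if (source.path.charAt(0) !== "/") throw new Error("sourceStoryboard must be an absolute path");
  if (source.part < 1 || source.part > source.parts) throw new Error("storyboardPart out of range");
  // Union of shot ids across clips, keeping first-seen order.
  const shotIds: string[] = [];
  for (const clip of source.clips) {
    for (const id of clip.shotIds) {
      if (shotIds.indexOf(id) === -1) shotIds.push(id);
    }
  }
  return {
    sourceStoryboard: source.path,
    storyboardPart: source.part,
    storyboardParts: source.parts,
    storyboardShotIds: shotIds,
    clips: source.clips,
  };
}
```

Deriving `storyboardShotIds` from the clips, rather than passing it separately, keeps the sheet-level list and the panel→clip map from drifting apart.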

Never call generate_image from this skill without one of those two metadata shapes. Both downstream tools and image-lineage notes depend on this. The runtime now logs a loud warning when this skill's call signature (label: "visual_storyboard") is missing both — don't ignore the warning, fix the call.
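One way to make the rule hard to violate is to check the metadata before the call arguments are assembled. The sketch below is illustrative TypeScript, not the runtime's API: `hasSourceMetadata` and `buildGenerateImageArgs` are hypothetical helpers, and the argument object (in particular the `prompt` key) is an assumed shape. Only `resolution: "4k"`, the `visual_storyboard` label, and the two metadata fields come from this document. Requiring exactly one of the two fields is this sketch's reading of "Two cases, ONE field".

```typescript
// True when metadata declares exactly one provenance: a storyboard path or origin "idea".
function hasSourceMetadata(metadata: unknown): boolean {
  if (typeof metadata !== "object" || metadata === null) return false;
  const m = metadata as Record<string, unknown>;
  const linked = typeof m["sourceStoryboard"] === "string" && m["sourceStoryboard"] !== "";
  const fromIdea = m["storyboardOrigin"] === "idea";
  return linked !== fromIdea; // exactly one of the two, never both and never neither
}

// Hypothetical wrapper: refuses to assemble the call without provenance metadata.
function buildGenerateImageArgs(prompt: string, metadata: unknown) {
  if (!hasSourceMetadata(metadata)) {
    throw new Error("generate_image needs sourceStoryboard or storyboardOrigin: \"idea\"");
  }
  return {
    prompt,                    // assumed key name for the composed prompt
    resolution: "4k",          // mandatory for this skill
    label: "visual_storyboard",
    metadata,
  };
}
```

Failing before the call is made turns the runtime's after-the-fact warning into an error the agent has to fix at the call site.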

High-level flow

  1. Activate this skill. The runtime returns the absolute directory; remember it for the reads below.
  2. read_file references/STORYBOARD_ANATOMY.md. That's the spec — every step you need is there.
  3. read_file references/STYLE_PRESETS.md to grab the phrasing block for the chosen style (or follow the custom-style flow there for anything outside the 3 presets).
  4. read_file references/VIDEO_TYPE_<TYPE>.md IF the user named a video type. Skip otherwise.
  5. Follow STORYBOARD_ANATOMY's 6 steps verbatim: gather inputs → analyse references → break the story into beats → compose the prompt → call generate_image (with resolution: "4k" AND the metadata block above — both mandatory) → show_result + companion note.

That's it. The detail lives in STORYBOARD_ANATOMY.md. Don't re-derive it here.

Pairs with

  • visual-panels-to-video (downstream): once the user approves the sheet, that skill composes the cinematic per-clip video prompt, renders each clip, and assembles them on a timeline.

Install via CLI

    npx skills add https://github.com/koi-language/plugins --skill visual-storyboard