ai-media-gen


Generate raster images and video from prompts via the ai-cli (Vercel AI Gateway). Use when you need a real raster image or video — hero banners, conceptual illustrations, social cards, mockup imagery, or any pixel/motion asset that mermaid, visual-explainer HTML, excalidraw, or ascii-wireframe (all vector) cannot produce. Also use to generate or compare images across models from the terminal. Triggers on "generate an image", "make a hero image", "render a picture/photo", "create a video", "illustration for the doc".

leonardoacosta By leonardoacosta schedule Updated 6/3/2026

name: ai-media-gen
allowed-tools: Read, Glob, Grep, Bash


On its own, cc generates only vector visuals (mermaid, visual-explainer HTML, excalidraw MCP, ascii-wireframe). This skill closes the raster and motion gap by putting the ai CLI (package ai-cli, npm bin ai, v0.3.0) behind a cc-aware wrapper, scripts/bin/ai-media.

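Always call scripts/bin/ai-media, not raw ai. The wrapper injects a default output path, forces --json, and maps generation results to a three-state exit code (0, 1, or 2), with 3 reserved for a missing API key. Because the artifact always goes to a file, the non-TTY footgun of raw binary on stdout is removed by construction.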

When to Use

  • A raster hero banner or conceptual illustration for a visual-explainer / frontend-design page
  • A photo/picture/render an agent or user explicitly asks for
  • Video from a prompt or from a still image
  • Side-by-side comparison of one prompt across multiple models
  • A composable media pipeline (image → video, image edit via stdin)

Do NOT use for diagrams, charts, schemas, flowcharts, or tables — those are vector work owned by visual-explainer (HTML/mermaid), mermaid-diagrams, or ascii-wireframe.

Prerequisites

AI_GATEWAY_API_KEY must be set. The canonical location is ~/.env (your ~/.zshrc auto-exports it via set -a; source ~/.env; set +a). If it is unset the wrapper exits 3 with guidance. Verify gateway auth at any time with scripts/bin/ai-media models --type image.
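A calling script can mirror this preflight before it reaches the wrapper. The following is a minimal sketch, assuming the env file holds plain KEY=value lines; `load_env_file` and `require_gateway_key` are hypothetical helper names and do not ship with ai-media.

```shell
# Hypothetical helpers; neither is part of the ai-media wrapper.

# Source an env file with auto-export switched on, so every assignment
# in the file becomes an exported variable (the set -a / set +a idiom).
load_env_file() {
  [ -r "$1" ] || return 1
  set -a
  . "$1"
  set +a
}

# Return 3 (the wrapper's config exit code) when the key is missing or empty.
require_gateway_key() {
  if [ -z "${AI_GATEWAY_API_KEY:-}" ]; then
    echo "AI_GATEWAY_API_KEY is unset; add it to ~/.env" >&2
    return 3
  fi
}
```

A script could then run `load_env_file "$HOME/.env" && require_gateway_key || exit $?` before any generation call.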

Free tier vs paid (image/video). The CLI's default image model, openai/gpt-image-2, requires paid AI Gateway credits and errors on the free tier. One image model is verified as free-tier-accessible: bfl/flux-2-flex (pass -m bfl/flux-2-flex; roughly 9-11 s). Text generation via ai text runs on the free tier (openai/gpt-5.5). For image or video on the free tier, always pass -m with an accessible model, or set AI_CLI_IMAGE_MODEL / AI_CLI_VIDEO_MODEL. Check per-model access with scripts/bin/ai-media models --type image.
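The free-tier rule can be folded into a small helper so callers never fall through to the paid default. This is a sketch only: `free_tier_image` and the `AI_MEDIA` override are hypothetical names introduced here, and the fallback is the free-tier model named above.

```shell
# AI_MEDIA is a hypothetical override so the wrapper path can be swapped
# for a stub in tests; the wrapper itself does not read it.
AI_MEDIA="${AI_MEDIA:-scripts/bin/ai-media}"

# Generate an image with AI_CLI_IMAGE_MODEL if it is set, otherwise with
# the free-tier model. Extra arguments (for example -o) pass through.
free_tier_image() {
  fti_prompt=$1
  shift
  "$AI_MEDIA" image "$fti_prompt" -m "${AI_CLI_IMAGE_MODEL:-bfl/flux-2-flex}" "$@"
}
```

For example, `free_tier_image "a sunset over matte-black mountains" -o /tmp/sunset.png` would pass `-m bfl/flux-2-flex` unless AI_CLI_IMAGE_MODEL says otherwise.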

Commands

scripts/bin/ai-media image "a sunset over matte-black mountains"   # generate an image
scripts/bin/ai-media video "a slow drone shot of a canyon"         # generate a video
scripts/bin/ai-media text  "summarize this" < notes.txt            # generate text
scripts/bin/ai-media models --type image                           # list image models

Key Flags (passed through to ai)

-m, --model <id>     Model id (creator/name or short), comma-separated for multi-model
-o, --output <path>  Output file or dir. Wrapper defaults to ${AI_CLI_OUTPUT_DIR:-docs/diagrams/assets}/ if omitted.
-n, --count <n>      Generations per model
-i, --image <path>   Reference image (image cmd) / vision input (text cmd), repeatable
--size <WxH>         Image size
--aspect-ratio <W:H> Aspect ratio
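Because -m accepts a comma-separated list, a side-by-side comparison comes down to joining model ids. A minimal sketch follows, where `compare_models` and the `AI_MEDIA` override are hypothetical names; only the -m and -o flags listed above are used.

```shell
# AI_MEDIA is a hypothetical override so the wrapper path can be swapped
# for a stub in tests; the wrapper itself does not read it.
AI_MEDIA="${AI_MEDIA:-scripts/bin/ai-media}"

# compare_models <prompt> <out_dir> <model>...
# Runs one prompt across several models in a single wrapper call.
compare_models() {
  cm_prompt=$1
  cm_out=$2
  shift 2
  # Join the remaining arguments with commas: a b -> a,b
  cm_models=$(IFS=,; printf '%s' "$*")
  "$AI_MEDIA" image "$cm_prompt" -m "$cm_models" -o "$cm_out"
}
```

A call such as `compare_models "a dragon" /tmp/compare openai/gpt-image-2 bfl/flux-2-flex` expands to one `image` invocation with `-m openai/gpt-image-2,bfl/flux-2-flex`.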

Output Behavior (important for agents)

The wrapper always writes the artifact to a file and prints the CLI's JSON result (--json) to stdout. Read the artifact path from results[].file; never expect raw binary on stdout. Without an explicit -o, output lands in ${AI_CLI_OUTPUT_DIR:-docs/diagrams/assets}/.

{ "elapsed_ms": 3420, "count": 1,
  "results": [ { "index": 1, "model": "openai/gpt-image-2", "success": true, "file": ".../output.png" } ] }
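Reading results[].file from that JSON can be sketched as a small filter. The helper name `result_files` is hypothetical. It uses jq when available and otherwise falls back to a grep/sed match that only suits the flat shape shown above. It also assumes failed results either omit file or set it to null, which the source does not confirm.

```shell
# Print one artifact path per line from the wrapper's JSON on stdin.
result_files() {
  if command -v jq >/dev/null 2>&1; then
    # Keep only results marked successful, then emit their file field.
    jq -r '.results[] | select(.success == true) | .file'
  else
    # Fallback: pull every quoted "file" value out of the raw text.
    grep -o '"file": *"[^"]*"' | sed 's/^"file": *"//; s/"$//'
  fi
}
```

Typical use would be `scripts/bin/ai-media image "a dragon" | result_files`.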

Exit Codes

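| Code | Class      | Meaning                                               |
|------|------------|-------------------------------------------------------|
| 0    | generation | all results succeeded                                 |
| 1    | generation | all failed, or ai errored / emitted non-JSON          |
| 2    | generation | partial: some succeeded, some failed                  |
| 3    | config     | AI_GATEWAY_API_KEY unset (preflight, before any call) |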

Branch on 2 to handle partial multi-model/multi-count runs distinctly from total failure.
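Branching on these codes might look like the sketch below. `describe_exit` is a hypothetical helper that turns the numeric code into a label; the labels themselves are illustrative, not wrapper output.

```shell
# Map the wrapper's exit code to a short label a caller can branch on.
describe_exit() {
  case "$1" in
    0) echo "ok" ;;       # every result succeeded
    1) echo "failed" ;;   # nothing usable was produced
    2) echo "partial" ;;  # keep the successes, retry or report the rest
    3) echo "config" ;;   # AI_GATEWAY_API_KEY missing; fix the environment
    *) echo "unknown" ;;
  esac
}

# Usage sketch (not executed here):
#   scripts/bin/ai-media image "a dragon" -n 2 > result.json
#   status=$(describe_exit $?)
#   [ "$status" = "partial" ] && echo "some generations failed" >&2
```

Keeping `partial` distinct from `failed` lets a pipeline continue with the files that were produced instead of discarding the whole run.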

Piping Patterns

# Summarize piped content (text)
git diff | scripts/bin/ai-media text "write a commit message"

# Image → video pipeline
scripts/bin/ai-media image "a dragon" -o /tmp/d.png && \
  scripts/bin/ai-media video "animate this" -i /tmp/d.png

Embedding into a visual-explainer / frontend-design page

Generate, base64-encode, inline for self-containment:

scripts/bin/ai-media image "isometric matte-navy server rack, single cyan accent" \
  -o /tmp/hero.png --aspect-ratio 16:9
IMG=$(base64 -w 0 /tmp/hero.png)          # Linux; macOS: base64 -i /tmp/hero.png
# <img src="data:image/png;base64,${IMG}" alt="...">
rm /tmp/hero.png
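The encode-and-inline step above can be made portable across Linux and macOS by reading the file on stdin and stripping newlines, instead of relying on platform-specific base64 flags. A minimal sketch follows; `b64_oneline` and `img_tag` are hypothetical helper names.

```shell
# Base64-encode a file onto a single line. Feeding the file on stdin and
# deleting newlines sidesteps the -w 0 (GNU) versus -i (macOS) difference.
b64_oneline() {
  base64 < "$1" | tr -d '\n'
}

# Print an <img> tag with the PNG inlined as a data URI.
# The alt text is inserted as-is, so it should not contain double quotes.
img_tag() {
  printf '<img src="data:image/png;base64,%s" alt="%s">\n' "$(b64_oneline "$1")" "$2"
}
```

For instance, `img_tag /tmp/hero.png "Isometric server rack" >> page.html` appends a self-contained image tag to a hypothetical page.html.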

Match the prompt to the page palette and aesthetic direction (style + dominant colors). Specific prompts beat vague ones — "isometric illustration of a message queue, cyan nodes on dark navy" beats "a diagram of a queue".

Cross-References

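| Need                                    | Skill                             |
|-----------------------------------------|-----------------------------------|
| The page the image goes into            | visual-explainer, frontend-design |
| Vector diagrams (do NOT use this skill) | mermaid-diagrams, ascii-wireframe |
| Brand logos (not generated)             | svgl                              |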

Source / adoption verdict: docs/recon/vercel-labs-ai-cli.{md,html}.

Install via CLI
npx skills add https://github.com/leonardoacosta/central-claude --skill ai-media-gen