ai-media-gen

name: ai-media-gen
description: Generate raster images and video from prompts via the ai-cli (Vercel AI Gateway). Use when you need a real raster image or video, such as hero banners, conceptual illustrations, social cards, mockup imagery, or any pixel/motion asset that mermaid, visual-explainer HTML, excalidraw, or ascii-wireframe (all vector) cannot produce. Also use to generate or compare images across models from the terminal. Triggers on "generate an image", "make a hero image", "render a picture/photo", "create a video", "illustration for the doc".
allowed-tools: Read, Glob, Grep, Bash

On its own, cc generates only vector visuals (mermaid, visual-explainer HTML, excalidraw MCP, ascii-wireframe). This skill closes the raster + motion gap by putting the ai CLI (package ai-cli, npm bin ai, v0.3.0) behind a cc-aware wrapper: scripts/bin/ai-media.

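Always call scripts/bin/ai-media, not raw ai. The wrapper injects a default output path, forces --json, and maps generation results to a three-state exit code (0, 1, or 2; a separate code 3 signals missing configuration). Because the artifact always goes to a file, the non-TTY stdout-binary footgun is removed by construction.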

When to Use

A raster hero banner or conceptual illustration for a visual-explainer / frontend-design page
A photo/picture/render an agent or user explicitly asks for
Video from a prompt or from a still image
Side-by-side comparison of one prompt across multiple models
A composable media pipeline (image → video, image edit via stdin)

Do NOT use for diagrams, charts, schemas, flowcharts, or tables — those are vector work owned by visual-explainer (HTML/mermaid), mermaid-diagrams, or ascii-wireframe.

Prerequisites

AI_GATEWAY_API_KEY must be set. The canonical location is ~/.env (your ~/.zshrc auto-exports it via set -a; source ~/.env; set +a). If it is unset the wrapper exits 3 with guidance. Verify gateway auth at any time with scripts/bin/ai-media models --type image.
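
A calling script can run the same kind of check before doing any other work. The sketch below is a caller-side illustration only: the function name `require_gateway_key` and the message text are hypothetical, and the wrapper's own implementation is not shown in this document. It reuses the documented exit code 3 for a missing key.

```shell
# Hypothetical caller-side preflight (not the wrapper's actual code).
# Returns 3, matching the wrapper's documented config exit code, when the key is unset or empty.
require_gateway_key() {
  if [ -z "${AI_GATEWAY_API_KEY:-}" ]; then
    echo "AI_GATEWAY_API_KEY is unset; add it to ~/.env and reload your shell" >&2
    return 3
  fi
  return 0
}
```

A script would typically call `require_gateway_key || exit $?` near the top so that configuration problems surface before any generation call.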

Free tier vs paid (image/video). The CLI's default image model, openai/gpt-image-2, requires paid AI Gateway credits and errors on the free tier. Verified free-tier-accessible image model: bfl/flux-2-flex (-m bfl/flux-2-flex, about 9-11 s). ai text works on the free tier (openai/gpt-5.5). For image/video on the free tier, always pass -m with an accessible model or set AI_CLI_IMAGE_MODEL / AI_CLI_VIDEO_MODEL. Check per-model access with scripts/bin/ai-media models --type image.
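
One way to make the free-tier model the default for a shell session is sketched below. The helper name `default_image_model` is hypothetical; the only facts taken from this document are the AI_CLI_IMAGE_MODEL variable and the bfl/flux-2-flex model id.

```shell
# Hypothetical helper: print the image model to use.
# Keeps an already-set AI_CLI_IMAGE_MODEL, otherwise falls back to the free-tier model named above.
default_image_model() {
  printf '%s\n' "${AI_CLI_IMAGE_MODEL:-bfl/flux-2-flex}"
}

# Export the resolved value so later scripts/bin/ai-media calls in this session pick it up.
AI_CLI_IMAGE_MODEL="$(default_image_model)"
export AI_CLI_IMAGE_MODEL
```

An explicit -m on the command line remains the more visible choice for one-off calls; the environment variable is mainly useful for longer sessions.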

Commands

scripts/bin/ai-media image "a sunset over matte-black mountains"   # generate an image
scripts/bin/ai-media video "a slow drone shot of a canyon"         # generate a video
scripts/bin/ai-media text  "summarize this" < notes.txt            # generate text
scripts/bin/ai-media models --type image                           # list image models

Key Flags (passed through to `ai`)

-m, --model <id>     Model id (creator/name or short), comma-separated for multi-model
-o, --output <path>  Output file or dir. Wrapper defaults to docs/diagrams/assets/ if omitted.
-n, --count <n>      Generations per model
-i, --image <path>   Reference image (image cmd) / vision input (text cmd), repeatable
--size <WxH>         Image size
--aspect-ratio <W:H> Aspect ratio
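
As an illustration of the comma-separated -m form, the sketch below builds a model list and passes it through for a side-by-side comparison. Both function names are hypothetical, and the choice of -n 2 and the output directory are assumptions made for the example, not recommendations from the CLI.

```shell
# Hypothetical helper: join model ids with commas, the form -m expects for multi-model runs.
join_models() {
  ( IFS=,; printf '%s\n' "$*" )
}

# Hypothetical helper: run one prompt across several models, two generations each.
# Defined only; nothing here calls the wrapper until compare_models is invoked.
compare_models() {
  prompt=$1
  shift
  scripts/bin/ai-media image "$prompt" \
    -m "$(join_models "$@")" -n 2 -o docs/diagrams/assets/
}
```

For example, `compare_models "a sunset over matte-black mountains" openai/gpt-image-2 bfl/flux-2-flex` would request both models in one call; on the free tier the first is expected to fail, which is the kind of partial run exit code 2 reports.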

Output Behavior (important for agents)

The wrapper always writes the artifact to a file and prints the --json result to stdout. Read the artifact path from results[].file; never expect raw binary on stdout. Without an explicit -o, output lands in ${AI_CLI_OUTPUT_DIR:-docs/diagrams/assets}/.

{ "elapsed_ms": 3420, "count": 1,
  "results": [ { "index": 1, "model": "openai/gpt-image-2", "success": true, "file": ".../output.png" } ] }

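To pull the artifact paths out of that JSON in a script, something like the following may be enough. The helper name `result_files` is hypothetical, and the pattern matching is a deliberately dependency-free sketch: it assumes paths contain no escaped double quotes and does not filter on the success field.

```shell
# Hypothetical helper: print every results[].file value from wrapper JSON read on stdin.
# Pattern-based rather than a real JSON parse, so it assumes no escaped quotes inside paths.
result_files() {
  grep -o '"file"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/^"file"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Usage sketch (not executed here):
#   scripts/bin/ai-media image "a dragon" | result_files
```

If jq is installed, `jq -r '.results[].file'` is a sturdier alternative because it parses the JSON properly.
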
Exit Codes

Code	Class	Meaning
`0`	generation	all results succeeded
`1`	generation	all failed, or `ai` errored / emitted non-JSON
`2`	generation	partial — some succeeded, some failed
`3`	config	`AI_GATEWAY_API_KEY` unset (preflight, before any call)

Branch on 2 to handle partial multi-model/multi-count runs distinctly from total failure.
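
The branching described above could look roughly like this in a calling script. The function name `classify_media_exit` and the labels it prints are hypothetical; only the numeric codes come from the table.

```shell
# Hypothetical helper: map the wrapper's exit code to a short label a caller can act on.
classify_media_exit() {
  case "$1" in
    0) echo "ok" ;;        # all results succeeded
    2) echo "partial" ;;   # some succeeded: keep the good files, retry or report the rest
    3) echo "config" ;;    # AI_GATEWAY_API_KEY unset: fix the environment, do not retry
    *) echo "failed" ;;    # 1 (all failed / non-JSON) or anything unexpected
  esac
}

# Usage sketch (not executed here):
#   scripts/bin/ai-media image "a dragon" -n 2
#   status=$(classify_media_exit "$?")
```

Treating unknown codes as failures keeps the caller conservative if the wrapper ever exits with a value outside the table.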

Piping Patterns

# Generate text from piped content
git diff | scripts/bin/ai-media text "write a commit message"

# Image → video pipeline
scripts/bin/ai-media image "a dragon" -o /tmp/d.png && \
  scripts/bin/ai-media video "animate this" -i /tmp/d.png

Embedding into a visual-explainer / frontend-design page

Generate, base64-encode, inline for self-containment:

scripts/bin/ai-media image "isometric matte-navy server rack, single cyan accent" \
  -o /tmp/hero.png --aspect-ratio 16:9
IMG=$(base64 -w 0 /tmp/hero.png)          # Linux; macOS: base64 -i /tmp/hero.png
# <img src="data:image/png;base64,${IMG}" alt="...">
rm /tmp/hero.png
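
Since the base64 flags differ between Linux and macOS, a small helper that avoids them can keep the same snippet working on both. This is a sketch with hypothetical function names (`b64_oneline`, `img_tag`); it reads the file on stdin and strips newlines instead of relying on -w or -i, and it assumes a PNG input.

```shell
# Hypothetical helper: base64-encode a file as a single line.
# Reads via stdin and removes line wraps, so it needs neither -w (GNU) nor -i (macOS).
b64_oneline() {
  base64 < "$1" | tr -d '\n'
}

# Hypothetical helper: print an inline <img> tag for a PNG file ($1) with alt text ($2).
img_tag() {
  printf '<img src="data:image/png;base64,%s" alt="%s">\n' "$(b64_oneline "$1")" "$2"
}
```

For instance, `img_tag /tmp/hero.png "Isometric server rack"` prints a tag that can be pasted into the page; the alt text is passed through unescaped, so keep it free of double quotes.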

Match the prompt to the page palette and aesthetic direction (style + dominant colors). Specific prompts beat vague ones — "isometric illustration of a message queue, cyan nodes on dark navy" beats "a diagram of a queue".

Cross-References

Need	Skill
The page the image goes into	`visual-explainer`, `frontend-design`
Vector diagrams (do NOT use this skill)	`mermaid-diagrams`, `ascii-wireframe`
Brand logos (not generated)	`svgl`

Source / adoption verdict: docs/recon/vercel-labs-ai-cli.{md,html}.
