generate-video - SKILL.md Agent Skill

name: generate-video
description: >-
  This skill should be used when the user asks to "generate a video", "create a video", "make a video", "animate this", "text to video", "generate with Veo", "create a clip", "make a short film", "extend the video", "continue the video", "interpolate between", "image to video", "animate this image", "video from image", or needs AI video generation, video extension, or frame interpolation using the Gemini Veo API.
version: 0.1.0

Generate Video

Wrap the Gemini Veo video generation REST API to produce, extend, and interpolate videos via a Python script (stdlib only, no pip dependencies). Support text-to-video, image-to-video, frame interpolation, reference images, video extension, and native audio synthesis. All output is saved to ./generated-videos/ and auto-opened on macOS.

Prerequisites

Before any generation, verify the environment:

Confirm $GEMINI_API_KEY is set. If it is missing, instruct the user to run: export GEMINI_API_KEY='your-key-here'
Ensure python3 is available (Python 3.7+). The script uses only stdlib modules — no pip install needed.
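
The two checks above can be sketched in Python as follows. This is a minimal illustration, not part of the skill's script; the function name check_environment and its parameters are assumptions made for the example.

```python
import os
import sys


def check_environment(env=None, version=None):
    """Return a list of problems; an empty list means the environment is ready.

    Hypothetical helper: `env` and `version` can be injected for testing and
    default to the real process environment and interpreter version.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)[:2]
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append(
            "GEMINI_API_KEY is not set. Run: export GEMINI_API_KEY='your-key-here'"
        )
    if version < (3, 7):
        problems.append("Python 3.7+ is required, found %d.%d" % version)
    return problems
```

Calling check_environment() with no arguments inspects the current process; any returned message can be relayed to the user before attempting a generation.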

Text-to-Video Generation

When the user requests a new video from a text description:

1. Confirm the Prompt

Restate the user's request as a clear generation prompt. If the request is vague, ask for clarification before proceeding. Video prompts work best when they describe:

Camera movement (dolly, pan, tracking, aerial, handheld)
Subject action (what happens over time)
Mood/lighting (golden hour, neon, foggy)
Audio (dialogue, music, sound effects — Veo 3.1 generates native audio)

2. Choose Settings

Use API defaults unless the user explicitly requests specific settings. Only pass flags when the user asks for them.

If the user asks about available options:

Models: veo-3.1-fast-generate-preview (default, fast), veo-3.1-generate-preview (highest quality), veo-3-fast-generate-preview, veo-3-generate-preview
Aspect ratios: 16:9 (landscape, default), 9:16 (portrait/vertical)
Resolutions: 720p (default, fastest), 1080p, 4k (1080p and 4k require 8s duration)
Durations: 4, 6, or 8 seconds (default: 8)
Audio: Native audio is generated by default on Veo 3+ models; use --no-audio to disable
Multiple outputs: Generate 1-4 video variations with --sample-count
Determinism: Use --seed for more reproducible results
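
The constraints listed above can be checked before invoking the script. The sketch below encodes only the values stated in this document; the function name validate_settings is hypothetical, and the script may perform its own validation differently.

```python
VALID_ASPECT_RATIOS = {"16:9", "9:16"}
VALID_RESOLUTIONS = {"720p", "1080p", "4k"}
VALID_DURATIONS = {4, 6, 8}


def validate_settings(aspect_ratio="16:9", resolution="720p", duration=8, sample_count=1):
    """Return a list of constraint violations (empty when the settings are valid)."""
    errors = []
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        errors.append("aspect ratio must be 16:9 or 9:16")
    if resolution not in VALID_RESOLUTIONS:
        errors.append("resolution must be 720p, 1080p, or 4k")
    if duration not in VALID_DURATIONS:
        errors.append("duration must be 4, 6, or 8 seconds")
    # 1080p and 4k are only available for 8 second clips.
    if resolution in {"1080p", "4k"} and duration != 8:
        errors.append("1080p and 4k require an 8 second duration")
    if not 1 <= sample_count <= 4:
        errors.append("sample count must be between 1 and 4")
    return errors
```

For example, validate_settings(resolution="1080p", duration=4) reports the duration conflict, which is worth raising with the user before spending time on a request that is likely to be rejected.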

3. Compose a Negative Prompt (Optional)

If the user mentions things to avoid, or if the request implies a quality bar (for example, photorealism), use --negative-prompt. Common exclusions:

"blurry, shaky, low quality" (general quality)
"text overlays, watermark" (clean output)
"cartoon, animation" (when photorealism is desired)

Only add a negative prompt when the user requests it or when it clearly improves the result.

4. Invoke the Script

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "the final prompt" \
  --output-dir "./generated-videos"

Add optional flags only when the user explicitly requests them:

--aspect-ratio "9:16" — for portrait/vertical video
--resolution "1080p" — for higher resolution (requires 8s duration)
--duration "4" — for shorter clips
--negative-prompt "blurry, low quality" — to exclude unwanted elements
--person-generation "allow_all" — when people are needed in the video
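
One way to honor the "only when requested" rule is to build the argument list from just the options the user named. The sketch below is an assumption-laden illustration: build_generate_command is a hypothetical helper, it handles only flags that take a value, and plugin_root stands in for the expanded ${CLAUDE_PLUGIN_ROOT}.

```python
SCRIPT_RELATIVE_PATH = "skills/generate-video/scripts/generate_video.py"


def build_generate_command(prompt, plugin_root, output_dir="./generated-videos", **requested):
    """Build an argv list for the generate subcommand.

    Keyword names map to flags (aspect_ratio -> --aspect-ratio). Options left
    as None are omitted so that the API defaults apply.
    """
    script = plugin_root.rstrip("/") + "/" + SCRIPT_RELATIVE_PATH
    command = ["python3", script, "generate", "--prompt", prompt]
    for name, value in requested.items():
        if value is None:
            continue  # not requested by the user, so do not pass the flag
        command += ["--" + name.replace("_", "-"), str(value)]
    command += ["--output-dir", output_dir]
    return command
```

The resulting list can be passed to subprocess.run without shell quoting concerns, which matters when the prompt contains quotes or other shell metacharacters.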

Important: Video generation is asynchronous. The script polls automatically, and a generation typically takes between 11 seconds and 6 minutes. Warn the user that this takes longer than image generation.

5. Report the Result

The script outputs JSON to stdout. Parse it and report:

The saved video path
The generation time
That the video has been opened for preview (auto-open applies on macOS only)
Then ask whether the user wants to extend the video, generate a new one, or adjust settings.
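
Parsing the script's stdout might look like the sketch below. The JSON field names are not specified in this document, so video_path and elapsed_seconds are purely hypothetical placeholders; check the script's actual output and pass the real key names.

```python
import json


def summarize_result(stdout_text, path_key="video_path", time_key="elapsed_seconds"):
    """Turn the script's JSON output into a short report.

    Assumption: the output is a JSON object. The default key names are
    hypothetical and must be replaced with the script's real field names.
    """
    result = json.loads(stdout_text)
    lines = []
    if path_key in result:
        lines.append("Saved video: %s" % result[path_key])
    if time_key in result:
        lines.append("Generation time: %s s" % result[time_key])
    if not lines:
        # Fall back to listing what was returned rather than guessing.
        lines.append("No recognised fields in script output: %s" % sorted(result))
    return "\n".join(lines)
```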

Image-to-Video (Animation)

When the user provides an image to animate:

Validate the image file exists (test -f).
The prompt should describe the desired motion, not the image content.

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "The camera slowly pulls back as the leaves begin to sway" \
  --image "/path/to/image.png" \
  --output-dir "./generated-videos"

Frame Interpolation

When the user provides two images and wants a transition between them:

Validate both image files exist.
The prompt should describe the desired transition.

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "Smooth cinematic transition between the two scenes" \
  --image "/path/to/first-frame.png" \
  --last-frame "/path/to/last-frame.png" \
  --output-dir "./generated-videos"

Reference Images

When the user provides reference images for style or content guidance:

Validate each file exists (up to 3 reference images).
The prompt should explicitly mention the referenced objects.

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "A woman walks through a garden wearing the red dress" \
  --reference-image "/path/to/dress.png" \
  --reference-image "/path/to/garden.png" \
  --output-dir "./generated-videos"
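
The validation step for reference images (each file exists, at most 3) can be sketched as below. The helper name check_reference_images is an assumption for illustration; os.path.isfile plays the role of test -f.

```python
import os

MAX_REFERENCE_IMAGES = 3


def check_reference_images(paths):
    """Return a list of problems with the given reference image paths."""
    errors = []
    if len(paths) > MAX_REFERENCE_IMAGES:
        errors.append(
            "at most %d reference images are supported, got %d"
            % (MAX_REFERENCE_IMAGES, len(paths))
        )
    for path in paths:
        # Equivalent to `test -f`: the path must exist and be a regular file.
        if not os.path.isfile(path):
            errors.append("file not found: %s" % path)
    return errors
```

The same existence check applies to --image and --last-frame paths in the image-to-video and interpolation flows.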

Video Extension

When the user wants to extend a previously generated video:

Validate the video file exists (must be an MP4).
The prompt describes what happens next in the video.
Extensions are locked to 720p and add ~7 seconds.
Up to 20 extensions per video chain.

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" extend \
  --prompt "The camera continues to pan right, revealing a waterfall" \
  --video "./generated-videos/previous-video.mp4" \
  --output-dir "./generated-videos"
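
The extension limits translate into a simple length estimate. The sketch below uses the approximate 7 seconds per extension and the 20-extension cap stated above; the function name is hypothetical and the result is an estimate, since each extension adds roughly, not exactly, 7 seconds.

```python
MAX_EXTENSIONS = 20
SECONDS_PER_EXTENSION = 7  # approximate, per the note above


def estimated_chain_seconds(initial_seconds, extensions):
    """Estimate total video length after a number of extensions."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError("extensions must be between 0 and %d" % MAX_EXTENSIONS)
    return initial_seconds + SECONDS_PER_EXTENSION * extensions
```

An 8 second clip extended the maximum 20 times therefore comes to roughly 8 + 20 × 7 = 148 seconds.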

Background Submission

For long-running generations, submit without waiting:

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "the prompt" \
  --no-wait \
  --output-dir "./generated-videos"

This returns an operation name. Check later with:

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" poll \
  --operation "models/veo-3.1-fast-generate-preview/operations/..." \
  --wait \
  --output-dir "./generated-videos"
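
To illustrate what --poll-interval and --timeout control, here is a generic polling loop. It is a sketch of the general technique, not the script's actual implementation; wait_for and its check callback are hypothetical names, and the defaults mirror the documented values of 10 and 600 seconds.

```python
import time


def wait_for(check, poll_interval=10, timeout=600, sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns a non-None result or the timeout elapses.

    `check` is a hypothetical callback that queries the operation status and
    returns None while it is still running.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        # Give up if the next poll would land past the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError("operation still running after %s seconds" % timeout)
        sleep(poll_interval)
```

Injecting sleep and clock keeps the loop testable without real delays; on timeout, the operation name should still be reported so it can be checked later with the poll command.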

Error Handling

Exit Code	Meaning	Action
0	Success	Report video path
10	Missing `$GEMINI_API_KEY` or dependency	Tell user what to set/install
11	Invalid input (bad path, unsupported format, constraint violation)	Report the specific validation error
20	HTTP 400 — content policy or bad request	Show API error message, suggest rephrasing
21	HTTP 401/403 — auth failure	"API key is invalid or expired"
22	HTTP 429 — rate limited	Wait 10 seconds, retry once automatically. If still failing, tell user to wait.
23	HTTP 500+ — server error	Retry once automatically. If still failing, report.
24	Poll timeout	Report operation name so user can check later with the poll command
30	No video in response	"Model didn't return a video — try rephrasing the prompt"

On exit codes 22 and 23, rerun the same command once (after waiting 10 seconds for code 22) before reporting failure.
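
The retry rule for exit codes 22 and 23 can be sketched as follows. The helper run_with_single_retry and its run callback are assumptions for illustration; in practice run would wrap something like subprocess.run(command).returncode.

```python
import time

# Exit code -> seconds to wait before the single retry (from the table above).
RETRY_DELAYS = {22: 10, 23: 0}


def run_with_single_retry(run, sleep=time.sleep):
    """Run the command once, retrying exactly once on a retryable exit code.

    `run` is a hypothetical zero-argument callable that executes the script
    and returns its exit code.
    """
    code = run()
    if code in RETRY_DELAYS:
        delay = RETRY_DELAYS[code]
        if delay:
            sleep(delay)
        code = run()  # second and final attempt
    return code
```

Any other non-zero code is returned immediately, so the caller can map it to the action listed in the table.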

Script Reference

`generate_video.py generate`

Core API caller. Flags:

--prompt (required) — generation prompt
--model — model ID (default: veo-3.1-fast-generate-preview)
--aspect-ratio — 16:9 or 9:16 (optional; API default 16:9)
--resolution — 720p, 1080p, 4k (optional; API default 720p)
--duration — 4, 6, 8 seconds (optional; API default 8)
--negative-prompt — content to exclude
--person-generation — allow_all, allow_adult, or dont_allow
--generate-audio / --no-audio — enable/disable native audio synthesis (Veo 3+)
--seed N — seed for deterministic generation (uint32)
--sample-count N — number of videos to generate (1-4)
--resize-mode — pad or crop (image-to-video only)
--compression-quality — optimized or lossless
--image PATH — first frame for image-to-video
--last-frame PATH — last frame for interpolation (requires --image)
--reference-image PATH — reference image (repeatable, max 3)
--no-wait — submit and return immediately
--poll-interval N — seconds between polls (default: 10)
--timeout N — max seconds to wait (default: 600)
--output-dir DIR — output directory (default: ./generated-videos)

`generate_video.py extend`

Video extension. Flags:

--prompt (required) — continuation prompt
--video PATH (required) — MP4 video to extend
--model — model ID (default: veo-3.1-fast-generate-preview)
--no-wait — submit and return immediately
--poll-interval N — seconds between polls (default: 10)
--timeout N — max seconds to wait (default: 600)
--output-dir DIR — output directory (default: ./generated-videos)

`generate_video.py poll`

Check or wait on a running operation. Flags:

--operation NAME (required) — operation name from a previous submission
--wait — wait for completion and download the video
--poll-interval N — seconds between polls (default: 10)
--timeout N — max seconds to wait (default: 600)
--output-dir DIR — output directory (default: ./generated-videos)

Additional Resources

references/api-reference.md — Full Veo REST API schema: endpoints, request/response formats, all parameters, error codes, constraints.
references/advanced-features.md — Prompt engineering, resolution guide, duration selection, reference images, person generation, extension chains, async workflow, SynthID watermark.