name: vidu-skills description: Generate video and images by calling the official Vidu API via vidu CLI. Use when the user wants text-to-image, text-to-video, image-to-video, head-tail-image-to-video, reference-to-image, reference-to-video, lip-sync, text-to-speech, video-compose, Create References, or to submit or check Vidu tasks. Requires VIDU_TOKEN and optional VIDU_BASE_URL. version: 1.4.15 homepage: https://www.vidu.cn/ primaryEnv: VIDU_TOKEN metadata: {"openclaw":{"requires":{"bins":["node","npm","vidu-cli"],"env":["VIDU_TOKEN"]},"primaryEnv":"VIDU_TOKEN","install":[{"id":"vidu-cli","kind":"node","package":"vidu-cli","bins":["vidu-cli"],"label":"Install vidu-cli via npm (requires Node.js >=14; postinstall downloads a platform binary from GitHub)"}]}}
Vidu Video and Image Generation Skill
Generate AI videos and images with Vidu via vidu-cli: text-to-image, text-to-video, image-to-video, start/end-frame video, reference-based generation, reusable elements, TTS, lip-sync, and compose.
Execution Model
All execution uses the vidu-cli CLI. Use CLI flags, not raw JSON request bodies. Use argv arrays; do not wrap commands in shell strings.
Vidu creator tasks are async: submit returns task_id; poll with vidu-cli task get <task_id> and download with vidu-cli task get <task_id> -o <dir>.
Environment:
VIDU_TOKEN(required): Vidu API token.VIDU_BASE_URL(optional): defaulthttps://service.vidu.cn; usehttps://service.vidu.comfor overseas.VIDU_DEBUG=1(optional): print full response body to stderr for debugging.
For installation, token setup, and first-run checks, read references/setup.md only when the user is setting up the skill or the CLI is missing.
Decision Table
| User intent | Use this command shape | Notes |
|---|---|---|
| Create reusable subject/character from existing image | vidu-cli element create --name ... --image ... [--description ...] [--style ...] |
Returns id, version; no duration. |
| Generate reference image from prompt/images | vidu-cli task submit --type reference2image --prompt ... --image ... --duration 0 --model-version ... --resolution ... |
Use when no reusable element exists yet. |
| Generate character/reference video | vidu-cli task submit --type character2video --prompt ... (--image ...|--material name:id:version) --duration ... --model-version ... --resolution 1080p |
Image/material total must be 1-7. |
| Generate narration/audio | vidu-cli task tts --prompt ... --voice-id ... [--speed ...] |
Use --voice-id, never --voice. |
| Compose final video | vidu-cli task compose --timeline <json> ... |
Read references/compose.md first. |
| Query/download task | vidu-cli task get <task_id> [-o <dir>] |
--output only belongs on task get. |
Do Not Invent
Never use these:
vidu-cli reference ...vidu-cli reference create-image-referencevidu-cli task submit --type ttsvidu-cli task tts --voice ...vidu-cli task submit --output ...vidu-cli task downloadvidu-cli task submit --material "a:id:ver,b:id:ver"sh -c "vidu-cli ..."
Minimal Rules
- Every async creator command (
task submit,task tts,task compose,task lip-sync) must go throughprovider_batch, even for one item. - Use direct argv arrays; put every flag and value in separate argv entries.
- Repeat
--image,--material,--audio, and--videoonce per item. Do not comma-join. reference2imageandcharacter2videorequire non-empty prompt plus 1-7 total references (--image+--material).- Image tasks use
--duration 0; reusableelement createhas noduration. - For
character2videowith3.2_a, duration must be an explicit4-15; for other models, checkreferences/parameters.md. - Download successful media with
vidu-cli task get <task_id> --output <dir>; whensubtitle_uriis present, this also downloads subtitle JSON. - Report CLI/API errors from JSON fields exactly; do not infer hidden causes.
Common Shapes
vidu-cli element create --name "角色A" --image /path/a.png --description "..." --style "..."
vidu-cli task submit --type reference2image --prompt "..." --image /path/a.png --duration 0 --model-version 3.2_image_2 --resolution 1080p
vidu-cli task submit --type character2video --prompt "[@角色A] walks into office" --material "角色A:ID:VERSION" --duration 5 --model-version 3.2_a --resolution 1080p
vidu-cli task tts --prompt "旁白文本" --voice-id "Chinese (Mandarin)_Gentleman" --speed 1.3
Load References Only When Needed
references/parameters.md: exact model matrix, durations, flags, examples, validation details.references/compose.md: required before building compose timeline JSON.references/errors_and_retry.md: lifecycle states, retry behavior, polling edge cases.references/setup.md: installation, environment variables, first-run checks, and external-user setup notes.
If local CLI help conflicts with this file, trust vidu-cli <subcommand> --help and update this skill afterward.