vidu-skills


Generate video and images by calling the official Vidu API via vidu CLI. Use when the user wants text-to-image, text-to-video, image-to-video, head-tail-image-to-video, reference-to-image, reference-to-video, lip-sync, text-to-speech, video-compose, Create References, or to submit or check Vidu tasks. Requires VIDU_TOKEN and optional VIDU_BASE_URL.

By shengshu-ai. Updated 6/5/2026

name: vidu-skills
description: Generate video and images by calling the official Vidu API via vidu CLI. Use when the user wants text-to-image, text-to-video, image-to-video, head-tail-image-to-video, reference-to-image, reference-to-video, lip-sync, text-to-speech, video-compose, Create References, or to submit or check Vidu tasks. Requires VIDU_TOKEN and optional VIDU_BASE_URL.
version: 1.4.15
homepage: https://www.vidu.cn/
primaryEnv: VIDU_TOKEN
metadata: {"openclaw":{"requires":{"bins":["node","npm","vidu-cli"],"env":["VIDU_TOKEN"]},"primaryEnv":"VIDU_TOKEN","install":[{"id":"vidu-cli","kind":"node","package":"vidu-cli","bins":["vidu-cli"],"label":"Install vidu-cli via npm (requires Node.js >=14; postinstall downloads a platform binary from GitHub)"}]}}

Vidu Video and Image Generation Skill

Generate AI videos and images with Vidu via vidu-cli: text-to-image, text-to-video, image-to-video, start/end-frame video, reference-based generation, reusable elements, TTS, lip-sync, and compose.

Execution Model

All execution uses the vidu-cli CLI. Use CLI flags, not raw JSON request bodies. Use argv arrays; do not wrap commands in shell strings.

Vidu creator tasks are async: submit returns task_id; poll with vidu-cli task get <task_id> and download with vidu-cli task get <task_id> -o <dir>.
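A minimal Python sketch of the polling and download step as argv arrays. The helper names task_get_argv and run_argv are hypothetical and not part of vidu-cli; only the command shape vidu-cli task get <task_id> [-o <dir>] comes from this file, and "TASK_ID" stands in for a real task_id returned by a submit command.

```python
import subprocess
from typing import List, Optional


def task_get_argv(task_id: str, output_dir: Optional[str] = None) -> List[str]:
    """Build the argv for `vidu-cli task get`, one entry per flag or value."""
    argv = ["vidu-cli", "task", "get", task_id]
    if output_dir is not None:
        # -o <dir> asks task get to download the finished media into <dir>.
        argv += ["-o", output_dir]
    return argv


def run_argv(argv: List[str]) -> subprocess.CompletedProcess:
    """Run the command without a shell; the list is handed to the OS as-is."""
    return subprocess.run(argv, capture_output=True, text=True, check=False)


# Poll shape, then download shape. Nothing is executed here.
print(task_get_argv("TASK_ID"))
print(task_get_argv("TASK_ID", "./out"))
```

Passing a list to subprocess.run keeps shell=False, so prompts with spaces or quotes reach the CLI unchanged and no shell string is ever built.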

Environment:

  • VIDU_TOKEN (required): Vidu API token.
  • VIDU_BASE_URL (optional): default https://service.vidu.cn; use https://service.vidu.com for overseas.
  • VIDU_DEBUG=1 (optional): print full response body to stderr for debugging.
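
The environment rules above can be checked before any command runs. The following Python sketch is illustrative only: vidu_env and its overseas parameter are hypothetical names, and the only facts taken from this file are the variable names and the two base URLs.

```python
from typing import Dict, Mapping

# Overseas endpoint named in this file; the CLI's own default is service.vidu.cn.
OVERSEAS_BASE_URL = "https://service.vidu.com"


def vidu_env(environ: Mapping[str, str], overseas: bool = False) -> Dict[str, str]:
    """Return a copy of environ that is safe to pass to a vidu-cli subprocess."""
    if not environ.get("VIDU_TOKEN", "").strip():
        raise RuntimeError("VIDU_TOKEN is required")
    env = dict(environ)
    # Leave VIDU_BASE_URL unset for the default endpoint; set it only for
    # overseas use, and never override a value the user already chose.
    if overseas and not env.get("VIDU_BASE_URL"):
        env["VIDU_BASE_URL"] = OVERSEAS_BASE_URL
    return env


print(vidu_env({"VIDU_TOKEN": "example-token"}, overseas=True)["VIDU_BASE_URL"])
```

In real use the first argument would typically be os.environ, and the result would be passed as env= to subprocess.run.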

For installation, token setup, and first-run checks, read references/setup.md only when the user is setting up the skill or the CLI is missing.

Decision Table

| User intent | Use this command shape | Notes |
| --- | --- | --- |
| Create reusable subject/character from existing image | vidu-cli element create --name ... --image ... [--description ...] [--style ...] | Returns id, version; no duration. |
| Generate reference image from prompt/images | vidu-cli task submit --type reference2image --prompt ... --image ... --duration 0 --model-version ... --resolution ... | Use when no reusable element exists yet. |
| Generate character/reference video | vidu-cli task submit --type character2video --prompt ... (--image ...\|--material name:id:version) --duration ... --model-version ... --resolution 1080p | Image/material total must be 1-7. |
| Generate narration/audio | vidu-cli task tts --prompt ... --voice-id ... [--speed ...] | Use --voice-id, never --voice. |
| Compose final video | vidu-cli task compose --timeline <json> ... | Read references/compose.md first. |
| Query/download task | vidu-cli task get <task_id> [-o <dir>] | --output only belongs on task get. |

Do Not Invent

Never use these:

  • vidu-cli reference ...
  • vidu-cli reference create-image-reference
  • vidu-cli task submit --type tts
  • vidu-cli task tts --voice ...
  • vidu-cli task submit --output ...
  • vidu-cli task download
  • vidu-cli task submit --material "a:id:ver,b:id:ver"
  • sh -c "vidu-cli ..."
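
One way to enforce the list above is a small pre-flight check on each argv array. This Python sketch is a heuristic under stated assumptions: forbidden_reason is a hypothetical helper, it matches only the shapes listed in this file, and it does not validate anything else about a command.

```python
from typing import List


def forbidden_reason(argv: List[str]) -> str:
    """Return why argv is a forbidden shape, or "" if no listed rule matches."""
    if argv[:2] == ["sh", "-c"]:
        return "do not wrap vidu-cli in a shell string"
    if argv[:1] != ["vidu-cli"]:
        return ""
    rest = argv[1:]
    if rest[:1] == ["reference"]:
        return "do not use vidu-cli reference"
    if rest[:2] == ["task", "download"]:
        return "download with task get <task_id> -o <dir>"
    if rest[:2] == ["task", "tts"] and "--voice" in rest:
        return "task tts takes --voice-id, not --voice"
    if rest[:2] == ["task", "submit"]:
        if "--output" in rest:
            return "--output only belongs on task get"
        # Walk consecutive (flag, value) pairs.
        for flag, value in zip(rest, rest[1:]):
            if flag == "--type" and value == "tts":
                return "use task tts, not task submit --type tts"
            if flag == "--material" and "," in value:
                return "repeat --material once per item; do not comma-join"
    return ""


print(forbidden_reason(["vidu-cli", "task", "download", "TASK_ID"]))
```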

Minimal Rules

  • Every async creator command (task submit, task tts, task compose, task lip-sync) must go through provider_batch, even for one item.
  • Use direct argv arrays; put every flag and value in separate argv entries.
  • Repeat --image, --material, --audio, and --video once per item. Do not comma-join.
  • reference2image and character2video require a non-empty prompt plus 1-7 total references (--image and --material counted together).
  • Image tasks use --duration 0; element create takes no duration.
  • For character2video with 3.2_a, always pass --duration explicitly, with a value from 4 to 15; for other models, check references/parameters.md.
  • Download successful media with vidu-cli task get <task_id> --output <dir>; when subtitle_uri is present, this also downloads subtitle JSON.
  • Report CLI/API errors from JSON fields exactly; do not infer hidden causes.
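
The reference and duration rules above can be sketched as an argv builder in Python. submit_argv is a hypothetical helper, it assumes durations are integers, and it only builds the argv; how the command is dispatched (for example through provider_batch) is outside the sketch. Limits for models other than 3.2_a are not checked, since they live in references/parameters.md.

```python
from typing import List, Sequence


def submit_argv(task_type: str, prompt: str, duration: int, model_version: str,
                resolution: str, images: Sequence[str] = (),
                materials: Sequence[str] = ()) -> List[str]:
    """Build a `vidu-cli task submit` argv after checking the rules in this file."""
    if task_type in ("reference2image", "character2video"):
        if not prompt.strip():
            raise ValueError("prompt must be non-empty")
        total = len(images) + len(materials)
        if not 1 <= total <= 7:
            raise ValueError("need 1-7 references, got %d" % total)
    if task_type == "reference2image" and duration != 0:
        raise ValueError("image tasks use --duration 0")
    if (task_type == "character2video" and model_version == "3.2_a"
            and not 4 <= duration <= 15):
        raise ValueError("3.2_a needs a duration from 4 to 15")
    argv = ["vidu-cli", "task", "submit", "--type", task_type, "--prompt", prompt]
    for image in images:
        argv += ["--image", image]        # one --image per item
    for material in materials:
        argv += ["--material", material]  # one --material per item, never comma-joined
    argv += ["--duration", str(duration), "--model-version", model_version,
             "--resolution", resolution]
    return argv


# "A:ID:VERSION" is a placeholder for a real name:id:version triple.
print(submit_argv("character2video", "[@A] walks into office", 5, "3.2_a",
                  "1080p", materials=["A:ID:VERSION"]))
```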

Common Shapes

vidu-cli element create --name "CharacterA" --image /path/a.png --description "..." --style "..."

vidu-cli task submit --type reference2image --prompt "..." --image /path/a.png --duration 0 --model-version 3.2_image_2 --resolution 1080p

vidu-cli task submit --type character2video --prompt "[@CharacterA] walks into office" --material "CharacterA:ID:VERSION" --duration 5 --model-version 3.2_a --resolution 1080p

vidu-cli task tts --prompt "Narration text" --voice-id "Chinese (Mandarin)_Gentleman" --speed 1.3

Load References Only When Needed

  • references/parameters.md: exact model matrix, durations, flags, examples, validation details.
  • references/compose.md: required before building compose timeline JSON.
  • references/errors_and_retry.md: lifecycle states, retry behavior, polling edge cases.
  • references/setup.md: installation, environment variables, first-run checks, and external-user setup notes.

If local CLI help conflicts with this file, trust vidu-cli <subcommand> --help and update this skill afterward.

Install via CLI
npx skills add https://github.com/shengshu-ai/vidu-skills --skill vidu-skills
Repository Details
Stars: 49
Forks: 21
Branch: main
Path: SKILL.md