name: xai
description: xAI (Grok Imagine) video generation in Nebula — text-to-video and image-to-video short clips with duration, aspect-ratio, and resolution control via the Grok Imagine Video model. Activate when the user configures the grok-imagine-video node or asks about xAI / Grok / Grok Imagine in Nebula. Sourced from the official xAI docs (docs.x.ai — Imagine video generation reference) and the Nebula audit guide docs/api-guides/xai.md on 2026-06-04.
xAI (Grok Imagine) Skill
xAI's Grok Imagine API turns a text prompt (or a starting image) into a short generated video clip. Nebula wires up exactly one node from this provider — Grok Imagine Video (grok-imagine-video) — covering text-to-video and image-to-video. Everything else the Grok Imagine API offers (image gen/edit, reference-to-video, video edit/extend, voice/audio, chat) is reachable with the same key but is not exposed as a Nebula node yet (see Capability boundaries).
When to use
- User configures the `grok-imagine-video` node (display name: Grok Imagine Video).
- User wants a short generated video clip from a text prompt (text-to-video).
- User wants to animate a starting image, i.e. "bring a still to life" (image-to-video).
- User asks about xAI / Grok / Grok Imagine video inside Nebula, or about `duration`, `aspect_ratio`, or `resolution` for that node.
- User asks why some Grok capability (image gen, voice, video extend, reference images) isn't available: point them at Capability boundaries.
Universal rules
- Auth. `Authorization: Bearer <XAI_API_KEY>` header + `Content-Type: application/json`. The backend reads the `XAI_API_KEY` env var (set in the backend `.env`, then restart the backend). The same xAI key works for video, image, voice, and chat. Missing key → the node fails immediately with `XAI_API_KEY is required`.
- Base URL. `https://api.x.ai/v1`. Submit endpoint: `POST /v1/videos/generations`. Poll endpoint: `GET /v1/videos/{request_id}`.
- Execution pattern: async submit-then-poll (`executionPattern: "async-poll"` in the node def). Submit returns `{"request_id": "..."}`; the handler then polls `GET /v1/videos/{request_id}` every 3 s. A `ProgressEvent` advances the node's progress bar each poll. Terminal state `done` returns `{"status": "done", "video": {"url": "..."}}`, the MP4 is downloaded to the run dir, and the `video` port emits the local path. Set expectations in minutes, not seconds. The poll cap is 300 attempts × 3 s ≈ 15 minutes before the handler raises `Grok timed out` (the guide's "up to a few minutes" is the typical case, not the ceiling).
- Model is fixed. The handler always sends `"model": "grok-imagine-video"`. The user does not pick a model; there is no model param on the node.
- Status / error codes.
  - Submit must return `200`, `201`, or `202`; anything else raises `Grok submit failed (<status>): <body>`.
  - Submit response missing `request_id` raises `Grok returned unexpected response`.
  - Poll must return `200`; otherwise `Grok poll failed (<status>): <body>`.
  - Poll `status: "failed"` or `"expired"` raises `Grok failed: <error.message>`.
  - `status: "done"` with no `video.url` raises `Grok completed but no video URL`.
  - 401 on submit/poll → bad or missing `XAI_API_KEY`.
- Input-URI rules (the `image` port, for image-to-video). The handler accepts either form:
  - An `http://` or `https://` URL → passed straight through as `body["image"]`.
  - A local file path → read and inlined as a base64 data URI (`data:<mime>;base64,...`). MIME is inferred from the extension: `.png` → `image/png`, `.jpg`/`.jpeg` → `image/jpeg`, anything else defaults to `image/png`. Wiring an upstream image node into the port (which yields a local path) is the normal flow and works automatically.
- Key gotchas.
  - `prompt` is required even for image-to-video: an empty/missing prompt raises `Prompt is required`. For i2v, give it a short motion prompt (e.g. "gentle wind, the cape flutters, embers rise"), not a full scene description.
  - The `image` param the node exposes is the first-frame image (i2v). It is not the API's `reference_images` (reference-to-video guides without forcing the first frame); that's a different, unexposed capability.
  - 720p and longer durations cost more (per-second pricing). For cheap/fast drafts use `480p` + short `duration`.
  - Outputs are MP4. The downloaded clip lands in the run dir as `<hex>.mp4` and is served back to the canvas.
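The submit-then-poll flow and the error strings listed above can be sketched roughly as follows. This is an illustration, not the code in `backend/handlers/grok_video.py`: `fetch` is a hypothetical callable standing in for the HTTP `GET`, the exception types are assumptions, and whether the real handler sleeps before or after each poll is not stated in the guide.

```python
import time
from typing import Callable, Tuple

POLL_INTERVAL_S = 3.0  # documented poll interval
MAX_POLLS = 300        # documented cap: 300 x 3 s, about 15 minutes


def request_id_from_submit(status: int, payload: dict, raw_body: str = "") -> str:
    """Validate a submit response and return its request_id."""
    if status not in (200, 201, 202):
        raise RuntimeError(f"Grok submit failed ({status}): {raw_body}")
    request_id = payload.get("request_id")
    if not request_id:
        raise RuntimeError("Grok returned unexpected response")
    return request_id


def poll_until_done(
    fetch: Callable[[str], Tuple[int, dict]],
    request_id: str,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = MAX_POLLS,
    interval_s: float = POLL_INTERVAL_S,
) -> str:
    """Poll until the job reaches a terminal state; return the remote video URL.

    `fetch` (hypothetical) performs GET /v1/videos/{request_id} and
    returns (http_status, parsed_json).
    """
    for _ in range(max_polls):
        status, payload = fetch(request_id)
        if status != 200:
            raise RuntimeError(f"Grok poll failed ({status}): {payload}")
        state = payload.get("status")
        if state in ("failed", "expired"):
            message = (payload.get("error") or {}).get("message", state)
            raise RuntimeError(f"Grok failed: {message}")
        if state == "done":
            url = (payload.get("video") or {}).get("url")
            if not url:
                raise RuntimeError("Grok completed but no video URL")
            return url
        sleep(interval_s)  # any other state is treated as still in progress
    raise TimeoutError("Grok timed out")
```

Injecting `fetch` and `sleep` keeps the loop testable without network access or real waiting; the defaults reproduce the documented 3 s interval and 300-attempt cap.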
Pick the right node
xAI exposes a single node in Nebula. Use it for both text-to-video (prompt only) and image-to-video (prompt + image).
| Node (display name) | Node ID | Category | Endpoint / Model | Key inputs | Key params |
|---|---|---|---|---|---|
| Grok Imagine Video | `grok-imagine-video` | `video-gen` | `POST https://api.x.ai/v1/videos/generations` · model fixed to `grok-imagine-video` | `prompt` (Text, required); `image` (Image, optional: first frame for i2v) | `duration` (int, 1–15, default 5); `aspect_ratio` (enum of 7, default `16:9`); `resolution` (`480p`/`720p`, default `480p`) |
Output port: video (Video) — a local MP4 path.
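For orientation, the authenticated submit request for this node might be assembled as below. It is a sketch using only the standard library: `build_submit_request` is a hypothetical helper name, the real handler may use a different HTTP client, and the request is only built here, never sent.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.x.ai/v1"


def build_submit_request(body: dict, env=os.environ) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/videos/generations request."""
    api_key = env.get("XAI_API_KEY")
    if not api_key:
        raise RuntimeError("XAI_API_KEY is required")
    # The model is fixed; nothing the caller passes in can override it.
    payload = dict(body, model="grok-imagine-video")
    return urllib.request.Request(
        f"{BASE_URL}/videos/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing `env` explicitly (for example `env={"XAI_API_KEY": "..."}`) makes it easy to exercise the missing-key path without touching the process environment.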
Param reference
grok-imagine-video
Inputs (ports):
| Port | Data type | Required | Notes |
|---|---|---|---|
| `prompt` | Text | yes | Scene + camera move + lighting for t2v; a short motion description for i2v. Required in both modes. |
| `image` | Image | no | Present → image-to-video (this image is the first frame). Absent → text-to-video. Accepts an http(s) URL or a local path (auto-converted to a base64 data URI). |
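The `image` port's URL-or-local-path behaviour can be illustrated with a small helper. This is a sketch under stated assumptions: `image_field` is a hypothetical name, and lower-casing the extension before the MIME lookup is an assumption, since the guide does not say how the handler treats `.PNG` or `.JPG`.

```python
import base64
from pathlib import Path

# Extension -> MIME mapping from the guide; anything else falls back to image/png.
MIME_BY_EXTENSION = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
}


def image_field(value: str) -> str:
    """Return the string to send as body["image"]."""
    if value.startswith(("http://", "https://")):
        return value  # remote URLs pass through unchanged
    path = Path(value)
    mime = MIME_BY_EXTENSION.get(path.suffix.lower(), "image/png")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

A local `frame.jpg` therefore becomes `data:image/jpeg;base64,...`, while `https://example.com/a.png` is returned untouched.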
Params (set on the node):
| Param key | Type | Default | Range / Enum | Notes |
|---|---|---|---|---|
| `duration` | integer | `5` | `1`–`15` (seconds) | Longer = more cost/time. Only sent if set. |
| `aspect_ratio` | enum | `16:9` | `16:9`, `9:16`, `1:1`, `4:3`, `3:4`, `3:2`, `2:3` | 7 options, widescreen through vertical. Use `9:16` for reels/shorts. |
| `resolution` | enum | `480p` | `480p`, `720p` | `480p` = cheapest/fastest draft; `720p` costs more. |
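As a rough illustration of how these params could map onto a request body, the sketch below validates each value against the table and only includes params that are set. Several points are assumptions rather than documented behaviour: that the JSON field names equal the param keys, that `aspect_ratio` and `resolution` are also omitted when unset (the guide states this only for `duration`), and that validation happens client-side. `build_body` is a hypothetical helper.

```python
from typing import Optional

ASPECT_RATIOS = ("16:9", "9:16", "1:1", "4:3", "3:4", "3:2", "2:3")
RESOLUTIONS = ("480p", "720p")


def build_body(
    prompt: str,
    image: Optional[str] = None,
    duration: Optional[int] = None,
    aspect_ratio: Optional[str] = None,
    resolution: Optional[str] = None,
) -> dict:
    """Assemble a generation request body from the node's inputs and params."""
    if not prompt or not prompt.strip():
        raise ValueError("Prompt is required")  # required in both t2v and i2v
    body = {"model": "grok-imagine-video", "prompt": prompt}
    if image is not None:
        body["image"] = image  # presence of an image switches to image-to-video
    if duration is not None:
        if not 1 <= duration <= 15:
            raise ValueError("duration must be between 1 and 15 seconds")
        body["duration"] = duration
    if aspect_ratio is not None:
        if aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        body["aspect_ratio"] = aspect_ratio
    if resolution is not None:
        if resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")
        body["resolution"] = resolution
    return body
```

For example, `build_body("a neon koi fish", duration=8, aspect_ratio="16:9", resolution="720p")` yields a text-to-video body, and adding `image=...` turns the same call into image-to-video.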
Output (port):
| Port | Data type | Notes |
|---|---|---|
| `video` | Video | Local path to the downloaded `.mp4`. Renders inline on the canvas and can feed downstream video-consuming nodes. |
Recipes
All recipes use the real node id grok-imagine-video.
Text-to-video from scratch.
- Add a Text input node → type a prompt like "a neon koi fish drifting through a rainy Tokyo alley at night, cinematic, slow dolly-in."
- Wire it into the `prompt` port of `grok-imagine-video`.
- Set `duration` `8`, `aspect_ratio` `16:9`, `resolution` `720p`.
- Run → the `video` output is an MP4 you can preview, download, or feed downstream. Expect a wait measured in minutes.
Animate a generated image (image-to-video).
- Generate a still with any Nebula image node (e.g. a `gpt-image-2-*` node, a `gemini` image node, or a FAL image model).
- Wire that node's image output into the `image` port of `grok-imagine-video`, and a short motion prompt into `prompt` (e.g. "gentle wind, the cape flutters, embers rise").
- Choose `aspect_ratio` `9:16` for a vertical/social clip. Run.
- Note: the wired image becomes the first frame of the clip.
Quick social-vertical draft (cheapest/fastest).
- Text node → prompt → `grok-imagine-video` with `aspect_ratio` `9:16`, `duration` `5`, `resolution` `480p`.
- Use the resulting vertical `video` for a reel/short, then re-run at `720p` once the motion looks right.
In the nebula_nodes context
- Node id: `grok-imagine-video` (category `video-gen`, `apiProvider: "xai"`).
- Handler file: `backend/handlers/grok_video.py` (`handle_grok_video`).
- Auth env var: `XAI_API_KEY` (`envKeyName` in the node def).
- Endpoints: submit `POST https://api.x.ai/v1/videos/generations`; poll `GET https://api.x.ai/v1/videos/{request_id}`.
- Input ports: `prompt` (Text, required), `image` (Image, optional → triggers i2v).
- Output port: `video` (Video, a local MP4 path).
- Chaining rules.
  - Feed `prompt` from a Text node (or any node whose output is Text).
  - Feed `image` from any upstream image node (gpt-image-2, gemini, FAL image, etc.); the handler converts the resulting local path to a base64 data URI automatically, and public http(s) URLs are passed through unchanged.
  - The `video` output can chain into any downstream node that consumes a Video.
- How outputs render. On `done`, the handler downloads `video.url` into the run dir (`<hex>.mp4`) and emits `{"video": {"type": "Video", "value": "<path>"}}`; the canvas plays the MP4 inline.
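The final save-and-emit step might look like the sketch below. It assumes the MP4 bytes have already been downloaded from `video.url`, and it assumes the `<hex>` filename comes from `uuid.uuid4().hex`; the guide only says the file is named `<hex>.mp4`, so the naming scheme and the helper name `save_and_emit` are hypothetical.

```python
import uuid
from pathlib import Path


def save_and_emit(video_bytes: bytes, run_dir: str) -> dict:
    """Write the downloaded MP4 into run_dir and build the output-port payload."""
    target = Path(run_dir) / f"{uuid.uuid4().hex}.mp4"  # assumed naming scheme
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(video_bytes)
    # Shape documented for the video port: a typed value holding the local path.
    return {"video": {"type": "Video", "value": str(target)}}
```

Downstream nodes would then read the local path from `["video"]["value"]`.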
Capability boundaries (what the Grok Imagine API can do that Nebula does NOT expose)
Do not promise these through Nebula — there is no node for them today (all reachable with the same XAI_API_KEY, but absent from the canvas). Source: the gap table in docs/api-guides/xai.md.
- Reference-to-video (`reference_images` array): guide a video with reference images without forcing the first frame. Distinct from the node's `image` port (which forces the first frame). No input port for it.
- Video editing (`POST /v1/videos/edits`): restyle/modify an existing video with a prompt while keeping the scene. No node.
- Video extension (`POST /v1/videos/extensions`): continue an existing clip from its last frame. No node.
- Image generation (`POST /v1/images/generations`): Grok Imagine image models (`grok-imagine-image`, `grok-imagine-image-quality`). No node. (Use another provider's image node, e.g. gpt-image-2 / gemini / FAL.)
- Image editing + multi-image compositing (`POST /v1/images/edits`, up to 3 source images): merge subjects / transfer style / compose scenes. No node.
- Voice / audio on the same key: Text-to-Speech, Speech-to-Text, Realtime Voice. No node.
- Text / chat (Grok 4.x via `POST /v1/chat/completions` or `POST /v1/responses`, incl. reasoning, function calling, live web/X search, structured outputs) and deferred chat completion (`GET /v1/chat/deferred-completion/{id}`). No node.
- Model choice: the node hard-codes `grok-imagine-video`; the user can't select a different Grok model from the node.
Overall, Nebula exposes roughly 20% of the xAI (Grok Imagine) API surface: video generation only, and within that only the text-to-video and image-to-video modes.
Sources
- Imagine overview (image + video + modes) — https://docs.x.ai/developers/model-capabilities/imagine
- Video generation (endpoints, params, modes, statuses) — https://docs.x.ai/developers/model-capabilities/video/generation
- Image generation & editing — https://docs.x.ai/docs/guides/image-generations
- Model catalog (model IDs, modalities, pricing) — https://docs.x.ai/docs/models
- xAI API overview — https://docs.x.ai/docs/overview
- Nebula audit guide: `docs/api-guides/xai.md`