vllm-omni-recipe

star 76

Use when adding a recipe for omnimodal models (text-to-image, text-to-video, text-to-audio, image-to-video, any-to-any, diffusion transformers) to the vLLM recipes repository, or documenting vLLM-Omni deployment

hsliuustc0106 By hsliuustc0106 schedule Updated 3/17/2026

name: vllm-omni-recipe description: Use when adding a recipe for omnimodal models (text-to-image, text-to-video, text-to-audio, image-to-video, any-to-any, diffusion transformers) to the vLLM recipes repository, or documenting vLLM-Omni deployment

vLLM-Omni Recipe Creation

Overview

vLLM-Omni extends vLLM to support non-autoregressive models like Diffusion Transformers (DiT) for omnimodal generation: text-to-image, text-to-video, text-to-audio, image-to-video, and any-to-any generation.

This skill guides creating deployment guides for omnimodal models in the vLLM recipes repository.

When to Use

  • Adding text-to-image, text-to-video, text-to-audio, any-to-any model recipes
  • Documenting Diffusion Transformer (DiT) deployments
  • Creating recipes for hybrid AR + diffusion architectures

Recipe Structure

Every recipe follows this structure. Sections marked ⚪ are optional.

# ModelName Usage Guide

[Introduction with HuggingFace link, architecture description]

## Installing vLLM-Omni
[Version-variable based installation]

## [Modality] Generation
[Python API and CLI examples]

## Recommended Deployment Strategy
[Hardware recommendations by model size]

## Key Parameters
[Generation config table]

## Expected Performance ⚪
[Only if verified measurements available]

## Accuracy Comparison ⚪
[Only if verified measurements available]

## Online Serving ⚪
[If supported]

## Additional Resources
[Model card, examples, related links]

For detailed section templates and code examples, see references/recipe-template.md.

Required Sections

1. Introduction

Include:

  • HuggingFace model link
  • Architecture type (DiT, AR+Diffusion, MoE)
  • Key capabilities and parameters

2. Installing vLLM-Omni

Use version variables:

export VLLM_VERSION="0.16.0"

uv venv
source .venv/bin/activate
uv pip install vllm==$VLLM_VERSION
uv pip install git+https://github.com/vllm-project/vllm-omni.git

Add modality-specific dependencies: pillow/diffusers for image/video, soundfile for audio.

3. Generation Examples

Provide Python API examples for all supported modalities. See references/recipe-template.md for code examples.

4. Recommended Deployment Strategy

Include hardware recommendations table with:

  • Model sizes and variants
  • Recommended GPU configurations
  • Memory requirements
  • Notes on MoE, batching, etc.

5. Key Parameters Table

Document generation parameters: height, width, num_inference_steps, guidance_scale, negative_prompt, num_frames (video), audio_end_in_s (audio).

Optional Sections

Performance & Accuracy ⚪

Only include if you have verified measurements. Do not fabricate benchmark numbers.

  • Expected Performance: generation time, memory usage on specific hardware
  • Accuracy Comparison: FID/CLIP scores vs Diffusers baseline

Online Serving ⚪

If model supports OpenAI-compatible serving:

vllm serve org/model-name --omni

Cache-DiT Acceleration ⚪

For DiT models that support caching:

omni = Omni(model="org/model-name", cache_backend="cache_dit")

File Naming

  • Directory: {OrgName}/ (e.g., Qwen/, DeepSeek/)
  • File: {ModelName}.md (e.g., Qwen-Image.md)
  • Use underscores for versions: Wan2_2.md or Wan2.2.md

Common Mistakes

Mistake Fix
Placeholder version (0.XX.0) Use $VLLM_VERSION variable
Missing modality dependencies Add soundfile for audio, diffusers for video
Wrong Omni import Use from vllm_omni.entrypoints.omni import Omni
Fabricated benchmarks Only include verified measurements
Missing from README Add to skills index

Checklist

  • Title follows # ModelName Usage Guide format
  • HuggingFace link in introduction
  • Architecture description (DiT, AR+Diffusion, MoE)
  • Installing vLLM-Omni with $VLLM_VERSION
  • Modality-specific dependencies
  • Python API examples for supported modalities
  • Recommended deployment strategy by hardware
  • Key parameters table
  • Performance/accuracy sections (optional, only if verified)
  • Online serving section (if supported)
  • File named correctly
  • README.md updated with new entry

References

Related Skills

  • vllm-omni-contrib: Contributing new models and development workflow to vLLM-Omni
  • For standard LLM/vLLM recipes (autoregressive models), refer to the vLLM recipes repository for examples
Install via CLI
npx skills add https://github.com/hsliuustc0106/vllm-omni-skills --skill vllm-omni-recipe
Repository Details
star Stars 76
call_split Forks 24
navigation Branch main
article Path SKILL.md
More from Creator
hsliuustc0106
hsliuustc0106 Explore all skills →