name: veomni-new-model
description: "Use this skill when adding support for a new model to VeOmni. Covers the full lifecycle: analyzing the HuggingFace model, creating model patches, defining parallel plans, writing configs, integrating with the trainer, and testing. Trigger: 'add model', 'support new model', 'integrate ', 'new model support'."
Before You Start: Create Todos
Use TodoWrite to track all phases:
Phase 1: Analyze HF model -> in_progress
Phase 2: Create model patch -> pending
Phase 3: Define parallel plan -> pending
Phase 4: Write training config -> pending
Phase 5: Integrate with trainer -> pending
Phase 6: Test -> pending
Phase 1: Analyze HuggingFace Model
Identify the model on HuggingFace. Read its config.json, modeling_*.py, and any processor configs.
Determine model category:
- Text-only LLM -> veomni/models/transformers/<model_name>/
- Vision-Language -> veomni/models/transformers/<model_name>/ + veomni/data/multimodal/
- MoE model -> additional veomni/distributed/moe/ integration
- Diffusion model -> veomni/models/diffusers/<model_name>/
- Omni model -> veomni/models/seed_omni/
Check existing similar models: Find the closest existing model in veomni/models/transformers/ and use it as a reference. E.g., if adding a new Qwen variant, reference qwen3/ or qwen3_vl/.
Identify required patches: VeOmni uses a patchgen system (veomni/patchgen/) to auto-generate model patches from HuggingFace models. Check if a patch spec already exists or if one needs to be created.
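The category decision above can be partly pre-screened from config.json. The following is a minimal sketch, not part of VeOmni: guess_model_category is a hypothetical helper, and the keys it checks (vision_config, text_config, num_experts, num_local_experts, n_routed_experts) are common HuggingFace conventions that a given model may not follow. Diffusion and omni models are not covered, so always confirm against modeling_*.py.

```python
def guess_model_category(config):
    """Return coarse category hints for a parsed HuggingFace config.json dict.

    Heuristic only: the keys below are common HF conventions (assumption),
    not something VeOmni defines.
    """
    # A nested vision tower config usually marks a vision-language model.
    hints = ["vision-language" if "vision_config" in config else "text-only"]

    # MoE is an additional integration on top of the base category, so it is
    # appended rather than replacing the base hint.
    text_config = config.get("text_config") or {}
    expert_keys = ("num_experts", "num_local_experts", "n_routed_experts")
    if any(src.get(key) for src in (config, text_config) for key in expert_keys):
        hints.append("moe")
    return hints
```

Load the file with json.load and pass the resulting dict; treat the result as a starting point for reading the modeling code, not as a verdict.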
Phase 2: Create Model Patch
Create the model directory: veomni/models/transformers/<model_name>/
Required files:
- __init__.py: model registration (MODELING_REGISTRY / MODEL_CONFIG_REGISTRY / MODEL_PROCESSOR_REGISTRY)
- <model_name>_gpu_patch_gen_config.py: declarative patchgen config (replace_class / override_method / replace_function / modify_init / add_post_import_block / drop_import_names) defining all VeOmni patches against the upstream HF modeling
- <model_name>_npu_patch_gen_config.py: NPU patchgen config (often just imports the GPU config and applies NPU-specific overrides via name_map)
- parallel_plan.py: FSDP / TP / EP sharding plan
- generated/patched_modeling_<model_name>_{gpu,npu}.py: patchgen output (do NOT edit manually)
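The required layout can be checked mechanically before moving on. This is a small sketch using only the file names listed above; missing_model_files is a hypothetical helper name, not an existing VeOmni utility.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the required files (relative paths) that are absent from model_dir."""
    required = [
        "__init__.py",
        f"{model_name}_gpu_patch_gen_config.py",
        f"{model_name}_npu_patch_gen_config.py",
        "parallel_plan.py",
        # Patchgen outputs: these appear only after patchgen has been run.
        f"generated/patched_modeling_{model_name}_gpu.py",
        f"generated/patched_modeling_{model_name}_npu.py",
    ]
    root = Path(model_dir)
    return [rel for rel in required if not (root / rel).is_file()]
```

For example, missing_model_files("veomni/models/transformers/<model_name>", "<model_name>") with the real name substituted lists whatever still has to be created or generated.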
Patch patterns (follow existing models):
- Sequence parallel: declare an OpSlot for attention/loss and override forward via patchgen
- MoE: stack per-expert weights (gate_up_proj [E, 2*I, H] / down_proj [E, H, I]) and add a veomni_moe_experts_forward OpSlot
- Cross-entropy: add a veomni_causal_lm_loss OpSlot and return CausalLMOutputWithLogProbs
- Register the model class in the model package __init__.py (no entry in veomni/models/auto.py is needed for transformers models; registration happens via the per-model MODELING_REGISTRY decorators)
Run patchgen: make patchgen regenerates every generated/patched_modeling_*.py from the matching *_patch_gen_config.py.
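To make the MoE weight layout from the patch patterns concrete, here is a shape-only sketch in plain Python (nested lists stand in for tensors, so it runs without PyTorch). It assumes each expert exposes gate_proj [I, H], up_proj [I, H], and down_proj [H, I], and that gate rows precede up rows in the fused dimension. Both are assumptions; check the reference model's patchgen config for the actual names and ordering.

```python
def stack_expert_weights(experts):
    """Fuse per-expert matrices into stacked layouts.

    experts: list of dicts with 'gate_proj' [I, H], 'up_proj' [I, H],
             'down_proj' [H, I] (assumed per-expert names).
    Returns (gate_up_proj [E, 2*I, H], down_proj [E, H, I]).
    ASSUMPTION: gate rows come first, then up rows.
    """
    # Concatenating the row lists gives [2*I, H] per expert.
    gate_up_proj = [e["gate_proj"] + e["up_proj"] for e in experts]
    down_proj = [e["down_proj"] for e in experts]
    return gate_up_proj, down_proj


def shape(nested):
    """Shape of a rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


E, I, H = 3, 2, 5
experts = [
    {
        "gate_proj": [[0.0] * H for _ in range(I)],
        "up_proj": [[1.0] * H for _ in range(I)],
        "down_proj": [[2.0] * I for _ in range(H)],
    }
    for _ in range(E)
]
gate_up, down = stack_expert_weights(experts)
print(shape(gate_up), shape(down))  # [3, 4, 5] [3, 5, 2]
```

With real tensors the same layout would typically come from concatenating gate and up along the output dimension and stacking over experts.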
Phase 3: Define Parallel Plan
Create parallel_plan.py in the model directory.
Define FSDP/FSDP2 sharding strategy:
- Which layers to wrap (typically transformer blocks)
- Activation checkpointing granularity
- Parameter dtype policies
If the model is MoE, define expert parallelism plan in addition to FSDP.
Reference existing parallel plans for guidance (e.g., veomni/models/transformers/qwen3/parallel_plan.py).
Phase 4: Write Training Config
Model config: Create configs/model_configs/<model_family>/<ModelName>.json matching HuggingFace format.
Training config: Create YAML in the appropriate directory:
- Text: configs/text/<model_name>.yaml
- Multimodal: configs/multimodal/<model_name>/<model_name>.yaml
- DiT: configs/dit/<model_name>.yaml
Config must include: model path, data config, optimizer settings, parallelism config, checkpoint settings.
Verify against existing configs — match the structure of similar model configs.
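A quick completeness check for the five required items might look like the sketch below. The dotted keys in REQUIRED_SETTINGS are hypothetical placeholders; replace them with the keys actually used by the closest existing config. The function expects the YAML already parsed into a dict.

```python
# Required item (from the list above) -> HYPOTHETICAL dotted key in the config.
# Adjust the right-hand side to match an existing VeOmni config.
REQUIRED_SETTINGS = {
    "model path": "model.model_path",
    "data config": "data.train_path",
    "optimizer settings": "train.lr",
    "parallelism config": "train.data_parallel_mode",
    "checkpoint settings": "train.output_dir",
}


def missing_settings(config, required=None):
    """Return the labels of required settings that are absent from config."""
    required = REQUIRED_SETTINGS if required is None else required
    missing = []
    for label, dotted in required.items():
        node = config
        for part in dotted.split("."):
            # Walk one level down; stop as soon as the path breaks.
            if not isinstance(node, dict) or part not in node:
                missing.append(label)
                break
            node = node[part]
    return missing
```

An empty return value means every placeholder key resolved. It does not validate the values themselves.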
Phase 5: Integrate with Trainer
Verify the model works with the appropriate trainer:
- Text -> TextTrainer (veomni/trainer/text_trainer.py)
- VLM -> VLMTrainer (veomni/trainer/vlm_trainer.py)
- DiT -> DitTrainer (veomni/trainer/dit_trainer.py)
If the model needs custom data preprocessing:
- Add transform in veomni/data/data_transform.py or veomni/data/multimodal/
- Register the transform for the model
If the model needs custom collator logic:
- Extend veomni/data/data_collator.py
VLM only (multimodal metadata precompute): to keep the ViT forward free of host-device CUDA syncs, derive ViT cu_seqlens / max_seqlen in the collator rather than the forward. Follow the checklist in .agents/knowledge/multimodal_metadata.md ("Adding the hook to a new model"): a collate_multimodal_metadata patchgen helper + a get_metadata_collate_func override, the per-modality vit_metadata sub-dict threaded through Model.forward → ViT.forward (with a runtime fallback), and the model added to _MM_METADATA_WIRED_CASES in the sync gate test.
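The arithmetic behind the precompute is small. The sketch below derives cu_seqlens and max_seqlen from per-sample sequence lengths held as plain Python ints, which is why doing it in the collator avoids a device sync. build_vit_metadata is a hypothetical name for illustration; the real hook is the collate_multimodal_metadata helper described in the knowledge doc, whose signature is not reproduced here.

```python
from itertools import accumulate


def build_vit_metadata(seq_lens):
    """Compute cumulative sequence offsets and the max length on the host.

    seq_lens: per-sample (e.g. per-image) token counts as Python ints.
    cu_seqlens follows the usual varlen-attention convention: a leading 0
    followed by running totals, so sample i spans
    [cu_seqlens[i], cu_seqlens[i + 1]).
    """
    cu_seqlens = [0, *accumulate(seq_lens)]
    max_seqlen = max(seq_lens, default=0)
    return {"cu_seqlens": cu_seqlens, "max_seqlen": max_seqlen}


print(build_vit_metadata([3, 5, 2]))
# {'cu_seqlens': [0, 3, 8, 10], 'max_seqlen': 5}
```

In a real collator the list would be converted to an integer tensor once and carried in the batch, while max_seqlen stays a Python int.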
Phase 6: Test
Create toy config: Add tests/toy_config/<model_name>_toy/config.json with minimal parameters for fast testing.
Unit tests: Add tests in tests/models/ to verify:
- Model loads correctly via veomni.models.auto
- Forward pass produces correct output shape
- Model patch applies without errors
E2e tests (if feasible): Test a short training run using the toy config.
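One way to derive the toy config is to shrink the full HuggingFace config rather than write it by hand. The sketch below is an assumption-laden starting point: the override keys are common HF config names that not every model uses, the values are arbitrary small sizes, and nested sub-configs (for example a text or vision sub-config) are left untouched.

```python
import json
from pathlib import Path

# Common HF size-related keys (assumption: the target model uses these names).
TOY_OVERRIDES = {
    "num_hidden_layers": 2,
    "hidden_size": 64,
    "intermediate_size": 128,
    "num_attention_heads": 4,
    "num_key_value_heads": 2,
    "vocab_size": 1024,
}


def make_toy_config(full_config, overrides=None):
    """Copy full_config, shrinking only the size keys it already defines."""
    overrides = TOY_OVERRIDES if overrides is None else overrides
    toy = dict(full_config)
    for key, value in overrides.items():
        if key in toy:  # never introduce keys the model does not use
            toy[key] = value
    return toy


def write_toy_config(toy, out_dir):
    """Write toy as <out_dir>/config.json and return the path."""
    path = Path(out_dir) / "config.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(toy, indent=2))
    return path
```

After writing, review the result by hand: derived fields such as a per-head dimension, or a vocabulary size tied to the tokenizer used in tests, may need to be adjusted to stay consistent.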
Run make quality and pytest tests/models/.
Update documentation:
- Add usage example to docs/ (training command, config reference).
- Update .agents/knowledge/architecture.md if the model adds a new module or trainer path.
- Update supported models table in project README.md if applicable.
Common Pitfalls
- Model registry: Registration must happen at import time in __init__.py. If the model's AutoConfig type is not registered, build_foundation_model() will fail.
- Generated files: Never edit files in generated/ directories; they are overwritten by patchgen. Edit the matching <model>_{gpu,npu}_patch_gen_config.py and re-run make patchgen instead.
- Tokenizer compatibility: Some models require specific tokenizer versions or custom chat templates; verify in veomni/data/chat_template.py.
- Transformers version: All modeling targets transformers==5.9.0 (pinned by the transformers-stable default dependency group). Models register through the patchgen-generated path under generated/; do not introduce legacy modeling_<m>.py files or apply_veomni_<m>_patch() helpers.