add-reward-model

Use when adding reusable reward models under fastvideo/train/methods/rl/rewards for RLHF or online RL training.

By hao-ai-lab. Updated 6/12/2026.

name: add-reward-model
description: Use when adding reusable reward models under fastvideo/train/methods/rl/rewards for RLHF or online RL training.

Add Reward Model

Use for reward models consumed by RL methods.

Placement

  • Put reusable reward code under fastvideo/train/methods/rl/rewards/.
  • Expose public builders from fastvideo/train/methods/rl/rewards/__init__.py.
  • Keep method-specific aggregation or advantage logic out of reward classes.
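To illustrate the last point, here is a minimal sketch of the split between a reward and the method that consumes it. The names `PromptLengthReward` and `group_normalized_advantages` are hypothetical and not taken from FastVideo, and group normalization is only an assumed example of method-specific logic. The reward returns raw per-sample scores; the normalization lives with the RL method.

```python
import statistics


class PromptLengthReward:
    """Hypothetical toy reward: scores each sample and does nothing else."""

    def __call__(self, media, prompts):
        # One raw scalar per sample; no normalization or advantage math here.
        return [float(len(prompt)) for prompt in prompts]


def group_normalized_advantages(rewards, eps=1e-6):
    """Hypothetical method-side helper; belongs with the RL method, not the reward."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


reward_fn = PromptLengthReward()
rewards = reward_fn(media=None, prompts=["a", "abc"])
advantages = group_normalized_advantages(rewards)
```

Keeping the reward free of advantage logic means another RL method can reuse it unchanged with a different normalization scheme.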

Media Inputs

  • Reward callables receive decoded media tensors.
  • Accept single-frame tensors as [B, C, H, W] and multi-frame tensors as [B, C, T, H, W] when practical.
  • Frame selection is reward-specific. Frame scorers such as PickScore and CLIPScore should explicitly select frame 0; temporal rewards should inspect whichever frames they need.
  • Return one scalar reward per prompt/sample.
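The layout rules above can be sketched as follows. This is an illustration only: NumPy arrays stand in for decoded media tensors so the example runs without loading a deep-learning framework, and `select_frame` and `mean_brightness_reward` are hypothetical names, not FastVideo functions.

```python
import numpy as np


def select_frame(media, frame_index=0):
    """Return a [B, C, H, W] batch from either supported layout."""
    if media.ndim == 4:
        # [B, C, H, W]: already single-frame.
        return media
    if media.ndim == 5:
        # [B, C, T, H, W]: pick one frame along the T axis.
        return media[:, :, frame_index]
    raise ValueError(f"expected 4D or 5D media, got shape {tuple(media.shape)}")


def mean_brightness_reward(media):
    """Toy frame scorer: explicitly uses frame 0, returns shape [B]."""
    frames = select_frame(media, frame_index=0)
    return frames.reshape(frames.shape[0], -1).mean(axis=1)


images = np.zeros((2, 3, 4, 4), dtype=np.float32)
videos = np.zeros((2, 3, 5, 4, 4), dtype=np.float32)
videos[:, :, 0] = 1.0  # only frame 0 is bright

image_rewards = mean_brightness_reward(images)
video_rewards = mean_brightness_reward(videos)
```

Both calls return one value per sample, and the video result depends only on frame 0. A temporal reward would skip `select_frame` and index the T axis itself.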

Attribution

  • If code is ported or closely adapted from another repo, add a short comment or docstring naming the source file/function.
  • Preserve SPDX headers used by FastVideo files.

Tests

  • Unit-test tensor layout handling without loading large reward checkpoints.
  • Allow fake scorer injection for multi-reward tests.
  • Test weighted reward aggregation and metric keys.
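A test along these lines could look like the sketch below. `WeightedMultiReward`, `fake_scorer`, and the `reward/<name>` metric keys are hypothetical and only illustrate the pattern; the actual FastVideo classes and key names may differ. The fake scorers return fixed values, so no reward checkpoint is loaded.

```python
class WeightedMultiReward:
    """Hypothetical aggregator: weighted sum of named, injected scorers."""

    def __init__(self, scorers, weights):
        if set(scorers) != set(weights):
            raise ValueError("scorers and weights must use the same names")
        self.scorers = scorers
        self.weights = weights

    def __call__(self, media, prompts):
        per_reward = {
            name: list(fn(media, prompts)) for name, fn in self.scorers.items()
        }
        total = [
            sum(self.weights[name] * scores[i] for name, scores in per_reward.items())
            for i in range(len(prompts))
        ]
        # Log the batch mean of each component plus the weighted total.
        metrics = {f"reward/{name}": sum(s) / len(s) for name, s in per_reward.items()}
        metrics["reward/total"] = sum(total) / len(total)
        return total, metrics


def fake_scorer(values):
    """Stand-in for a real scorer: ignores its inputs, returns fixed values."""
    return lambda media, prompts: values


def test_weighted_aggregation_and_metric_keys():
    reward = WeightedMultiReward(
        scorers={
            "pickscore": fake_scorer([1.0, 3.0]),
            "clipscore": fake_scorer([2.0, 4.0]),
        },
        weights={"pickscore": 0.5, "clipscore": 0.25},
    )
    total, metrics = reward(media=None, prompts=["a", "b"])
    assert total == [1.0, 2.5]
    assert set(metrics) == {"reward/pickscore", "reward/clipscore", "reward/total"}
    assert metrics["reward/total"] == 1.75
```

Because the scorers are passed in rather than constructed inside the class, the same test covers aggregation and metric naming without touching model weights.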

Install via CLI

npx skills add https://github.com/hao-ai-lab/FastVideo --skill add-reward-model

Repository Details

  • Stars: 3,719
  • Forks: 362
  • Branch: main
  • Path: SKILL.md