dqn-trainer - SKILL.md Agent Skill

---
name: dqn-trainer
description: Produce a DQN training config (buffer, target sync, ε schedule, reward clipping) for a discrete-action RL task.
title: "Dqn Trainer"
version: 1.0.0
phase: 9
lesson: 5
tags: [rl, dqn, deep-rl]
category: dqn-trainer
audience: user
---

Given a discrete-action environment (observation shape, action count, horizon, reward scale), output:

1. Network. Architecture (MLP / CNN / Transformer), feature dim, depth.
2. Replay buffer. Capacity, minibatch size, warmup size.
3. Target network. Sync strategy (hard sync every C steps, or soft update with rate τ).
4. Exploration. ε start / end / schedule length.
5. Loss. Huber vs MSE, gradient clip value, reward clipping rule.
6. Double DQN. On by default; disable only when there is an explicit reason.
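
The outputs listed above could be collected into a single config object. The following is a minimal sketch, not a prescribed schema: the class name `DQNConfig`, the helper `epsilon_at`, every field name, and all default values are assumptions chosen for illustration, and the linear ε schedule is just one possible choice. The defaults should be tuned per task.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DQNConfig:
    """Hypothetical container for the six outputs; all defaults are illustrative."""
    # 1. Network
    architecture: str = "mlp"            # "mlp" | "cnn" | "transformer"
    feature_dim: int = 128
    depth: int = 2
    # 2. Replay buffer
    buffer_capacity: int = 100_000
    batch_size: int = 64
    warmup_steps: int = 1_000            # transitions collected before learning starts
    # 3. Target network: hard sync every C steps, or soft update with rate tau
    target_sync: str = "hard"            # "hard" | "soft"
    hard_sync_every: int = 1_000         # C
    soft_tau: float = 0.005              # tau, used only when target_sync == "soft"
    # 4. Exploration
    eps_start: float = 1.0
    eps_end: float = 0.05
    eps_decay_steps: int = 50_000
    # 5. Loss
    loss: str = "huber"                  # "huber" | "mse"
    grad_clip_norm: float = 10.0
    reward_clip: Optional[Tuple[float, float]] = (-1.0, 1.0)  # None disables clipping
    # 6. Double DQN
    double_dqn: bool = True


def epsilon_at(step: int, cfg: DQNConfig) -> float:
    """Linearly anneal epsilon from eps_start to eps_end, then hold at eps_end."""
    frac = min(max(step, 0) / cfg.eps_decay_steps, 1.0)
    return cfg.eps_start + frac * (cfg.eps_end - cfg.eps_start)
```

With these assumed defaults, `epsilon_at(0, DQNConfig())` returns the start value and any step at or beyond `eps_decay_steps` returns the end value.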

Refuse to ship a DQN with no target network, no replay buffer, or ε held at 1 for the whole run. Refuse continuous-action tasks and route them to SAC / TD3. Flag any task whose reward range exceeds 10× the per-step mean reward as needing clipping or scale normalization.
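
The refusal and flagging rules above can be expressed as a pre-flight check. This is a sketch under stated assumptions: the function name `check_dqn_request` and its parameters are hypothetical, and "per-step mean" is interpreted here as the mean absolute per-step reward, which is one reasonable reading rather than a definition given by this skill.

```python
from typing import List


def check_dqn_request(
    *,
    continuous_actions: bool,
    has_target_network: bool,
    has_replay_buffer: bool,
    eps_start: float,
    eps_end: float,
    reward_min: float,
    reward_max: float,
    mean_abs_step_reward: float,
) -> List[str]:
    """Raise ValueError on a refusal condition; otherwise return a list of warnings."""
    if continuous_actions:
        raise ValueError("continuous actions: route to SAC / TD3 instead of DQN")
    if not has_target_network:
        raise ValueError("refusing: no target network")
    if not has_replay_buffer:
        raise ValueError("refusing: no replay buffer")
    if eps_start >= 1.0 and eps_end >= 1.0:
        raise ValueError("refusing: epsilon held at 1, so the agent never exploits")

    warnings: List[str] = []
    # Assumption: compare the reward range to the mean absolute per-step reward.
    reward_range = reward_max - reward_min
    if reward_range > 10 * mean_abs_step_reward:
        warnings.append(
            "reward range exceeds 10x the per-step mean: "
            "apply reward clipping or scale normalization"
        )
    return warnings
```

Raising on the hard refusals while returning the reward-scale flag as a warning mirrors the distinction in the text between "refuse" and "flag".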