agentgym-run

Launch an AgentGym RL training session on this node using the 4090 field runner config (pmoves/configs/agentgym/field-runner-4090.yaml). Agent model: Qwen 3.5 9B via TensorZero (localhost:3030). Publishes episode results to agentgym.episode.completed.v1 on NATS. Environments: BabyAI, TextCraft, Maze, Wordle (lightweight); ALFWorld, SQLGym, WebShop (moderate). Max 300s/15 rounds, concurrency=1. Shift Crew style: NATS event publishing after each episode.

By POWERFULMOVES. Updated 5/18/2026

name: agentgym:run
description: >
  Launch an AgentGym RL training session on this node using the 4090 field runner config (pmoves/configs/agentgym/field-runner-4090.yaml). Agent model: Qwen 3.5 9B via TensorZero (localhost:3030). Publishes episode results to agentgym.episode.completed.v1 on NATS. Environments: BabyAI, TextCraft, Maze, Wordle (lightweight); ALFWorld, SQLGym, WebShop (moderate). Max 300s/15 rounds, concurrency=1. Shift Crew style: NATS event publishing after each episode.
disable-model-invocation: true

agentgym:run — AgentGym RL Field Runner (4090)

Launches an AgentGym reinforcement learning session using the 4090 node configuration.

Pre-flight

# Verify TensorZero is running (required for Qwen 3.5 9B)
curl -sf http://localhost:3030/health && echo "TensorZero: OK" || echo "TensorZero: DOWN"

# Verify NATS is reachable (for episode event publishing)
curl -sf http://localhost:8222/healthz && echo "NATS: OK" || echo "NATS: unreachable"

# Check that the Ollama fallback model is available
ollama list | grep qwen3 || echo "Ollama qwen3 models not found"
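
The same HTTP checks can be scripted when the pre-flight needs to gate a run. The following is a minimal sketch, not part of the runner: the function names `http_ok` and `preflight` are hypothetical, and only the two URLs come from the curl commands above. The Ollama check is left out because it is a CLI call rather than an HTTP probe.

```python
from http.client import HTTPException
from typing import Callable, Dict
from urllib.error import URLError
from urllib.request import urlopen

# Endpoints copied from the curl checks above.
HEALTH_ENDPOINTS: Dict[str, str] = {
    "TensorZero": "http://localhost:3030/health",
    "NATS": "http://localhost:8222/healthz",
}


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status, False otherwise."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (URLError, HTTPException, OSError, ValueError):
        # Connection refused, timeouts, HTTP errors, and malformed URLs
        # all count as "not healthy".
        return False


def preflight(
    endpoints: Dict[str, str],
    probe: Callable[[str], bool] = http_ok,
) -> Dict[str, bool]:
    """Probe every endpoint and report which ones are healthy."""
    return {name: probe(url) for name, url in endpoints.items()}


def main() -> int:
    """Print one status line per service; return 0 only if all are healthy."""
    results = preflight(HEALTH_ENDPOINTS)
    for name, healthy in results.items():
        print(f"{name}: {'OK' if healthy else 'DOWN'}")
    return 0 if all(results.values()) else 1
```

Calling `raise SystemExit(main())` at the end of a script turns the result into an exit code that a shell loop or Make target could test. The `probe` parameter is injectable so the logic can be exercised without live services.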

Run — Single Episode

The current field runner (pmoves/tools/agentgym_field_runner.py) runs one episode per invocation; to run several, loop in the shell. The CLI accepts --run <env>, --config <profile>, --dry-run, and --list-envs.

# Validate config + env connectivity without running an episode
python pmoves/tools/agentgym_field_runner.py --dry-run --run BabyAI

# BabyAI (fastest, < 1GB RAM, no external deps)
python pmoves/tools/agentgym_field_runner.py --run BabyAI

# Wordle (good for language model eval)
python pmoves/tools/agentgym_field_runner.py --run Wordle

# ALFWorld (embodied task planning, ~2GB RAM)
python pmoves/tools/agentgym_field_runner.py --run ALFWorld

# List configured environments with health status
python pmoves/tools/agentgym_field_runner.py --list-envs

Run — Full Lightweight Battery

Until a make agentgym-run-lightweight wrapper lands, batch via shell:

for env in BabyAI TextCraft Maze Wordle; do
  python pmoves/tools/agentgym_field_runner.py --run "$env"
done
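
If per-environment exit codes are needed, the shell loop above can be mirrored in Python. This is a sketch under assumptions: `build_command` and `run_battery` are hypothetical helpers, and it assumes the runner signals failure with a non-zero exit code, which the source does not state. Only the script path, the `--run` and `--dry-run` flags, and the environment names come from this document.

```python
import subprocess
import sys
from typing import Callable, Dict, List, Sequence

RUNNER = "pmoves/tools/agentgym_field_runner.py"
LIGHTWEIGHT_ENVS = ("BabyAI", "TextCraft", "Maze", "Wordle")


def build_command(env: str, dry_run: bool = False) -> List[str]:
    """Build the argv list for one runner invocation."""
    cmd = [sys.executable, RUNNER]
    if dry_run:
        cmd.append("--dry-run")
    cmd += ["--run", env]
    return cmd


def run_battery(
    envs: Sequence[str],
    execute: Callable[[List[str]], int] = subprocess.call,
) -> Dict[str, int]:
    """Run one episode per environment, sequentially, and collect exit codes.

    Like the shell loop, this keeps going when one environment fails.
    """
    return {env: execute(build_command(env)) for env in envs}
```

For example, `run_battery(LIGHTWEIGHT_ENVS)` returns a mapping such as environment name to exit code, which makes it easy to report which environments need a rerun. Running sequentially matches the concurrency=1 setting.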

TODO (follow-up PR): Add two Make targets, make -C pmoves agentgym-run ENV=X and make -C pmoves agentgym-run-lightweight, so this skill routes through the canonical with-env.sh chain. Also add a --repeat N flag to the runner so episodes can be batched without shell looping.

NATS Events Published

| Subject | When | Payload |
| --- | --- | --- |
| agentgym.episode.completed.v1 | After each episode | episode_id, env, reward, rounds, duration_s |
| agentgym.eval.batch.completed.v1 | After full batch | batch_id, env, n_episodes, mean_reward |
| agentgym.field.status.v1 | On runner start/stop | node, config, status |
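
To make the payload shapes concrete, here is a hedged sketch of how the episode and batch payloads listed above could be built and serialized. The field names and subjects come from this document; the field types, the JSON encoding, the `EpisodeCompleted` class, and the one-environment-per-batch rule are assumptions made for illustration. Publishing the bytes would need a NATS client library, which is left out here.

```python
import json
from dataclasses import asdict, dataclass
from statistics import fmean
from typing import Dict, List

EPISODE_SUBJECT = "agentgym.episode.completed.v1"
BATCH_SUBJECT = "agentgym.eval.batch.completed.v1"


@dataclass
class EpisodeCompleted:
    # Field names follow the payload list above; the types are assumed.
    episode_id: str
    env: str
    reward: float
    rounds: int
    duration_s: float


def encode(event: EpisodeCompleted) -> bytes:
    """Serialize an episode event as UTF-8 JSON (assumed wire format)."""
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


def batch_summary(batch_id: str, episodes: List[EpisodeCompleted]) -> Dict[str, object]:
    """Aggregate episodes from one environment into a batch payload."""
    if not episodes:
        raise ValueError("a batch needs at least one episode")
    envs = {episode.env for episode in episodes}
    if len(envs) != 1:
        # Assumption: the single `env` field implies one environment per batch.
        raise ValueError("all episodes in a batch must share one env")
    return {
        "batch_id": batch_id,
        "env": envs.pop(),
        "n_episodes": len(episodes),
        "mean_reward": fmean([episode.reward for episode in episodes]),
    }
```

A consumer subscribed to `EPISODE_SUBJECT` could decode with `json.loads` and rebuild the same summary on its side, which is a useful cross-check on the batch event.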

Environment Reference

| Name | Category | RAM | External Deps | Port |
| --- | --- | --- | --- | --- |
| BabyAI | lightweight | < 1GB | None | 36001 |
| TextCraft | lightweight | < 1GB | None | 36002 |
| Maze | lightweight | < 1GB | None | 36003 |
| Wordle | lightweight | < 1GB | None | 36004 |
| ALFWorld | moderate | 1-4GB | ALFWorld data | 36005 |
| SQLGym | moderate | 1-4GB | SQLite | 36006 |
| WebShop | moderate | 1-4GB | Web scraper | 36007 |

Skipped on 4090: WebArena (needs full browser), SciWorld (needs Java)
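
The reference table can also be kept as a small lookup for scripts that need to pick environments by category. This is an illustrative sketch: the names, categories, and ports are copied from the table, while `EnvSpec`, `envs_in`, and `base_url` are hypothetical helpers, and the assumption that each environment server speaks HTTP on localhost is not confirmed by the source.

```python
from typing import Dict, List, NamedTuple


class EnvSpec(NamedTuple):
    category: str
    port: int


# Values copied from the environment reference table.
ENVIRONMENTS: Dict[str, EnvSpec] = {
    "BabyAI": EnvSpec("lightweight", 36001),
    "TextCraft": EnvSpec("lightweight", 36002),
    "Maze": EnvSpec("lightweight", 36003),
    "Wordle": EnvSpec("lightweight", 36004),
    "ALFWorld": EnvSpec("moderate", 36005),
    "SQLGym": EnvSpec("moderate", 36006),
    "WebShop": EnvSpec("moderate", 36007),
}


def envs_in(category: str) -> List[str]:
    """List environment names in a category, in table order."""
    return [name for name, spec in ENVIRONMENTS.items() if spec.category == category]


def base_url(env: str, host: str = "localhost") -> str:
    """Build an assumed HTTP base URL for an environment server."""
    return f"http://{host}:{ENVIRONMENTS[env].port}"
```

For instance, `envs_in("lightweight")` yields the four environments used in the lightweight battery.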

Notes

  • Concurrency is 1 — episodes run sequentially (16GB VRAM budget)
  • Model: qwen35_9b (registered in tensorzero.toml) with qwen3:8b Ollama fallback
  • Max episode: 300s / 15 rounds — hard limits in field-runner-4090.yaml
  • Capture results for AGNOTE4482 handoff: episode JSON is written to pmoves/data/agentgym/runs/ (a make -C pmoves agentgym-results summary target is a pending follow-up)
  • See: pmoves/configs/agentgym/field-runner-4090.yaml for full config
  • See: pmoves/services/agentgym-rl-coordinator/ for the runner service
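
The 300s / 15 rounds limits in the notes above can be pictured as a loop that stops at whichever bound is hit first. The sketch below is an assumption about how such limits might be enforced, not the runner's actual code: `run_with_limits` and the `step` callback are hypothetical, and only the two limit values come from this document.

```python
import time
from typing import Callable, Tuple

MAX_SECONDS = 300.0  # hard time limit per episode
MAX_ROUNDS = 15      # hard round limit per episode


def run_with_limits(
    step: Callable[[int], bool],
    max_seconds: float = MAX_SECONDS,
    max_rounds: int = MAX_ROUNDS,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, str]:
    """Run rounds until the task finishes or a limit is reached.

    `step(round_index)` performs one agent round and returns True when the
    episode is finished. Returns (rounds_executed, stop_reason).
    """
    start = clock()
    for round_index in range(max_rounds):
        # The time check runs between rounds; a round already in progress
        # is not interrupted in this sketch.
        if clock() - start >= max_seconds:
            return round_index, "timeout"
        if step(round_index):
            return round_index + 1, "done"
    return max_rounds, "max_rounds"
```

The `clock` parameter is injectable so the timeout path can be checked without waiting five minutes. A real implementation would likely also need a per-request timeout on the model call, since a single slow round could overrun the budget here.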

Install via CLI

npx skills add https://github.com/POWERFULMOVES/PMOVES.AI --skill agentgym-run