---
name: experiment-loop
description: "Run the full experiment lifecycle: design, sanity check, full sweep, synthesize, and claim review. Given a hypothesis, proposes a sweep matrix, runs the experiment, generates artifacts, and updates the research log."
version: "1.0"
metadata:
  author: swarm-research-os
  generated_from: research-os-v0.1
allowed-tools: Read Write Edit Glob Grep Bash
---
EXECUTE NOW
Hypothesis: $ARGUMENTS
If no hypothesis is provided, check memory at .letta/memory/threads/current.md for the active thread.
Step 1: Design
- State the hypothesis as a testable proposition
- Identify which governance knobs and scenario parameters are relevant
- Propose the sweep matrix:
  - Parameter(s) to sweep and their values
  - Number of seeds (10 for exploratory, 50 for publication)
  - Epochs and steps per epoch
- Check if a similar scenario already exists in `scenarios/`:
  - If yes, propose modifications
  - If no, draft a new scenario YAML
- Define success/failure metrics before running
Present the design and wait for user approval before proceeding.
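As a rough illustration of what the proposed sweep matrix amounts to, the sketch below expands a parameter grid into one run configuration per (parameter combination, seed) pair. The function `build_sweep_matrix` and the knob names `tax_rate` and `audit_prob` are hypothetical placeholders, not names taken from the swarm codebase.

```python
from itertools import product


def build_sweep_matrix(param_grid, n_seeds, base_seed=0):
    """Expand a parameter grid into one run config per (combination, seed)."""
    names = sorted(param_grid)  # fixed ordering so the matrix is reproducible
    runs = []
    for values in product(*(param_grid[name] for name in names)):
        for offset in range(n_seeds):
            run = dict(zip(names, values))
            run["seed"] = base_seed + offset
            runs.append(run)
    return runs


# Hypothetical knobs and values, for illustration only.
grid = {"tax_rate": [0.0, 0.1, 0.2], "audit_prob": [0.05, 0.5]}
matrix = build_sweep_matrix(grid, n_seeds=10)  # 10 seeds = exploratory
print(len(matrix))  # 3 * 2 * 10 = 60 runs
```

Listing the matrix this way makes the total run count explicit before asking for approval, which helps when deciding between 10 and 50 seeds.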
Step 2: Sanity check
Run a short version first:
python -m swarm run scenarios/<name>.yaml --seed 42 --epochs 10 --steps 10
Verify:
- No crashes or exceptions
- Metrics are in expected range
- The scenario tests what we think it tests
If the sanity check fails, diagnose and fix before proceeding.
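The "metrics are in expected range" check can be made mechanical. The sketch below assumes the run's metrics have already been loaded into a plain dict; the helper `check_metrics` and the metric names are hypothetical, and the ranges should come from the success/failure metrics defined in Step 1.

```python
import math


def check_metrics(metrics, expected_ranges):
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    for name, (low, high) in expected_ranges.items():
        value = metrics.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif isinstance(value, float) and math.isnan(value):
            problems.append(f"{name}: NaN")
        elif not low <= value <= high:
            problems.append(f"{name}: {value} outside [{low}, {high}]")
    return problems


# Hypothetical metric names and ranges, for illustration only.
expected = {"toxicity": (0.0, 1.0), "welfare": (-100.0, 100.0), "quality_gap": (-1.0, 1.0)}
observed = {"toxicity": 0.12, "welfare": float("nan")}
problems = check_metrics(observed, expected)
print(problems)  # ['welfare: NaN', 'quality_gap: missing']
```

A range check only covers the first two items in the list above; whether the scenario tests what it is meant to test still needs a manual read of the run output.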
Step 3: Full run
Run the full experiment:
python -m swarm sweep scenarios/<name>.yaml --seeds <N>
Or, for a single run:
python -m swarm run scenarios/<name>.yaml --seed 42 --epochs <E> --steps <S>
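If the commands are launched from a script rather than typed by hand, building them as argument lists avoids shell-quoting problems. This is a minimal sketch that uses only the flags shown above; the helper names and the scenario path `scenarios/example.yaml` are hypothetical.

```python
import shlex
import sys


def run_command(scenario, seed, epochs, steps):
    """Argument list for a single run, mirroring the command shown above."""
    return [sys.executable, "-m", "swarm", "run", scenario,
            "--seed", str(seed), "--epochs", str(epochs), "--steps", str(steps)]


def sweep_command(scenario, seeds):
    """Argument list for a multi-seed sweep."""
    return [sys.executable, "-m", "swarm", "sweep", scenario, "--seeds", str(seeds)]


# Hypothetical scenario path, for illustration only.
cmd = sweep_command("scenarios/example.yaml", seeds=10)
print(shlex.join(cmd))  # shell-safe rendering, useful for the research log
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` so that a non-zero exit code raises instead of passing silently.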
After completion:
- Verify `run.yaml` was generated in the output directory
- Log to SQLite: check if `/log_run` should be invoked
- Copy run folder to swarm-artifacts if publication-quality
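The first and last post-run checks can be sketched as a small helper. This assumes each run writes to its own folder and that swarm-artifacts is a local directory. The function `archive_run` and the paths in the demo are hypothetical, and the demo uses a temporary directory so it touches no real run data.

```python
import shutil
import tempfile
from pathlib import Path


def archive_run(run_dir, artifacts_dir, publication_quality):
    """Verify run.yaml exists, then copy the run folder if it is publication-quality."""
    run_dir = Path(run_dir)
    if not (run_dir / "run.yaml").is_file():
        raise FileNotFoundError(f"run.yaml missing in {run_dir}")
    if not publication_quality:
        return None
    dest = Path(artifacts_dir) / run_dir.name
    # copytree raises FileExistsError if dest already exists, so an
    # archived run is never silently overwritten.
    shutil.copytree(run_dir, dest)
    return dest


# Demo in a throwaway directory with a stand-in run.yaml.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "runs" / "demo_run"
    run_dir.mkdir(parents=True)
    (run_dir / "run.yaml").write_text("seed: 42\n")
    dest = archive_run(run_dir, Path(tmp) / "swarm-artifacts", publication_quality=True)
    copied = (dest / "run.yaml").is_file()
print(copied)
```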
Step 4: Synthesize
- Run the synthesize skill on the completed run
- Review claim update recommendations
- Present findings to user
Step 5: Update memory
Append to .letta/memory/threads/research-log.md:
## {date} — {hypothesis short name}
**Ran:** {experiment description}
**Found:** {key results with effect sizes}
**Learned:** {what this changes about our understanding}
**Next:** {what to investigate next}
**Run pointers:** {run_id}
Update .letta/memory/threads/current.md with next steps.
Update .letta/memory/runs/latest.md with the new run pointer.
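The log entry format above can be filled in programmatically. The sketch below is one possible way to render and append an entry; the function names and the ISO date format are assumptions, and the field values in the example are placeholders rather than real results.

```python
from datetime import date
from pathlib import Path


def format_log_entry(name, ran, found, learned, next_steps, run_ids, when=None):
    """Render one research-log entry in the template shown above."""
    when = when or date.today()
    return (
        f"## {when.isoformat()} \u2014 {name}\n"  # \u2014 is the dash used in the template heading
        f"**Ran:** {ran}\n"
        f"**Found:** {found}\n"
        f"**Learned:** {learned}\n"
        f"**Next:** {next_steps}\n"
        f"**Run pointers:** {', '.join(run_ids)}\n"
    )


def append_entry(log_path, entry):
    """Append an entry to the log, creating the file and parent folders if needed."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write("\n" + entry)


# Placeholder values only; real entries should carry effect sizes and run IDs.
entry = format_log_entry(
    name="<hypothesis short name>",
    ran="<experiment description>",
    found="<key results with effect sizes>",
    learned="<what this changes>",
    next_steps="<what to investigate next>",
    run_ids=["<run_id>"],
    when=date(2025, 1, 2),
)
print(entry)
```

Calling `append_entry(".letta/memory/threads/research-log.md", entry)` would then add the entry to the end of the log without rewriting earlier ones.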