name: autoresearch description: Autonomous Goal-directed Iteration. Apply Karpathy's autoresearch principles to ANY task. Loops autonomously — modify, verify, keep/discard, repeat forever until stopped. version: 1.0.0
Claude Autoresearch — Autonomous Goal-directed Iteration
Inspired by Karpathy's autoresearch. Applies constraint-driven autonomous iteration to ANY work — not just ML research.
Core idea: You are an autonomous agent. Modify → Verify → Keep/Discard → Repeat. Never stop.
When to Activate
- User invokes
/autoresearchor/ug:autoresearch - User says "work autonomously", "iterate until done", "keep improving", "run overnight"
- Any task requiring repeated iteration cycles with measurable outcomes
Setup Phase (Do Once)
- Read all in-scope files for full context before any modification
- Define the goal — What does "better" mean? Extract or ask for a mechanical metric:
- Code: tests pass, build succeeds, performance benchmark improves
- Content: word count target hit, SEO score improves, readability score
- Design: lighthouse score, accessibility audit passes
- If no metric exists → define one with user, or use simplest proxy (e.g. "compiles without errors")
- Define scope constraints — Which files can you modify? Which are read-only?
- Create a results log — Track every iteration (see
references/results-logging.md) - Establish baseline — Run verification on current state. Record as iteration #0
- Confirm and go — Show user the setup, get confirmation, then BEGIN THE LOOP
The Loop (Runs Forever)
Read references/autonomous-loop-protocol.md for full protocol details.
LOOP FOREVER:
1. Review: Read current state + git history + results log
2. Ideate: Pick next change based on goal, past results, what hasn't been tried
3. Modify: Make ONE focused change to in-scope files
4. Commit: Git commit the change (before verification)
5. Verify: Run the mechanical metric (tests, build, benchmark, etc.)
6. Decide:
- IMPROVED → Keep commit, log "keep", advance
- SAME/WORSE → Git revert, log "discard"
- CRASHED → Try to fix (max 3 attempts), else log "crash" and move on
7. Log: Record result in results log
8. Repeat: Go to step 1. NEVER STOP. NEVER ASK "should I continue?"
Critical Rules
- NEVER STOP — Loop until manually interrupted. User may be asleep.
- Read before write — Always understand full context before modifying
- One change per iteration — Atomic changes. If it breaks, you know exactly why
- Mechanical verification only — No subjective "looks good". Use metrics
- Automatic rollback — Failed changes revert instantly. No debates
- Simplicity wins — Equal results + less code = KEEP. Tiny improvement + ugly complexity = DISCARD
- Git is memory — Every kept change committed. Agent reads history to learn patterns
- When stuck, think harder — Re-read files, re-read goal, combine near-misses, try radical changes. Don't ask for help unless truly blocked by missing access/permissions
Principles Reference
See references/core-principles.md for the 7 generalizable principles from autoresearch.
Adapting to Different Domains
| Domain | Metric | Scope | Verify Command |
|---|---|---|---|
| Backend code | Tests pass + coverage % | src/**/*.ts |
npm test |
| Frontend UI | Lighthouse score | src/components/** |
npx lighthouse |
| ML training | val_bpb / loss | train.py |
uv run train.py |
| Blog/content | Word count + readability | content/*.md |
Custom script |
| Performance | Benchmark time (ms) | Target files | npm run bench |
| Refactoring | Tests pass + LOC reduced | Target module | npm test && wc -l |
Adapt the loop to your domain. The PRINCIPLES are universal; the METRICS are domain-specific.