sota-improvement

star 0

**Use when:** Agent self-improvement cycles, scheduled maintenance windows, after detecting multiple failures (auto-trigger), user requests "make yourself better" or "run improvement loop", or when system is stable and time for growth. **Don't use when:** User has specific task (not exploration), simple queries, general coding work, or when user asks for help with specific problem (use debugging/coding skills instead). **Outputs:** Fixed failures, integrated capabilities, new beliefs, updated workflows, research summaries, experience captures.

MillionthOdin16 By MillionthOdin16 schedule Updated 2/20/2026

name: sota-improvement description: | Use when: Agent self-improvement cycles, scheduled maintenance windows, after detecting multiple failures (auto-trigger), user requests "make yourself better" or "run improvement loop", or when system is stable and time for growth. Don't use when: User has specific task (not exploration), simple queries, general coding work, or when user asks for help with specific problem (use debugging/coding skills instead). Outputs: Fixed failures, integrated capabilities, new beliefs, updated workflows, research summaries, experience captures. metadata: {"openclaw":{"emoji":"๐Ÿ”„","category":"meta"}} read_when: - User requests self-improvement or maintenance - Scheduled improvement window opens - Multiple failures detected (3+ in 24h) - User specifies END_TIME for improvement - NOT for specific tasks or general queries

SOTA-Driven Self-Improvement

Execute continuous improvement cycles until END_TIME. Two modes:

Mode A (Reactive): Fix failures from logs Mode B (Proactive): Research improvements, find underutilized capabilities

Quick Start

# Check mode
fail-query --recent --count 5
# If failures โ†’ Mode A, else โ†’ Mode B

# Research (MANDATORY: use skills)
# Mode A: exa "[problem] solutions"
# Mode B: exa "agent [topic] SOTA 2025"

Mode Selection

1. Check recent failures:
   - fail-query --recent --count 10
   - ~/clawd/memory/networks/scripts/session-report.sh | grep -A 5 "Failures"

2. If failures exist โ†’ Mode A
3. If no failures โ†’ Mode B

Mode A: Failure-Driven

See: mode-a-template.md

Quick steps:

  1. Scan failures โ†’ Select highest-impact
  2. Prioritize โ†’ Score = (impact ร— feasibility) / time
  3. Check existing โ†’ grep scripts/, skills/, qmd search
  4. Research SOTA โ†’ exa "[problem] prevention patterns"
  5. Fix minimally โ†’ <200 lines, single solution
  6. Verify โ†’ Recreate failure, confirm fixed
  7. Prove โ†’ Demo or test passes
  8. Integrate โ†’ Workflow + docs + beliefs

Mode B: Exploration-Driven

See: mode-b-template.md

Quick steps:

  1. Audit โ†’ Check capability usage, workflow gaps, knowledge gaps
  2. Research frontier โ†’ exa "agent [topic] SOTA 2025"
  3. Identify gap โ†’ Current vs. SOTA approach
  4. Check existing โ†’ Already have this capability?
  5. Implement โ†’ Prototype or integrate
  6. Verify โ†’ Before/after metrics
  7. Prove โ†’ Benchmark or demo
  8. Integrate โ†’ Workflow + docs + beliefs + experience

Research Requirements

PRIMARY: exa skill (neural search)

exa: "[topic] implementation patterns"
exa: "[topic] github examples"
exa: "[topic] research 2024 2025"

FALLBACK: perplexity skill (AI + citations)

perplexity: "What are SOTA approaches for [topic]?"

LOCAL ONLY (degraded mode)

Only if both skills unavailable
Log skill failure for investigation

Constraints

  • END_TIME: Stop regardless of progress
  • YAGNI: If not needed THIS session, don't build
  • Integration-first: Check existing before creating
  • Max complexity: +200 lines or 1 file per iteration
  • Research quality: exa โ‰ฅ80% target

Synthesis Template

[SYNTHESIS] <time> | <mode: A/B> | <focus> | ComplexityDelta: <+X lines> | Status: <VALIDATED/FAILED>

Mode: [Failure/Exploration]
What Changed: [1-2 sentences]
Key Insight: [what you learned]
Research Source: [exa/perplexity/local]
Tool Now Available: [if applicable]

Success Metrics

Metric Target Mode
Failures resolved โ‰ฅ1 per session A
Capabilities added โ‰ฅ1 per 2 sessions B
Integration rate โ‰ฅ30% A+B
Research quality exa โ‰ฅ80% A+B
Beliefs updated โ‰ฅ1 per session B

Full Protocol

For complete protocol with all steps: protocol-v2.2.md

Example Session

See: session-2026-02-20.md

Background Process Management

If long-running process needed:

  • Run with timeout and yieldMs
  • Check status each iteration
  • Stuck >15 min โ†’ kill and defer

Why Two Modes?

Mode A Mode B
Fix broken Improve working
Restore function Enhance function
As needed When stable

Both needed. Reactive maintains, proactive grows.

Install via CLI
npx skills add https://github.com/MillionthOdin16/twin-messages --skill sota-improvement
Repository Details
star Stars 0
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator
MillionthOdin16
MillionthOdin16 Explore all skills →