name: self-evolution-reflexion description: "复杂任务后结构化反思,提取教训持久化到记忆/技能。触发词:反思/学到了什么/reflection/复盘/总结教训。基于TMLR 2026自进化Agent综述。" category: software-development tags: [self-evolution, reflection, memory, quality]
Self-Evolution via Reflexion Loop
Full research context:
references/survey-background.md
Trigger: After completing a complex task (5+ tool calls, error recovery, discovering new workflows, user corrections), load this skill to perform a structured reflection.
When to Reflect
Reflect after:
- Tasks requiring 5+ tool calls
- Error recovery (fixing mistakes discovered mid-task)
- User corrections (user had to steer you)
- Discovering a new workflow or tool pattern
- User asks you to "remember how you did that"
Skip for: simple one-shot answers, single tool calls, trivial lookups.
Reflection Protocol (4 Steps)
Step 1: Trajectory Summary
Summarize what happened in 2-3 sentences. What was the goal? What path did you take? Did it succeed?
Step 2: Error/Correction Analysis
If anything went wrong:
- What was the error or correction?
- What was the root cause?
- How was it resolved?
Step 3: Lesson Extraction
Extract 1-3 durable lessons. Format each as a declarative fact (not an instruction). Good: "DashScope base_url must include /compatible-mode/v1" — Bad: "Always set base_url to..."
Step 4: Persist
For each lesson:
- Fact about environment/tool/API: Save to
memorytool (target=memory) - Repeatable workflow discovered: Save as skill via
skill_manage - User preference/correction: Save to
memorytool (target=user) - Trivial one-off: Skip — don't pollute memory
Cross-Session Continuity
When the user asks "what did we learn from X" or references past work, use session_search first before asking them to repeat. This is the inter-test-time evolution loop from the survey — learning from historical data without re-execution.
Step 5: Report to User (Transparency)
After persisting lessons, if the user is present (not a cron run), report what was learned and changed:
- What lessons were extracted and where they were saved (memory vs skill vs user profile)
- Any skills created or patched — show the name and one-line reason
- Flag anything that was intentionally NOT saved (trivial / unsafe) so the user knows you made a judgment call
Format: concise bullet points, not narrative. The user wants to know "what changed about you and why" — not a story of what happened.
This step is mandatory when the user is present. The user's explicit expectation: "自我进化的技能极其逻辑,你要让我知晓" — make self-evolution visible.
Environment Awareness
This skill may be loaded in different execution contexts. Before executing, check what tools are available:
- Interactive session (full tools): memory, skill_manage, session_search, etc. all available → follow the full protocol
- Cron/batch session (limited tools): memory may be available, but skill_manage usually is NOT → save lessons to memory only; write skill recommendations to a report file for later human review
- Minimal session (file+terminal only): write all findings to a report file; do NOT attempt to call memory or skill_manage
Safety Note
Per "Your Agent May Misevolve" (Shao et al., 2025): when saving lessons, verify they don't:
- Weaken safety alignment (e.g., "bypass rate limits" saved as skill)
- Introduce tool vulnerabilities (scripts with unchecked inputs)
- Create privacy leaks (memories containing PII/tokens)
Concrete verification steps (not just declarative):
- PII check: Scan lesson text for email addresses, phone numbers, API keys, tokens, IP addresses, or physical addresses using regex patterns before persisting
- Alignment check: If a lesson suggests reducing friction/constraints, explicitly ask: "Does this remove a safety guardrail?"
- Vulnerability check: If a lesson involves code/scripts, verify no unchecked user input, no hardcoded secrets, no
eval()/exec()without sanitization
If a lesson could be dangerous, still save it but flag with ⚠️ and mention the risk.
Reversibility & Cross-Validation
Lessons persisted to memory are durable and injected into all future sessions. To prevent poison:
- Cross-validation gate: Before saving a lesson derived from a SINGLE error or correction, ask: "Would this lesson hold across multiple sessions/tasks, or is it specific to this one incident?" If specific-only, save with
[single-incident]tag - Confidence tag: Tag each memory entry with confidence:
[verified](confirmed across multiple sessions),[single-incident](one occurrence, may not generalize),[user-confirmed](user explicitly validated) - Expiry for low-confidence:
[single-incident]entries should be re-evaluated after 7 days. The weekly cron review job should flag them for potential removal if not re-confirmed