evolution-agent



By heath-gtm · Updated 5/31/2026

name: evolution-agent
description: The AI-native capstone. Watches QA Agent's audit + reads implicit feedback (emoji reactions, click-through, outcome attribution) + explicit feedback (👍/👎, structured reasons), then generates pull requests to heath-gtm/Skill-Builder + LOCKED_DESIGN.md that evolve the system: tightens scoring rubrics, updates trigger phrases, amends lock-ins, proposes new workflows when usage patterns suggest them, retires unused analysts. Closes the loop QA Agent surfaces. Heath approves PR → system mutates itself. Trigger on "evolve the system", "apply this week's QA recommendations", "PR the scoring update", "amend lock-in #X", "propose a new workflow", "retire unused analysts", "what should we improve?", or any system-evolution / self-learning question. Also fires automatically after every QA Agent weekly digest.

Evolution Agent: the AI-native capstone

Required: File system access (Revenue Reviews) + GitHub (Skill-Builder write access). Optional: Slack (for digest engagement signal), Salesforce (for outcome attribution closing the loop).

What this analyst answers

  • "Evolve the system": full audit + PR-generation pass
  • "Apply this week's QA recommendations": turn QA Agent's digest into reviewable system changes
  • "PR the scoring update": generate a Skill-Builder PR that adjusts a scoring rubric based on accuracy data
  • "Amend lock-in #X": generate a LOCKED_DESIGN.md PR for a lock-in update
  • "Propose a new workflow": when usage patterns suggest a bundle, draft the workflow spec
  • "Retire unused analysts": surface analysts that haven't been called in 90 days + propose deprecation
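The "retire unused analysts" check comes down to a date comparison over invocation history. The following is a minimal sketch, assuming a hypothetical mapping from analyst name to last invocation date; the function name `unused_analysts` and the sample data are illustrative and not part of the skill.

```python
from datetime import date, timedelta

def unused_analysts(last_called, today, max_idle_days=90):
    """Return analysts whose last call is older than max_idle_days, sorted by name.

    last_called: hypothetical dict of analyst name -> date of last invocation
    (None means the analyst was never invoked).
    """
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(
        name for name, last in last_called.items()
        if last is None or last < cutoff
    )

# Illustrative data only, not real usage figures.
calls = {
    "icp-analyst": date(2026, 5, 28),
    "hello-world": date(2025, 11, 2),
    "never-used": None,
}
stale = unused_analysts(calls, today=date(2026, 5, 31))
print(stale)
```

An analyst called exactly on the cutoff date is treated as still in use; whether the real window is inclusive is an open choice.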

What it owns internally: the AI-native flywheel

                   QA Agent surfaces drift
                            ↓
              Implicit signals collected
              (emoji reactions, click-through,
               re-asks, outcome attribution)
                            ↓
              Explicit signals collected
              (👍/👎 + structured reasons)
                            ↓
              Evolution Agent reads both streams
                            ↓
   ┌────────────────────────┴─────────────────────────┐
   ↓                                                  ↓
PR to heath-gtm/Skill-Builder              PR to LOCKED_DESIGN.md
(modifies SKILL.md files)                   (amends lock-ins)
   ↓                                                  ↓
Heath reviews + approves                   Heath reviews + approves
   ↓                                                  ↓
System mutates itself                      Architecture updates
   ↓                                                  ↓
Next week's analyst outputs are smarter   Workflows compose differently

What it owns internally: concretely

  • Implicit feedback collector: reads Revenue Reviews/comms_audit/*.tsv, daily_drop_audit/*.tsv, and Slack reaction counts to derive output-quality signals
  • Explicit feedback collector: reads Revenue Reviews/feedback/*.tsv (rep-submitted 👍/👎 + structured reasons)
  • Outcome attribution engine: cross-references analyst predictions to actual outcomes (did flagged AT_RISK deals slip? Did STRONG_FIT verdicts convert?)
  • PR generator: writes GitHub PRs to Skill-Builder + LOCKED_DESIGN.md with surgical changes + rationale
  • Workflow pattern detector: finds recurring analyst-call sequences in user activity + proposes them as new workflow specs
  • Deprecation surfacer: flags analysts with low invocation rate or persistently poor accuracy
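The workflow pattern detector can be pictured as counting contiguous analyst-call sequences across user sessions. This is only a sketch under stated assumptions: the session format (a list of analyst names per request), the function name `recurring_sequences`, and the `min_count` cutoff are all hypothetical.

```python
from collections import Counter

def recurring_sequences(sessions, length=3, min_count=2):
    """Count contiguous analyst-call sequences of a fixed length across sessions.

    sessions: hypothetical list of lists, each an ordered run of analyst names.
    Returns (sequence, count) pairs seen at least min_count times, most common first.
    """
    counts = Counter()
    for calls in sessions:
        for i in range(len(calls) - length + 1):
            counts[tuple(calls[i:i + length])] += 1
    return [(seq, n) for seq, n in counts.most_common() if n >= min_count]

# Illustrative sessions only.
sessions = [
    ["Renewal-Health", "Conversation", "Comms"],
    ["ICP", "Renewal-Health", "Conversation", "Comms"],
    ["Deal-Health", "Comms"],
]
patterns = recurring_sequences(sessions)
print(patterns)
```

A sequence that clears the cutoff is a candidate for a bundled workflow spec; the decision to propose it still belongs in a reviewable PR.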

Quality gates

Every PR has rationale tied to data. Not "tighten this scoring rubric." Instead: "Tighten this scoring rubric: over the trailing 90 days, STRONG_FIT verdicts converted at 82% and FIT verdicts at 79% (delta too small to differentiate the tiers). Recommend collapsing FIT and STRONG_FIT into a single tier."
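One way to keep a rationale tied to data is to generate it only when the numbers cross a stated threshold. The sketch below assumes a 5-point minimum gap between tiers; that value and both function names are illustrative, since the skill does not state its own cutoff.

```python
def tiers_distinguishable(rate_a, rate_b, min_delta=0.05):
    """True when two tier conversion rates differ by at least min_delta."""
    return abs(rate_a - rate_b) >= min_delta

def rubric_rationale(strong_fit_rate, fit_rate, min_delta=0.05):
    """Return a data-backed rationale string, or None when no change is justified."""
    if tiers_distinguishable(strong_fit_rate, fit_rate, min_delta):
        return None  # tiers separate well enough; nothing to propose
    delta_pts = abs(strong_fit_rate - fit_rate) * 100
    return (
        f"STRONG_FIT converted at {strong_fit_rate:.0%}, FIT at {fit_rate:.0%} "
        f"(delta {delta_pts:.0f} pts, below the {min_delta * 100:.0f} pt minimum). "
        "Recommend collapsing into a single tier."
    )

rationale = rubric_rationale(0.82, 0.79)
print(rationale)
```

Returning None when the tiers are already well separated means no PR is drafted without a measurable reason.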

Heath approves before any merge. The Evolution Agent NEVER auto-merges. Every change is a reviewable PR with diff + rationale.

Outcome attribution stays honest. The agent doesn't claim "we improved win rate by 5%" without controlling for cohort confounds (window size, segment mix, etc.).
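A simple guard against the segment-mix confound is to compare win rates within each segment rather than in aggregate. The sketch below is illustrative: the `(segment, won)` record shape and both function names are assumptions, and the sample data is constructed so that the overall rate rises while every segment stays flat.

```python
from collections import defaultdict

def win_rate_by_segment(deals):
    """deals: iterable of (segment, won) pairs -> {segment: win rate}."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [wins, deal count]
    for segment, won in deals:
        totals[segment][0] += int(won)
        totals[segment][1] += 1
    return {seg: wins / n for seg, (wins, n) in totals.items()}

def stratified_lift(before, after):
    """Per-segment change in win rate, for segments present in both periods."""
    b, a = win_rate_by_segment(before), win_rate_by_segment(after)
    return {seg: a[seg] - b[seg] for seg in b.keys() & a.keys()}

def overall_rate(deals):
    return sum(int(won) for _, won in deals) / len(deals)

# Constructed data: the "after" period simply has more SMB deals.
before = [("smb", True)] * 2 + [("smb", False)] * 2 + [("ent", True)] + [("ent", False)] * 3
after = [("smb", True)] * 6 + [("smb", False)] * 6 + [("ent", True)] + [("ent", False)] * 3

lift = stratified_lift(before, after)
print(overall_rate(before), overall_rate(after), lift)
```

Here the aggregate rate moves from 0.375 to 0.4375 purely because of mix shift, while the per-segment lift is zero, which is the kind of claim the gate is meant to block.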

Output format example

🧬 EVOLUTION AGENT WEEKLY · Week of May 25

PROPOSED CHANGES: 4 PRs ready for review

1. tighten-icp-composite-rubric.md (Skill-Builder PR)
   ──────────────────────────────────────────────────
   Rationale: Trailing 90d outcome attribution shows STRONG_FIT (82% conv)
   and FIT (79% conv) too close to differentiate. Collapse to single tier
   "FIT" + add explicit override flag for "ABOVE_BASELINE_SIGNAL".
   Affects: icp-analyst/SKILL.md (composite score breakdown section)
   PR diff: 23 lines changed
   Outcome math:
     • Pre-change: 2-tier classifier, 82% / 79% conversion
     • Post-change: 1-tier + override, projected 85% precision
   → Review: github.com/heath-gtm/Skill-Builder/pull/47

2. amend-lock-in-26-add-Cursor-to-stack.md (LOCKED_DESIGN PR)
   ──────────────────────────────────────────────────
   Rationale: Trailing 30d Mixmax transcripts mention "Cursor" in
   12 calls (vs 4 last quarter). Add Cursor to Sales_Acceleration_Tool__c
   enumeration + the 26-field tech-stack-as-displacement scoring.
   Affects: LOCKED_DESIGN.md lock-in #26
   → Review: github.com/heath-gtm/Skill-Builder/pull/48

3. propose-new-workflow-customer-renewal-prep.md
   ──────────────────────────────────────────────────
   Rationale: Last 8 weeks, "renewal prep for {account}" requests
   triggered Renewal-Health → Conversation → Comms in that order
   17 times. Pattern detected. Propose new workflow spec
   "W7: Customer Renewal Prep" that bundles them.
   Affects: new file at Revenue Reviews/specs/workflows/W7_renewal_prep.md
   → Review: github.com/heath-gtm/Skill-Builder/pull/49

4. retire-deepline:workflow-hello-world.md
   ──────────────────────────────────────────────────
   Rationale: Analyst invoked 2x in last 180 days. Both invocations
   were user-test, not real use. Recommend deprecation.
   Affects: deepline plugin manifest
   → Review: github.com/heath-gtm/Skill-Builder/pull/50

ACCURACY SCORES (trailing 90d):
  ICP Analyst:        82% precision on STRONG_FIT
  Deal-Health:        78% precision on AT_RISK (deals actually slipped)
  Renewal-Health:     91% precision on RENEW verdicts
  Pattern Analyst:    67% precision on predictive churn (low, investigate)

THE ONE THING TO REVIEW FIRST:
  Pattern Analyst predictive churn at 67% precision is below 75% threshold.
  Either tighten the model or surface lower-confidence claims more cautiously.
  โ†’ Investigate the 8 false-positive churn flags from last quarter.

Next pass: Sunday 2026-06-07 (after next QA Agent digest)
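The accuracy scores and the "review first" item in the example above follow from a per-verdict precision calculation checked against the 75% threshold. A minimal sketch, assuming a hypothetical record shape of `(predicted_verdict, outcome_matched)` pairs; the function names are illustrative and the scores are the figures from the example digest.

```python
def precision(records, verdict):
    """Share of records predicted as `verdict` whose real outcome matched.

    records: iterable of (predicted_verdict, outcome_matched) pairs.
    Returns None when the verdict was never predicted.
    """
    hits = [matched for predicted, matched in records if predicted == verdict]
    return sum(hits) / len(hits) if hits else None

def below_threshold(scores, threshold=0.75):
    """Names of analysts whose precision is under the threshold, sorted."""
    return sorted(
        name for name, p in scores.items()
        if p is not None and p < threshold
    )

# Figures copied from the example digest above.
scores = {
    "ICP Analyst": 0.82,
    "Deal-Health": 0.78,
    "Renewal-Health": 0.91,
    "Pattern Analyst": 0.67,
}
flagged = below_threshold(scores)
print(flagged)
```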

Used by

  • Weekly system maintenance (scheduled Sundays after QA Agent)
  • Quarterly architecture review (deep pattern detection over 90-day windows)
  • Heath manual ad-hoc ("what should we improve?")
  • Standalone: this is the system's self-improvement engine

When NOT to use

  • For real-time decision-making (Evolution Agent runs weekly + on-demand for retrospective improvement)
  • For pulling data from connectors (uses other analysts as upstream and never queries directly)
  • For autonomous decisions: every change requires Heath's PR approval

Salesforce field reference

This analyst inherits from Revenue Reviews/specs/SFDC_FIELD_LIBRARY.md, the single source of truth for every field name, definition, and canonical interpretation. Specifically, this analyst reads:

  • No direct SFDC reads: it consumes audit logs from other analysts + outcome attribution data.
  • Generates GitHub PRs that may amend this library file when field changes are needed.

If a query needs a field not in the library, FAIL LOUD and request a library amendment via Evolution Agent. Never invent ad-hoc field names or definitions. Apples-to-apples consistency across every analyst output is the goal.

Inheritance from LOCKED_DESIGN.md

Lock-in #33 (QA Agent; Evolution is the next layer on top). This skill's existence locks in #34 (the AI-native flywheel). All lock-ins and SKILL.md files are downstream targets for Evolution Agent PRs.

Make.com / API packaging

Input: { mode: "full_evolution_pass | proposed_PRs_only | accuracy_audit | deprecation_surfacer", trailing_days: 90 }

Output: { proposed_PRs: [...], accuracy_scores, top_priority, next_pass_date }

Failure modes: No GitHub write access → cannot generate PRs (falls back to "proposed changes report"). No audit logs → returns "no signal to evolve from."
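The input contract and the two failure modes can be sketched as a small request handler. Everything beyond the mode names, `trailing_days`, `proposed_PRs`, and the two fallback messages is an assumption: the function name `run_pass`, the `status` and `delivery` keys, and the default of 90 days are hypothetical.

```python
VALID_MODES = {
    "full_evolution_pass",
    "proposed_PRs_only",
    "accuracy_audit",
    "deprecation_surfacer",
}

def run_pass(request, has_github_write, audit_logs):
    """Validate the request and pick a delivery path; no PRs are built here."""
    mode = request.get("mode")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    trailing_days = request.get("trailing_days", 90)  # assumed default

    if not audit_logs:
        # Failure mode: nothing to learn from.
        return {"status": "no signal to evolve from", "proposed_PRs": []}

    # Failure mode: without write access, fall back to a report instead of PRs.
    delivery = "pull_request" if has_github_write else "proposed changes report"
    return {
        "status": "ok",
        "mode": mode,
        "trailing_days": trailing_days,
        "delivery": delivery,
        "proposed_PRs": [],
    }

result = run_pass({"mode": "accuracy_audit", "trailing_days": 90}, False, ["audit.tsv"])
print(result["delivery"])
```

Rejecting unknown modes up front matches the "fail loud" stance taken elsewhere in this skill.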

Shippable as

Standalone: the meta-meta-layer that turns a productized analyst suite into a self-improving system. Pairs naturally with QA Agent (which surfaces issues); Evolution Agent generates the change-control actions.

This is the AI-native capstone. Without it, the system is a productized SaaS that improves manually. With it, the system compounds.

Install via CLI
npx skills add https://github.com/heath-gtm/Skill-Builder --skill evolution-agent
Repository Details
Branch: main
Path: SKILL.md