evo-generative-llm-merging - SKILL.md Agent Skill

name: evo-generative-llm-merging
description: "Evolutionary Generative Merging (EvoGM) methodology for LLM model merging. Uses evolutionary algorithms to optimize model weight interpolation by generating candidate merge configurations, evaluating them with lightweight benchmarks, and evolving better solutions. Use when: (1) Merging multiple LLM checkpoints or fine-tuned variants, (2) Optimizing merge weights for multi-task performance, (3) Finding Pareto-optimal tradeoffs between capabilities, (4) Replacing grid search for model merging."
license: Complete terms in LICENSE.txt
metadata:
  arxiv_id: "2605.29295"
  published: "2026-05-29"
  tags: [llm-merging, evolutionary-optimization, model-composition, generative-ai, neural-network-merging]

Evolutionary Generative Merging (EvoGM)

Overview

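EvoGM treats LLM model merging as an evolutionary optimization problem. Instead of relying on fixed heuristics (e.g., linear interpolation with uniform weights), it uses an evolutionary algorithm to search for high-scoring merge configurations across model parameters, layers, or task-specific heads. The search is heuristic, so the result is the best configuration found within the compute budget, not a guaranteed optimum.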

Core Methodology

Problem Formulation

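Given n model checkpoints {M₁, M₂, ..., Mₙ}, find merge weights W that maximize a weighted sum of scores over the evaluation tasks:

W* = argmax_W Σₜ λₜ · Scoreₜ(Merged(M₁, ..., Mₙ; W))

Here λₜ ≥ 0 is the importance assigned to task t. It is distinct from the merge weights wᵢ that make up W.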
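
The objective above can be sketched in a few lines. This is a minimal illustration, not the EvoGM implementation: checkpoints are stand-in flat lists of floats, and the names `merge`, `objective`, `task_scores`, and `task_importance` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Params = List[float]  # stand-in for a checkpoint: a flattened parameter vector


def merge(checkpoints: Sequence[Params], weights: Sequence[float]) -> Params:
    """Elementwise weighted sum of checkpoints: sum_i w_i * M_i."""
    if len(checkpoints) != len(weights):
        raise ValueError("need exactly one weight per checkpoint")
    size = len(checkpoints[0])
    return [
        sum(w * m[k] for w, m in zip(weights, checkpoints))
        for k in range(size)
    ]


def objective(
    checkpoints: Sequence[Params],
    weights: Sequence[float],
    task_scores: Dict[str, Callable[[Params], float]],
    task_importance: Dict[str, float],
) -> float:
    """Weighted sum over tasks of importance_t * Score_t(merged model)."""
    merged = merge(checkpoints, weights)
    return sum(task_importance[t] * score(merged) for t, score in task_scores.items())


# Toy usage: two "checkpoints" and two toy scoring functions (assumptions).
checkpoints = [[1.0, 0.0], [0.0, 1.0]]
task_scores = {"a": lambda p: p[0], "b": lambda p: p[1]}
value = objective(checkpoints, [0.25, 0.75], task_scores, {"a": 2.0, "b": 1.0})
```

In a real setting each scoring function would run the merged model on a small benchmark, which is what makes every call to `objective` expensive.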

Evolutionary Algorithm Steps

Initialization: Generate population of random weight configurations
- Each individual = vector of merge weights per layer/group
- Population size: 20-100 (balance exploration vs compute)
Evaluation: Merge models with candidate weights, score on lightweight benchmarks
- Use small validation sets (100-500 samples per task)
- Score = weighted combination of accuracy, perplexity, task-specific metrics
Selection: Select top performers (tournament selection, rank-based)
- Keep top 20-30% for breeding
- Elitism: preserve best individuals across generations
Crossover: Combine parent weight vectors
- Uniform crossover: randomly inherit weights from either parent per layer
- Blend crossover: interpolate between parent weights
Mutation: Perturb weights slightly (Gaussian noise, σ ≈ 0.05-0.1)
- Renormalize after each mutation so that Σwᵢ = 1 within every layer group
Convergence: Stop when improvement < threshold for K generations, or max generations reached
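
The steps above can be sketched as one loop. This is a simplified illustration under stated assumptions: each individual is a single weight vector (no layer groups), selection is tournament-based with a small elite, crossover is the blend variant, and `fitness` is a hypothetical callable that would merge and score the models in practice. Function names and default values are illustrative only.

```python
import random
from typing import Callable, List, Tuple

Individual = List[float]  # one merge weight per model


def normalize(w: Individual) -> Individual:
    """Clip to a small positive floor and rescale so the weights sum to 1."""
    clipped = [max(x, 1e-9) for x in w]
    total = sum(clipped)
    return [x / total for x in clipped]


def tournament(pop: List[Individual], fits: List[float],
               rng: random.Random, k: int = 3) -> Individual:
    """Tournament selection: fittest of k individuals drawn without replacement."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: fits[i])]


def evolve(fitness: Callable[[Individual], float], n_weights: int,
           pop_size: int = 20, generations: int = 30, sigma: float = 0.05,
           n_elite: int = 2, patience: int = 5, tol: float = 1e-4,
           seed: int = 0) -> Tuple[Individual, float]:
    rng = random.Random(seed)
    # Initialization: random normalized weight vectors.
    pop = [normalize([rng.random() for _ in range(n_weights)])
           for _ in range(pop_size)]
    best, best_fit, stale = pop[0], float("-inf"), 0
    for _ in range(generations):
        # Evaluation: in practice, merge the models and run a small benchmark.
        fits = [fitness(ind) for ind in pop]
        order = sorted(range(pop_size), key=lambda i: fits[i], reverse=True)
        top = order[0]
        improved = fits[top] > best_fit + tol
        if fits[top] > best_fit:
            best, best_fit = pop[top], fits[top]
        # Convergence: stop after `patience` generations below the threshold.
        stale = 0 if improved else stale + 1
        if stale >= patience:
            break
        # Elitism: carry the best individuals over unchanged.
        next_pop = [pop[i] for i in order[:n_elite]]
        while len(next_pop) < pop_size:
            a = tournament(pop, fits, rng)
            b = tournament(pop, fits, rng)
            t = rng.random()
            child = [t * x + (1 - t) * y for x, y in zip(a, b)]  # blend crossover
            child = [x + rng.gauss(0.0, sigma) for x in child]   # Gaussian mutation
            next_pop.append(normalize(child))
        pop = next_pop
    return best, best_fit
```

A toy call such as `evolve(lambda w: -sum((a - b) ** 2 for a, b in zip(w, [0.2, 0.3, 0.5])), n_weights=3)` exercises the loop without any model; the fixed seed makes runs repeatable.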

Layer-Granular Merging

Key insight: Different layers benefit from different merge strategies:

Early layers (embeddings, shallow attention): Task-specific weights
Middle layers (deep attention): Shared representations, uniform weights often work
Late layers (LM head, output): Most affected by task-specific fine-tuning, so they benefit most from optimized weights
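
As an illustration of layer-granular merging, the sketch below applies a separate weight vector to each layer group. It assumes a hypothetical parameter naming scheme (`embed...`, `layers.<index>...`, `lm_head...`) and an even three-way split of the transformer blocks; real checkpoints will need their own grouping rule.

```python
from typing import Dict, List, Sequence

StateDict = Dict[str, List[float]]  # stand-in: parameter name -> flat values


def layer_group(param_name: str, n_layers: int) -> str:
    """Map a parameter name to 'early', 'middle', or 'late'.

    Assumes hypothetical names: 'embed...', 'layers.<idx>...', 'lm_head...'.
    """
    if param_name.startswith("embed"):
        return "early"
    if param_name.startswith("lm_head"):
        return "late"
    idx = int(param_name.split(".")[1])  # e.g. 'layers.7.attn.weight' -> 7
    if idx < n_layers // 3:
        return "early"
    if idx < 2 * n_layers // 3:
        return "middle"
    return "late"


def merge_by_group(
    models: Sequence[StateDict],
    group_weights: Dict[str, Sequence[float]],
    n_layers: int,
) -> StateDict:
    """Merge parameter by parameter, using the weight vector of its group."""
    merged: StateDict = {}
    for name, values in models[0].items():
        w = group_weights[layer_group(name, n_layers)]
        merged[name] = [
            sum(wi * m[name][k] for wi, m in zip(w, models))
            for k in range(len(values))
        ]
    return merged


# Toy usage with two single-value "models" (assumed data).
names = ["embed.weight", "layers.0.w", "layers.1.w", "layers.2.w", "lm_head.weight"]
model_a = {n: [2.0] for n in names}
model_b = {n: [4.0] for n in names}
weights = {"early": [1.0, 0.0], "middle": [0.5, 0.5], "late": [0.0, 1.0]}
merged = merge_by_group([model_a, model_b], weights, n_layers=3)
```

With this layout an individual in the evolutionary search is the concatenation of the three group weight vectors, so the search space grows with the number of groups, not with the number of parameters.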

Workflow

Step 1: Prepare Models

Collect checkpoints to merge (base model + fine-tunes, or multiple task-specific models)
Identify layer groups for granular weight optimization

Step 2: Define Evaluation

Select lightweight benchmark (subset of validation data)
Define scoring function (weighted multi-task objective)
Set compute budget (max generations, population size)
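
To make the compute budget concrete, the sketch below counts the worst-case number of merges and forward passes implied by a configuration, and draws a fixed-size benchmark subset. `SearchBudget` and `subsample` are hypothetical helpers, and the count is an upper bound because elites carried over between generations need not be re-scored.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class SearchBudget:
    population_size: int
    max_generations: int
    samples_per_task: int
    n_tasks: int

    @property
    def max_merges(self) -> int:
        """One merge per individual per generation."""
        return self.population_size * self.max_generations

    @property
    def max_forward_passes(self) -> int:
        """Every merged model is scored on every sample of every task."""
        return self.max_merges * self.samples_per_task * self.n_tasks


def subsample(examples: Sequence[T], n: int, seed: int = 0) -> List[T]:
    """Fixed-seed subset so all candidates are scored on the same examples."""
    if len(examples) <= n:
        return list(examples)
    return random.Random(seed).sample(list(examples), n)


budget = SearchBudget(population_size=20, max_generations=30,
                      samples_per_task=200, n_tasks=3)
print(budget.max_merges, budget.max_forward_passes)  # 600 360000
```

Scoring every candidate on the same fixed subset keeps fitness values comparable across generations.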

Step 3: Run Evolutionary Search

Initialize population with diverse strategies (uniform, task-biased, random)
Run evolution loop (merge → evaluate → select → crossover → mutate)
Monitor convergence; stop early if plateau detected
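
A possible way to seed the population with the three strategies named above, plus a simple plateau check, is sketched below. The `bias` value, the helper names, and the plateau rule are assumptions chosen for illustration; the sketch also assumes at least two models.

```python
import random
from typing import List


def initial_population(n_models: int, pop_size: int,
                       bias: float = 0.8, seed: int = 0) -> List[List[float]]:
    """Uniform individual, one biased individual per model, then random fill."""
    rng = random.Random(seed)
    pop = [[1.0 / n_models] * n_models]          # uniform strategy
    rest = (1.0 - bias) / (n_models - 1)
    for i in range(n_models):                    # task-biased strategies
        pop.append([bias if j == i else rest for j in range(n_models)])
    while len(pop) < pop_size:                   # random strategies
        raw = [rng.random() + 1e-9 for _ in range(n_models)]
        total = sum(raw)
        pop.append([x / total for x in raw])
    return pop[:pop_size]


def plateaued(best_per_generation: List[float],
              patience: int = 5, tol: float = 1e-4) -> bool:
    """True if the last `patience` generations beat the earlier best by < tol."""
    if len(best_per_generation) <= patience:
        return False
    recent = max(best_per_generation[-patience:])
    earlier = max(best_per_generation[:-patience])
    return recent - earlier < tol


pop = initial_population(n_models=3, pop_size=10)
print(plateaued([0.1, 0.2, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]))  # True
```

Including the uniform and biased individuals means the search never does worse on the validation subset than those standard baselines, provided elitism is enabled.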

Step 4: Deploy Best Merge

Apply best weight configuration to full models
Validate on held-out test set
Optionally refine with local search around best solution
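
The optional refinement can be as simple as stochastic hill climbing around the best configuration. The sketch below is one such choice, not a prescribed method; `fitness`, `step`, and `iters` are hypothetical and should be sized to the remaining compute budget.

```python
import random
from typing import Callable, List, Tuple


def local_search(
    fitness: Callable[[List[float]], float],
    start: List[float],
    step: float = 0.02,
    iters: int = 200,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Accept a small Gaussian perturbation only if it improves fitness."""
    rng = random.Random(seed)
    best, best_fit = list(start), fitness(start)
    for _ in range(iters):
        # Perturb, keep weights positive, and renormalize to sum to 1.
        cand = [max(x + rng.gauss(0.0, step), 1e-9) for x in best]
        total = sum(cand)
        cand = [x / total for x in cand]
        cand_fit = fitness(cand)
        if cand_fit > best_fit:
            best, best_fit = cand, cand_fit
    return best, best_fit


# Toy usage: the fitness function here is an assumption, not a benchmark.
target = [0.2, 0.3, 0.5]
toy_fitness = lambda w: -sum((a - b) ** 2 for a, b in zip(w, target))
refined, refined_fit = local_search(toy_fitness, [1 / 3, 1 / 3, 1 / 3])
```

Because refinement reuses the same validation subset as the search, it adds to the overfitting risk, so the held-out test check should come after it.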

Pitfalls

Evaluation cost: Each fitness evaluation requires a full model merge + inference pass. Keep population small (20-50) and use tiny benchmarks to stay within compute budget.
Overfitting to benchmark: The evolutionary search can overfit to the small validation set. Always validate the final merge on a held-out test set.
Weight normalization: When merging, enforce the layer-wise constraint Σwᵢ = 1. A softmax parameterization, wᵢ = exp(αᵢ) / Σⱼ exp(αⱼ), enforces it automatically and keeps every weight positive.
Catastrophic interference: Merging models trained on different distributions can cause capability loss. Use task-specific layer groups rather than uniform global weights.
Non-convex landscape: The merge weight landscape is highly non-convex with many local optima. Use sufficient population diversity and mutation rate to escape poor local optima.
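
The softmax parameterization mentioned under weight normalization can be sketched as follows. Evolving the unconstrained logits αᵢ means mutation never has to renormalize. Note that a noise scale in logit space is not equivalent to the same σ applied directly to the weights, so the value used here is only a placeholder.

```python
import math
import random
from typing import List


def softmax(alphas: List[float]) -> List[float]:
    """w_i = exp(a_i) / sum_j exp(a_j), shifted by max(a) for stability."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mutate_logits(alphas: List[float], sigma: float,
                  rng: random.Random) -> List[float]:
    """Gaussian mutation in logit space; no renormalization step is needed."""
    return [a + rng.gauss(0.0, sigma) for a in alphas]


rng = random.Random(0)
alphas = [0.0, 0.0, 0.0]                # uniform weights to start
child = mutate_logits(alphas, 0.3, rng)  # sigma=0.3 is an arbitrary placeholder
weights = softmax(child)                 # positive and sums to 1
```

One design consequence: softmax weights are always strictly positive, so this parameterization cannot express negative merge coefficients.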

Activation Keywords

evolutionary llm merging
evo generative merging
model merge optimization
evolutionary model composition
evogm llm merging
LLM merge weights optimization
模型合并优化 (Chinese for "model merge optimization")