evo-generative-llm-merging


Evolutionary Generative Merging (EvoGM) methodology for LLM model merging. Uses evolutionary algorithms to optimize model weight interpolation by generating candidate merge configurations, evaluating them with lightweight benchmarks, and evolving better solutions. Use when: (1) Merging multiple LLM checkpoints or fine-tuned variants, (2) Optimizing merge weights for multi-task performance, (3) Finding Pareto-optimal tradeoffs between capabilities, (4) Replacing grid search for model merging.

By hiyenwong. Updated 6/3/2026

name: evo-generative-llm-merging
description: "Evolutionary Generative Merging (EvoGM) methodology for LLM model merging. Uses evolutionary algorithms to optimize model weight interpolation by generating candidate merge configurations, evaluating them with lightweight benchmarks, and evolving better solutions. Use when: (1) Merging multiple LLM checkpoints or fine-tuned variants, (2) Optimizing merge weights for multi-task performance, (3) Finding Pareto-optimal tradeoffs between capabilities, (4) Replacing grid search for model merging."
license: Complete terms in LICENSE.txt
metadata:
  arxiv_id: "2605.29295"
  published: "2026-05-29"
  tags: [llm-merging, evolutionary-optimization, model-composition, generative-ai, neural-network-merging]

Evolutionary Generative Merging (EvoGM)

Overview

EvoGM treats LLM merging as an evolutionary optimization problem. Instead of relying on fixed heuristics (e.g., linear interpolation with uniform weights), it uses an evolutionary algorithm to search for high-performing merge configurations across model parameters, layers, or task-specific heads.

Core Methodology

Problem Formulation

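Given n model checkpoints {M₁, M₂, ..., Mₙ}, find merge weights W that maximize a weighted sum of scores over the evaluation tasks:

W* = argmax_W Σₜ λₜ · Scoreₜ(Merged(M₁, ..., Mₙ; W))

Here t indexes the evaluation tasks and λₜ ≥ 0 is the importance assigned to task t. The task weights λₜ are distinct from the merge weights wᵢ that make up W.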
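The objective above can be made concrete with a small sketch. It is a simplified illustration, not the EvoGM implementation: models are toy dictionaries of float lists, there is one global merge weight per checkpoint (no per-layer weights yet), and the names `merge`, `objective`, `score_a`, and `score_b` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Toy stand-in for a model state dict: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def merge(models: Sequence[Params], weights: Sequence[float]) -> Params:
    """Weighted average of every parameter across the checkpoints."""
    assert len(models) == len(weights)
    merged: Params = {}
    for name in models[0]:
        size = len(models[0][name])
        merged[name] = [
            sum(w * m[name][k] for w, m in zip(weights, models))
            for k in range(size)
        ]
    return merged


def objective(
    models: Sequence[Params],
    weights: Sequence[float],
    score_fns: Sequence[Callable[[Params], float]],
    task_weights: Sequence[float],
) -> float:
    """Weighted sum of per-task scores, evaluated on the merged model."""
    merged = merge(models, weights)
    return sum(lam * fn(merged) for lam, fn in zip(task_weights, score_fns))


# Two toy checkpoints with a single parameter vector each.
m1 = {"layer0": [1.0, 0.0]}
m2 = {"layer0": [0.0, 1.0]}


# Hypothetical task scores: each task rewards one coordinate.
def score_a(p: Params) -> float:
    return p["layer0"][0]


def score_b(p: Params) -> float:
    return p["layer0"][1]


value = objective([m1, m2], [0.75, 0.25], [score_a, score_b], [0.5, 0.5])
print(value)
```

In a real setting the score functions would run the merged model on a small benchmark, which is the expensive part of each evaluation.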

Evolutionary Algorithm Steps

  1. Initialization: Generate population of random weight configurations

    • Each individual = vector of merge weights per layer/group
    • Population size: 20-100 (balance exploration vs compute)
  2. Evaluation: Merge models with candidate weights, score on lightweight benchmarks

    • Use small validation sets (100-500 samples per task)
    • Score = weighted combination of accuracy, perplexity, task-specific metrics
  3. Selection: Select top performers (e.g., tournament or rank-based selection)

    • Keep top 20-30% for breeding
    • Elitism: preserve best individuals across generations
  4. Crossover: Combine parent weight vectors

    • Uniform crossover: randomly inherit weights from either parent per layer
    • Blend crossover: interpolate between parent weights
  5. Mutation: Perturb weights slightly (Gaussian noise, σ ≈ 0.05-0.1)

    • Re-normalize after mutation so that Σwᵢ = 1 within each layer group
  6. Convergence: Stop when improvement < threshold for K generations, or max generations reached
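The six steps above can be sketched as a single loop. This is a minimal sketch under stated assumptions: one weight vector for a single layer group, truncation selection (keep the top 30%) with elitism, uniform crossover, and Gaussian mutation followed by re-normalization. The function names, default values, and the toy fitness are illustrative choices, not taken from the EvoGM paper.

```python
import random
from typing import Callable, List, Tuple

Weights = List[float]  # one merge weight per checkpoint


def normalize(w: Weights) -> Weights:
    """Clip to positive values and rescale so the weights sum to 1."""
    clipped = [max(x, 1e-9) for x in w]
    total = sum(clipped)
    return [x / total for x in clipped]


def evolve(
    fitness: Callable[[Weights], float],
    n_models: int,
    pop_size: int = 30,
    max_gens: int = 50,
    keep_frac: float = 0.3,
    n_elite: int = 2,
    sigma: float = 0.05,
    patience: int = 10,
    tol: float = 1e-6,
    seed: int = 0,
) -> Tuple[Weights, float]:
    rng = random.Random(seed)
    # 1. Initialization: random normalized weight vectors.
    pop = [normalize([rng.random() for _ in range(n_models)]) for _ in range(pop_size)]
    best, best_fit, stale = pop[0], float("-inf"), 0
    for _ in range(max_gens):
        # 2. Evaluation: in practice this merges the models and runs a benchmark.
        fits = [fitness(ind) for ind in pop]
        ranked = sorted(range(pop_size), key=lambda i: fits[i], reverse=True)
        # 6. Convergence: stop after `patience` generations without improvement.
        if fits[ranked[0]] > best_fit + tol:
            best, best_fit, stale = pop[ranked[0]], fits[ranked[0]], 0
        else:
            stale += 1
            if stale >= patience:
                break
        # 3. Selection: keep the top fraction as parents; elites survive unchanged.
        parents = [pop[i] for i in ranked[: max(2, int(keep_frac * pop_size))]]
        next_pop = [pop[i] for i in ranked[:n_elite]]
        while len(next_pop) < pop_size:
            a, b = rng.sample(parents, 2)
            # 4. Uniform crossover: inherit each weight from either parent.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # 5. Gaussian mutation, then re-normalize so the weights sum to 1.
            child = [x + rng.gauss(0.0, sigma) for x in child]
            next_pop.append(normalize(child))
        pop = next_pop
    return best, best_fit


# Toy fitness (an assumption): closeness to a known "ideal" weight vector.
target = [0.7, 0.2, 0.1]


def toy_fitness(w: Weights) -> float:
    return -sum((x - t) ** 2 for x, t in zip(w, target))


best, best_fit = evolve(toy_fitness, n_models=3)
print(best, best_fit)
```

Because fitness evaluation dominates the cost, a real implementation would likely cache scores for elites instead of re-evaluating them every generation.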

Layer-Granular Merging

Key insight: Different layers benefit from different merge strategies:

  • Early layers (embeddings, shallow attention): Often benefit from task-specific weights
  • Middle layers (deep attention): Hold mostly shared representations, so uniform weights often work
  • Late layers (LM head, output): Most affected by task-specific fine-tuning, so they benefit most from optimized weights
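One way to picture layer-granular merging is to map each parameter to a group and give every group its own weight vector. The sketch below assumes a hypothetical naming scheme (`embed...`, `layers.<index>...`, `lm_head...`) and an even three-way split of the transformer blocks; real checkpoints use different names, and the split is an illustrative choice.

```python
from typing import Dict, List, Sequence

Params = Dict[str, List[float]]  # toy state dict: name -> flat list of floats


def layer_group(name: str, n_layers: int) -> str:
    """Map a parameter name to 'early', 'middle', or 'late' (hypothetical scheme)."""
    if name.startswith("embed"):
        return "early"
    if name.startswith("lm_head"):
        return "late"
    # Assumes names of the form "layers.<index>.<rest>".
    idx = int(name.split(".")[1])
    if idx < n_layers // 3:
        return "early"
    if idx < 2 * n_layers // 3:
        return "middle"
    return "late"


def merge_by_group(
    models: Sequence[Params],
    group_weights: Dict[str, List[float]],
    n_layers: int,
) -> Params:
    """Weighted average where the weights depend on the parameter's layer group."""
    merged: Params = {}
    for name in models[0]:
        w = group_weights[layer_group(name, n_layers)]
        merged[name] = [
            sum(wi * m[name][k] for wi, m in zip(w, models))
            for k in range(len(models[0][name]))
        ]
    return merged


names = ["embed.weight", "layers.0.attn", "layers.3.attn", "layers.5.mlp", "lm_head.weight"]
model_a = {n: [1.0] for n in names}
model_b = {n: [0.0] for n in names}

group_weights = {
    "early": [1.0, 0.0],    # take early layers from model_a only
    "middle": [0.5, 0.5],   # uniform average
    "late": [0.25, 0.75],   # lean toward model_b
}
merged = merge_by_group([model_a, model_b], group_weights, n_layers=6)
print(merged)
```

With this layout, an individual in the evolutionary search is simply the concatenation of the per-group weight vectors.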

Workflow

Step 1: Prepare Models

  • Collect checkpoints to merge (base model + fine-tunes, or multiple task-specific models)
  • Identify layer groups for granular weight optimization

Step 2: Define Evaluation

  • Select lightweight benchmark (subset of validation data)
  • Define scoring function (weighted multi-task objective)
  • Set compute budget (max generations, population size)
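A scoring function and compute budget for this step might look like the following sketch. The way perplexity is squashed into (0, 1], the 0.7/0.3 split between accuracy and perplexity, and the names `TaskResult`, `task_score`, and `multi_task_score` are all assumptions made for illustration; the source only says the score is a weighted combination of such metrics.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class TaskResult:
    accuracy: float    # fraction in [0, 1], higher is better
    perplexity: float  # >= 1, lower is better


def task_score(r: TaskResult, acc_weight: float = 0.7) -> float:
    """Combine accuracy and perplexity into one score where higher is better."""
    # Map perplexity to (0, 1]: perplexity 1 gives 1.0, larger values give less.
    ppl_term = 1.0 / (1.0 + math.log(r.perplexity))
    return acc_weight * r.accuracy + (1.0 - acc_weight) * ppl_term


def multi_task_score(
    results: Dict[str, TaskResult], task_weights: Dict[str, float]
) -> float:
    """Weighted mean of the per-task scores."""
    total = sum(task_weights[t] for t in results)
    return sum(task_weights[t] * task_score(results[t]) for t in results) / total


# Hypothetical benchmark results for one candidate merge.
results = {
    "math": TaskResult(accuracy=0.62, perplexity=9.5),
    "code": TaskResult(accuracy=0.48, perplexity=14.0),
}
task_weights = {"math": 2.0, "code": 1.0}
score = multi_task_score(results, task_weights)

# Compute budget: upper bound on the number of merge-and-evaluate calls.
pop_size, max_generations = 30, 40
max_evaluations = pop_size * max_generations
print(score, max_evaluations)
```

Multiplying `max_evaluations` by the cost of one merge plus one benchmark pass gives a rough upper bound on total search time.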

Step 3: Run Evolutionary Search

  • Initialize population with diverse strategies (uniform, task-biased, random)
  • Run evolution loop (merge → evaluate → select → crossover → mutate)
  • Monitor convergence; stop early if plateau detected
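Seeding the population with a mix of strategies, as the first bullet suggests, can be sketched as follows. The sketch assumes a single weight vector with one entry per checkpoint and at least two checkpoints; the `bias` value of 0.8 and the function name `init_population` are illustrative choices.

```python
import random
from typing import List


def init_population(
    n_models: int, pop_size: int, bias: float = 0.8, seed: int = 0
) -> List[List[float]]:
    """Build a population from uniform, model-biased, and random weight vectors."""
    assert n_models >= 2
    rng = random.Random(seed)
    # Uniform strategy: plain averaging of all checkpoints.
    pop = [[1.0 / n_models] * n_models]
    # Biased strategy: one individual per checkpoint that favors that checkpoint.
    rest = (1.0 - bias) / (n_models - 1)
    for j in range(n_models):
        pop.append([bias if i == j else rest for i in range(n_models)])
    # Random strategy: fill the remaining slots with normalized random vectors.
    while len(pop) < pop_size:
        raw = [rng.random() + 1e-9 for _ in range(n_models)]
        total = sum(raw)
        pop.append([x / total for x in raw])
    return pop[:pop_size]


population = init_population(n_models=3, pop_size=10)
for individual in population[:4]:
    print(individual)
```

Starting from known baselines such as uniform averaging means the search should not end up worse than those baselines on the validation benchmark, provided elitism is used.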

Step 4: Deploy Best Merge

  • Apply best weight configuration to full models
  • Validate on held-out test set
  • Optionally refine with local search around best solution
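The optional refinement in the last bullet could be a simple stochastic hill climb around the best configuration. This is a sketch of one possible local search, not a method prescribed by the source; the step size, iteration count, and toy fitness are assumptions.

```python
import random
from typing import Callable, List, Tuple


def local_search(
    fitness: Callable[[List[float]], float],
    start: List[float],
    step: float = 0.02,
    iters: int = 200,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Accept small random perturbations only when they improve the fitness."""
    rng = random.Random(seed)
    best = list(start)
    best_fit = fitness(best)
    for _ in range(iters):
        # Perturb, clip to positive values, and re-normalize to sum to 1.
        cand = [max(x + rng.gauss(0.0, step), 1e-9) for x in best]
        total = sum(cand)
        cand = [x / total for x in cand]
        cand_fit = fitness(cand)
        if cand_fit > best_fit:
            best, best_fit = cand, cand_fit
    return best, best_fit


# Toy fitness (an assumption): closeness to a hypothetical ideal weight vector.
ideal = [0.8, 0.2]


def toy_fitness(w: List[float]) -> float:
    return -sum((x - t) ** 2 for x, t in zip(w, ideal))


start = [0.5, 0.5]
refined, refined_fit = local_search(toy_fitness, start)
print(refined, refined_fit)
```

Since every iteration costs one full evaluation, `iters` should be counted against the same compute budget as the evolutionary search, and the refined result still needs the held-out validation described above.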

Pitfalls

  • Evaluation cost: Each fitness evaluation requires a full model merge plus an inference pass. Keep the population toward the small end of the range (20-50) and use tiny benchmarks to stay within the compute budget.
  • Overfitting to benchmark: The evolutionary search can overfit to the small validation set. Always validate the final merge on a held-out test set.
  • Weight normalization: When merging, ensure layer-wise weight constraints (Σwᵢ = 1). Use softmax parameterization: wᵢ = exp(αᵢ) / Σⱼ exp(αⱼ) to enforce automatically.
  • Catastrophic interference: Merging models trained on different distributions can cause capability loss. Use task-specific layer groups rather than uniform global weights.
  • Non-convex landscape: The merge weight landscape is highly non-convex with many local optima. Maintain enough population diversity and a high enough mutation rate to escape poor local optima.
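The softmax parameterization mentioned under weight normalization can be sketched as below. The search operates on unconstrained logits αᵢ, and the weights are derived from them, so mutation never breaks the sum-to-one constraint. Subtracting the maximum before exponentiating is a standard numerical-stability step; the helper `mutate_logits` and its default σ are illustrative assumptions.

```python
import math
import random
from typing import List


def softmax(alphas: List[float]) -> List[float]:
    """w_i = exp(a_i) / sum_j exp(a_j), computed in a numerically stable way."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mutate_logits(alphas: List[float], sigma: float, rng: random.Random) -> List[float]:
    """Gaussian mutation in logit space; no clipping or re-normalization needed."""
    return [a + rng.gauss(0.0, sigma) for a in alphas]


rng = random.Random(0)
alphas = [0.0, 0.0, 0.0]            # equal logits give uniform weights
weights = softmax(alphas)
mutated = softmax(mutate_logits(alphas, sigma=0.3, rng=rng))
print(weights)
print(mutated, sum(mutated))
```

One consequence of this design is that weights are always strictly positive, so a checkpoint can be down-weighted heavily but never excluded exactly.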

Activation Keywords

  • evolutionary llm merging
  • evo generative merging
  • model merge optimization
  • evolutionary model composition
  • evogm llm merging
  • LLM merge weights optimization
  • 模型合并优化
Install via CLI
npx skills add https://github.com/hiyenwong/ai_collection --skill evo-generative-llm-merging