gffmerge-model-merging-gnns - SKILL.md Agent Skill

name: gffmerge-model-merging-gnns
description: "GFFMERGE methodology for efficient closed-form model merging in Graph Neural Networks. Exploits linear structure of message-passing layers to enable near-quantum accuracy atomistic simulations without retraining foundation models. Applicable to drug discovery, materials science, and general GNN transfer. Activation: GNN model merging, graph neural network transfer, neural force field merging, molecular simulation, atomistic GNN, quantum accuracy GNN, convex embedding alignment"
category: "medicine"
arxiv_id: "2606.03232"

GFFMERGE: Closed-Form Model Merging for Graph Neural Networks

Core Problem

Adapting Graph Neural Network (GNN) foundation models to new chemical systems requires expensive retraining. GFFMERGE solves this via closed-form model merging — no gradient descent needed.

Key Innovation

Message-passing layers in GNNs have linear structure in their aggregation mechanism. This allows merging two trained GNNs as a convex embedding-alignment problem with an analytical solution — a closed-form formula, not an optimization loop.

Technical Framework

1. Linear Structure Exploitation

Message-passing GNNs: m_ij = φ(h_i, h_j, e_ij), h_i' = ψ(h_i, Σ_j m_ij)
The aggregation Σ_j m_ij is a sum, so it is linear in the messages m_ij (φ and ψ themselves are generally nonlinear in their parameters)
Two models trained on different domains can be merged by aligning their embedding spaces
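
The linearity this section relies on can be checked numerically for plain sum aggregation. The sketch below is illustrative only: the `aggregate` helper, the toy graph, and the random messages are assumptions made for the example, not code from the paper. It checks two properties: aggregating a weighted combination of messages equals the same combination of the aggregates, and a linear map P on the embedding space commutes with aggregation.

```python
import numpy as np

def aggregate(messages, receivers, num_nodes):
    """Sum incoming messages per receiving node: out[i] = sum_j m_ij."""
    out = np.zeros((num_nodes, messages.shape[1]))
    np.add.at(out, receivers, messages)  # unbuffered scatter-add over edges
    return out

rng = np.random.default_rng(0)
num_nodes, num_edges, dim = 4, 6, 3
receivers = rng.integers(0, num_nodes, size=num_edges)  # receiving node of each edge
m_a = rng.normal(size=(num_edges, dim))  # messages from a hypothetical model A
m_b = rng.normal(size=(num_edges, dim))  # messages from a hypothetical model B

# Linearity: aggregate(0.3*m_a + 0.7*m_b) == 0.3*aggregate(m_a) + 0.7*aggregate(m_b)
lhs = aggregate(0.3 * m_a + 0.7 * m_b, receivers, num_nodes)
rhs = (0.3 * aggregate(m_a, receivers, num_nodes)
       + 0.7 * aggregate(m_b, receivers, num_nodes))
linear_ok = np.allclose(lhs, rhs)

# A linear map on the embedding space commutes with sum aggregation.
P, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # random orthogonal matrix
commutes_ok = np.allclose(
    aggregate(m_a @ P, receivers, num_nodes),
    aggregate(m_a, receivers, num_nodes) @ P,
)
print(linear_ok, commutes_ok)
```

The second property is what makes aligning embedding spaces compatible with message passing: transforming the embedding space before or after the sum gives the same result.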

2. Convex Embedding Alignment

Formulate merging as: minimize ||W_A · P - W_B||_F² over the alignment matrix P
P is constrained to be orthogonal (P^T · P = I), which makes this the orthogonal Procrustes problem
Analytical solution: P = UV^T, where W_A^T · W_B = UΣV^T is the singular value decomposition
No fine-tuning required for base merge
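
A minimal sketch of the closed-form alignment step, assuming P is restricted to orthogonal matrices (the classical orthogonal Procrustes setting) and that the columns of W_A and W_B index the shared embedding dimension. The function name `procrustes_alignment` and the synthetic weights are hypothetical and chosen for illustration.

```python
import numpy as np

def procrustes_alignment(w_a, w_b):
    """Return the orthogonal P minimizing ||w_a @ P - w_b||_F.

    w_a, w_b: arrays of shape (n, d); P has shape (d, d).
    """
    u, _, vt = np.linalg.svd(w_a.T @ w_b)  # w_a^T w_b = U Sigma V^T
    return u @ vt                          # P = U V^T

# Synthetic check: w_b is w_a expressed in a rotated embedding basis.
rng = np.random.default_rng(0)
w_a = rng.normal(size=(16, 8))
q_true, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
w_b = w_a @ q_true

p = procrustes_alignment(w_a, w_b)
is_orthogonal = np.allclose(p.T @ p, np.eye(8))
recovers_rotation = np.allclose(p, q_true)
residual = np.linalg.norm(w_a @ p - w_b)
print(is_orthogonal, recovers_rotation)
```

When the two weight matrices differ only by an orthogonal change of basis, the alignment recovers that change of basis and the residual is zero up to floating-point error. For real models the residual will generally be nonzero.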

3. GNNMERGE — Generic Counterpart

Same principle applies to non-force-field GNNs
Enables modular composition of specialized models

Performance Results

Benchmark	Domain	Speedup vs Joint Training	Performance Recovery
MD17	Molecular dynamics	5-27x	≈ gold standard
MD22	Large molecules	5-27x	≈ gold standard
LiPS20	Solid-state	5-27x	≈ gold standard
Generic graphs	General GNN	5-27x	≈ gold standard

Reusable Patterns

Pattern 1: Model Merging for Domain Transfer

Trained GNN_A (domain A) + Trained GNN_B (domain B)
  → Extract linear layer weights {W_A, W_B}
  → Compute SVD of W_A^T · W_B = UΣV^T
  → Alignment matrix P = UV^T
  → Merged model: W_merged = α·W_A·P + (1-α)·W_B
  → Optional: lightweight fine-tuning on small dataset
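
The pipeline above can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' implementation: `merge_weights`, `merge_state`, the layer names, and the choice to align each layer independently are all assumptions made for the example.

```python
import numpy as np

def merge_weights(w_a, w_b, alpha=0.5):
    """Align w_a to w_b with an orthogonal P, then interpolate.

    Implements W_merged = alpha * W_A @ P + (1 - alpha) * W_B.
    """
    if w_a.shape != w_b.shape:
        raise ValueError(f"shape mismatch: {w_a.shape} vs {w_b.shape}")
    u, _, vt = np.linalg.svd(w_a.T @ w_b)  # W_A^T W_B = U Sigma V^T
    p = u @ vt                             # alignment matrix P = U V^T
    return alpha * (w_a @ p) + (1.0 - alpha) * w_b

def merge_state(state_a, state_b, alpha=0.5):
    """Merge two dicts of linear-layer weights, layer by layer (assumption)."""
    return {name: merge_weights(state_a[name], state_b[name], alpha)
            for name in state_a}

# Toy example: model B is model A in a rotated embedding basis.
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
state_a = {"mp_layer_0": rng.normal(size=(32, 8)),
           "mp_layer_1": rng.normal(size=(32, 8))}
state_b = {name: w @ q for name, w in state_a.items()}

merged = merge_state(state_a, state_b, alpha=0.3)
matches_b = all(np.allclose(merged[k], state_b[k]) for k in merged)
print(matches_b)
```

In this toy setting the two models are identical up to rotation, so the merged weights coincide with model B for any α. With genuinely different models, α controls how much weight each domain receives.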

Pattern 2: Why Vision/Language Merging Fails on GNNs

Vision models: patch-wise independent → simple weight averaging works
Language models: token embeddings are globally aligned
GNNs: node representations are relational — merging must preserve message-passing structure
Solution: align the embedding spaces before merging weights
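
To make the alignment point concrete, the sketch below compares naive weight averaging with align-then-average on a toy case where two weight matrices encode the same function but order their hidden units differently. It illustrates the general misalignment problem under that assumption only; it does not reproduce the paper's experiments, and all names in it are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
w_a = rng.normal(size=(32, 8))
w_b = w_a[:, ::-1]  # same weights, hidden units in reversed order

# Naive averaging mixes unrelated hidden units.
naive = 0.5 * (w_a + w_b)
naive_err = np.linalg.norm(naive - w_b)

# Align first (orthogonal Procrustes), then average.
u, _, vt = np.linalg.svd(w_a.T @ w_b)
p = u @ vt
aligned = 0.5 * (w_a @ p) + 0.5 * w_b
aligned_err = np.linalg.norm(aligned - w_b)

print(f"naive error:   {naive_err:.3f}")
print(f"aligned error: {aligned_err:.3e}")
```

Here the aligned average reproduces `w_b` up to floating-point error, while the naive average does not, even though both inputs represent the same layer.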

Pattern 3: Initialization Superiority

GFFMERGE closed-form solution alone outperforms all baseline merging methods
Provides superior initialization for faster fine-tuning convergence
Reduces data requirements for domain adaptation

Application to Drug Discovery & Medicine

Molecular Force Fields: Merge models trained on different molecule classes without retraining
Protein-Ligand Interaction: Combine protein-specific and ligand-specific GNNs
Materials Discovery: Transfer learned representations across crystal systems
Near-Quantum Accuracy: Approach DFT-level accuracy at GNN inference speed

Pitfalls

Vision/language merging methods do not transfer to GNNs: Existing model merging methods (TIES-Merging, Task Arithmetic) fail catastrophically on force field regression tasks
Non-linear readout layers: The linear structure assumption applies to message-passing layers, not necessarily to final readout heads
Embedding dimension mismatch: Both models must have the same hidden dimension for direct alignment
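
The last two pitfalls can be guarded against with a simple pre-merge check. The helper below is a hypothetical sketch: the name `mergeable_layers`, the `"readout"` prefix convention, and the dict-of-arrays layout are assumptions for illustration and do not come from the paper or any library.

```python
import numpy as np

def mergeable_layers(state_a, state_b, skip_prefixes=("readout",)):
    """List layer names that are safe to align directly.

    Skips layers whose name starts with a prefix in skip_prefixes
    (e.g. non-linear readout heads) and raises if shapes differ.
    """
    names = []
    for name, w_a in state_a.items():
        if name.startswith(skip_prefixes):
            continue  # linear-structure assumption may not hold here
        if name not in state_b:
            raise KeyError(f"layer {name!r} missing from second model")
        w_b = state_b[name]
        if w_a.shape != w_b.shape:
            raise ValueError(
                f"layer {name!r}: hidden dimension mismatch "
                f"{w_a.shape} vs {w_b.shape}"
            )
        names.append(name)
    return names

state_a = {"mp.0": np.zeros((8, 4)), "readout.0": np.zeros((4, 1))}
state_b = {"mp.0": np.ones((8, 4)), "readout.0": np.ones((4, 1))}
selected = mergeable_layers(state_a, state_b)
print(selected)
```

Layers filtered out this way would need separate handling, for example keeping one model's readout head or fine-tuning it after the merge.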

References

arXiv: 2606.03232
Authors: Parth Verma, Parv P. Singh, Vipul Garg, Ishita Thakre, N. M. Anoop Krishnan, Sayan Ranu
Categories: cs.LG, cs.AI