state-space-ntk-collapse-bifurcations - SKILL.md Agent Skill

name: state-space-ntk-collapse-bifurcations
description: >
  Local theory of gradient descent near bifurcations via state-space neural tangent kernel (sNTK). Bifurcations collapse sNTK to rank-one operators corresponding to classical normal forms, funneling gradient descent into critical dynamical directions. Use when: analyzing RNN learning dynamics near bifurcations, studying NTK collapse, understanding gradient-based training of recurrent systems, bifurcation analysis in deep learning, neural tangent kernel for temporal tasks, normal form learning theory.

State-Space NTK Collapse Near Bifurcations

Paper Reference

Title: State-Space NTK Collapse Near Bifurcations
Authors: James Hazelden, Eric Shea-Brown
arXiv: 2605.12763 (May 2026)
Categories: cs.LG, math.DS, math.OC, q-bio.NC

Core Methodology

Empirical State-Space NTK (sNTK)

The sNTK describes gradient descent dynamics in function space for temporal models:

$$K_{sNTK}(t, t') = \left(\frac{\partial h(t)}{\partial \theta}\right)^\top \frac{\partial h(t')}{\partial \theta}$$

where $h(t)$ is the hidden state at time $t$ and $\theta$ are model parameters.
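
A minimal sketch of how the empirical sNTK could be assembled for a toy model. Everything here is an assumption for illustration and not taken from the paper: the recurrence `h(t+1) = tanh(W h(t))`, the helper names `rollout`, `state_jacobians`, and `empirical_sntk`, and the use of central finite differences in place of automatic differentiation. The Jacobian is stored with shape `(n, p)` per time step, so each kernel block is `J_t @ J_s.T`, an `n x n` matrix over state coordinates.

```python
import numpy as np

def rollout(theta, h0, T):
    """Hidden states h(1..T) of the toy recurrence h(t+1) = tanh(W h(t)), theta = W.ravel()."""
    n = h0.size
    W = theta.reshape(n, n)
    h, states = h0, []
    for _ in range(T):
        h = np.tanh(W @ h)
        states.append(h)
    return np.stack(states)                      # shape (T, n)

def state_jacobians(theta, h0, T, eps=1e-6):
    """Central finite-difference Jacobians dh(t)/dtheta, shape (T, n, p)."""
    p = theta.size
    J = np.empty((T, h0.size, p))
    for k in range(p):
        d = np.zeros(p)
        d[k] = eps
        J[:, :, k] = (rollout(theta + d, h0, T) - rollout(theta - d, h0, T)) / (2 * eps)
    return J

def empirical_sntk(J):
    """Kernel blocks K[t, s] = J_t @ J_s.T, returned with shape (T, T, n, n)."""
    return np.einsum("tip,sjp->tsij", J, J)

rng = np.random.default_rng(0)
n, T = 3, 5                                      # hypothetical sizes
theta = 0.5 * rng.standard_normal(n * n)
h0 = rng.standard_normal(n)
K = empirical_sntk(state_jacobians(theta, h0, T))
print(K.shape)  # (5, 5, 3, 3)
```

For anything beyond a toy model, the finite-difference loop would normally be replaced by an automatic-differentiation Jacobian; the kernel contraction stays the same.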

Key Finding: Bifurcation Dominance

Near bifurcations, the sNTK reduces to a rank-one operator:

$$K_{sNTK} \approx \lambda \, v v^\top$$

where $\lambda$ is a scalar amplitude and $v$ corresponds to the critical eigendirection of the normal form.

This means learning near bifurcations is dominated by a single parameter direction, making the learning geometry predictable from classical bifurcation theory.
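
One way to check how close a given kernel matrix is to rank one is to compare it with its leading eigenpair. The sketch below assumes the sNTK has already been flattened into a symmetric positive semidefinite matrix; the function name `rank_one_fit` and the synthetic test kernel are illustrative choices, not part of the paper.

```python
import numpy as np

def rank_one_fit(K):
    """Top eigenpair of a symmetric PSD matrix and the relative Frobenius error of lam * v v^T."""
    evals, evecs = np.linalg.eigh(K)             # eigenvalues in ascending order
    lam, v = evals[-1], evecs[:, -1]
    err = np.linalg.norm(K - lam * np.outer(v, v)) / np.linalg.norm(K)
    return lam, v, err

# Synthetic kernel: one amplified direction plus a small isotropic remainder.
v_true = np.array([3.0, 4.0, 0.0]) / 5.0
K = 100.0 * np.outer(v_true, v_true) + 0.01 * np.eye(3)
lam, v, err = rank_one_fit(K)
print(f"lambda = {lam:.2f}, relative rank-one error = {err:.1e}")
```

A small relative error indicates that the rank-one description is a good summary of the kernel; the sign of the recovered `v` is arbitrary.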

Bifurcation Channel Decomposition

Procedure:

Compute sNTK at the current parameter configuration
Decompose into bifurcation-relevant channel and residual channel
Near codimension-1 bifurcations, the relevant channel is rank-one and highly amplified
This amplification causes the bifurcation channel to dominate the full sNTK
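
The procedure above can be sketched as follows, under the assumption that the bifurcation channel is obtained by projecting the parameter Jacobian onto a single critical parameter direction `u`. With `K = J J^T`, this gives an exact additive split `K = K_bif + K_res` in which `K_bif` has rank one. The paper's actual construction may differ; `split_channels`, `u`, and the artificially amplified Jacobian column are hypothetical.

```python
import numpy as np

def split_channels(J, u):
    """Split K = J J^T into a rank-one channel along parameter direction u and a residual."""
    u = u / np.linalg.norm(u)
    a = J @ u                                    # state-space response to a step along u
    K_bif = np.outer(a, a)                       # equals J u u^T J^T, rank one
    K_res = J @ J.T - K_bif                      # equals J (I - u u^T) J^T
    return K_bif, K_res

m, p = 6, 4                                      # stacked state dimension, parameter count
J = np.cos(np.arange(m * p, dtype=float)).reshape(m, p)   # arbitrary deterministic Jacobian
J[:, 0] *= 50.0                                  # mimic amplification of the critical channel
u = np.zeros(p)
u[0] = 1.0                                       # hypothetical critical parameter direction

K_bif, K_res = split_channels(J, u)
share = np.trace(K_bif) / np.trace(J @ J.T)
print(f"share of kernel trace carried by the bifurcation channel: {share:.4f}")
```

When one Jacobian column is strongly amplified, the rank-one channel carries almost all of the kernel trace, which is the sense in which it dominates the full sNTK in this toy setting.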

Normal Form Correspondence

Bifurcation Type	Normal Form	Learning Geometry
Saddle-node	$\dot{x} = \mu + x^2$	Single dominant direction
Pitchfork	$\dot{x} = \mu x - x^3$	Symmetric bifurcation in parameter space
Hopf	$\dot{z} = (\mu + i\omega)z - |z|^2 z$
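
A short numerical illustration of why these normal forms single out an amplified direction. For the pitchfork and saddle-node forms in the table, the stable equilibrium scales like $\sqrt{|\mu|}$, so its sensitivity to $\mu$ grows like $1/(2\sqrt{|\mu|})$ as the bifurcation is approached. The function names are illustrative, and reading this sensitivity as a proxy for sNTK amplification is an interpretation, not a statement from the paper.

```python
import math

def pitchfork_equilibrium(mu):
    """Stable positive equilibrium of dx/dt = mu*x - x**3, valid for mu > 0."""
    return math.sqrt(mu)

def saddle_node_equilibrium(mu):
    """Stable equilibrium of dx/dt = mu + x**2, valid for mu < 0."""
    return -math.sqrt(-mu)

def sensitivity(equilibrium, mu, eps=1e-8):
    """Central finite-difference estimate of d x*/d mu."""
    return (equilibrium(mu + eps) - equilibrium(mu - eps)) / (2 * eps)

for dist in (1e-1, 1e-2, 1e-3):
    s_pf = sensitivity(pitchfork_equilibrium, dist)
    s_sn = sensitivity(saddle_node_equilibrium, -dist)
    print(f"|mu| = {dist:g}: pitchfork {s_pf:.2f}, saddle-node {s_sn:.2f}")
```

The printed sensitivities increase as `|mu|` shrinks, so a parameter change that moves `mu` has a disproportionately large effect on the state near the bifurcation.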

Learning Instability Resolution

Low-rank natural gradient methods resolve learning instability near bifurcations:

Standard SGD becomes unstable as the sNTK effective rank collapses
Natural gradient restricted to the dominant bifurcation direction stabilizes training
The restriction adds very little overhead compared to SGD
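
A minimal sketch of the idea on a toy quadratic loss, not the paper's algorithm. It assumes the dominant direction `v` (unit norm) and the curvature along it are known, takes a curvature-normalized step along `v`, and uses a plain gradient step in the orthogonal complement. The matrix `H` stands in for the Gauss-Newton or Fisher matrix, and `low_rank_natural_step` and `damping` are hypothetical names.

```python
import numpy as np

def low_rank_natural_step(grad, v, curvature, lr, damping=1e-3):
    """Plain gradient step off v, curvature-normalized step along v (v must be unit-norm)."""
    g_v = (v @ grad) * v
    return -lr * (grad - g_v) - g_v / (curvature + damping)

v = np.array([1.0, 0.0, 0.0])
H = 1e4 * np.outer(v, v) + np.eye(3)             # one stiff direction, as after a rank collapse

def loss(theta):
    return 0.5 * theta @ H @ theta

lr = 0.1
theta_gd = np.ones(3)
theta_ng = np.ones(3)
for _ in range(20):
    theta_gd = theta_gd - lr * (H @ theta_gd)    # plain gradient descent
    g = H @ theta_ng
    theta_ng = theta_ng + low_rank_natural_step(g, v, v @ H @ v, lr)

print(f"plain GD loss: {loss(theta_gd):.3e}")
print(f"low-rank preconditioned loss: {loss(theta_ng):.3e}")
```

With the same learning rate, plain gradient descent diverges along the stiff direction while the preconditioned update converges. The extra work per step is one inner product and one scaled vector addition, consistent with the low overhead noted above.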

Student-Teacher RNN Illustration

In the student-teacher RNN setup:

The first learned bifurcation coincides with a sharp collapse of the sNTK effective rank
A dominant parameter direction emerges
The restricted sNTK closely matches the pitchfork normal form landscape

Practical Applications

RNN Training Diagnostics

Monitor sNTK effective rank during training
Sharp drops indicate the network is learning bifurcations
Use this signal to adapt learning rates or switch to natural gradient
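
A sketch of such a monitor. The source does not say which definition of effective rank is used, so this assumes the participation ratio $(\mathrm{tr}\,K)^2 / \mathrm{tr}(K^2)$. The 50% drop threshold and the names `effective_rank` and `collapse_detected` are likewise illustrative.

```python
import numpy as np

def effective_rank(K):
    """Participation ratio (tr K)^2 / tr(K^2) of a symmetric PSD matrix."""
    evals = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    return evals.sum() ** 2 / (evals ** 2).sum()

def collapse_detected(history, drop=0.5):
    """True if the latest effective rank fell below `drop` times the previous value."""
    return bool(len(history) >= 2 and history[-1] < drop * history[-2])

v = np.array([1.0, 0.0, 0.0, 0.0])
history = []
for gain in (0.0, 0.5, 100.0):                   # hypothetical amplification of one direction
    K = np.eye(4) + gain * np.outer(v, v)
    history.append(effective_rank(K))
    print(f"gain={gain:g}  effective rank={history[-1]:.2f}  "
          f"collapse={collapse_detected(history)}")
```

In a training loop, `collapse_detected` returning `True` would be the point at which to lower the learning rate or switch optimizers; the threshold would need tuning per task.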

Architecture Design

Bifurcations are necessary for rich temporal feature learning
Design architectures that facilitate controlled bifurcation passage
Use normal form theory to predict learning behavior

Optimization Strategy

When sNTK rank collapses, switch to low-rank natural gradient
Avoid standard SGD instability near bifurcation boundaries
Exploit rank-one structure for efficient second-order updates
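
As a worked illustration of why the rank-one structure makes second-order updates cheap, assume a damped curvature matrix of the form $\delta I + \lambda v v^\top$ with $\|v\| = 1$ and damping $\delta > 0$. This form is an assumption for illustration, not the paper's stated update. The Sherman-Morrison identity then gives the preconditioned gradient in closed form:

$$(\delta I + \lambda v v^\top)^{-1} g = \frac{1}{\delta}\left(g - \frac{\lambda \,(v^\top g)}{\delta + \lambda}\, v\right)$$

Evaluating the right-hand side needs one inner product and one scaled vector addition, so the cost is linear in the number of parameters and no matrix is ever formed or inverted.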

Activation Keywords

state-space NTK, sNTK collapse, bifurcation learning dynamics
RNN training bifurcation, neural tangent kernel recurrent
normal form learning theory, gradient descent near bifurcation
pitchfork bifurcation neural network, Hopf bifurcation learning
low-rank natural gradient RNN, bifurcation channel decomposition
temporal feature learning, recurrent network dynamics analysis