decoding-encoding-alignment-critique - SKILL.md Agent Skill

name: decoding-encoding-alignment-critique
description: >
  Critical analysis framework for brain-model alignment methodology. Demonstrates that representational similarity analysis (RSA) and decoding-based alignment metrics are fundamentally insensitive to encoding manifold topology. Similar decoding behavior and high representational alignment can arise from small, non-representative neuron subpopulations. Use when: evaluating brain-DNN alignment, RSA/DSA methodology, encoding vs decoding analysis, neural representation comparison, brain-model similarity metrics, neuroscience interpretability. Activation: decoding alignment, encoding alignment, RSA critique, representational similarity analysis, brain model alignment critique, encoding manifold, decoding manifold, neural representation comparison, alignment methodology.

Decoding-Encoding Alignment Critique

Fundamental critique of similarity analysis methods in neuroscience. Shows that decoding-based alignment metrics (RSA, DSA) can be misleading because small, non-representative neuron subpopulations can drive them.

Core Argument

Popular methods (RSA, DSA, perceptual manifolds) assume that similarity in decoding representations implies similar computation. This is not necessarily true:

Subpopulation dominance: High alignment can be driven by a tiny subset of neurons
Encoding insensitivity: Alignment metrics are blind to encoding manifold topology
Causal evidence: Decoding metrics remain unchanged when encoding topology is manipulated

Key Findings

1. Subpopulation Effect

Similar decoding behavior and high representational alignment can arise from small, non-representative subpopulations of neurons
The representational geometry of a population may be shaped by very few neurons
Alignment to a few neurons ≠ alignment to the whole population

2. Encoding vs Decoding Manifolds

Decoding manifold: How well stimuli can be read out (what most metrics measure)
Encoding manifold: How neurons are globally organized in their responses (what alignment metrics should also consider)
The complementary encoding paradigm characterizes global neuron organization and reveals differentiation that decoding metrics miss
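
The gap between the two views can be illustrated with a small synthetic sketch. It assumes a responses matrix with one row per stimulus and one column per neuron, uses a Euclidean stimulus-by-stimulus RDM as the decoding view, and uses neuron-by-neuron tuning correlation as a simplified stand-in for the encoding view (it is not the encoding-manifold construction from the paper). All function names are hypothetical. Mixing the neurons with an orthogonal matrix leaves every stimulus distance intact while reorganizing how individual neurons respond.

```python
import numpy as np

def stimulus_rdm(responses):
    """Decoding view: pairwise Euclidean distances between stimulus rows."""
    diff = responses[:, None, :] - responses[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def neuron_similarity(responses):
    """Encoding view (simplified): correlation between neuron tuning curves."""
    return np.corrcoef(responses.T)

rng = np.random.default_rng(0)
n_stimuli, n_neurons = 20, 8

# Population A: independent neurons, each with its own gain.
pop_a = rng.normal(size=(n_stimuli, n_neurons)) * np.linspace(0.5, 3.0, n_neurons)

# Population B: the same responses mixed by a random orthogonal matrix.
q, _ = np.linalg.qr(rng.normal(size=(n_neurons, n_neurons)))
pop_b = pop_a @ q

rdm_gap = np.abs(stimulus_rdm(pop_a) - stimulus_rdm(pop_b)).max()
enc_gap = np.abs(neuron_similarity(pop_a) - neuron_similarity(pop_b)).max()
print(f"max RDM difference:      {rdm_gap:.2e}")
print(f"max encoding difference: {enc_gap:.2f}")
```

The RDM difference is at floating-point noise level because orthogonal transforms preserve Euclidean distances, whereas the neuron-level similarity structure changes. Any metric computed from the RDM alone cannot tell the two populations apart.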

3. Causal Evidence (MNIST)

Decoding metrics remain unchanged when encoding topology is causally manipulated via training loss
This shows that decoding similarity does not imply computational similarity

Methodology: Complementary Encoding Analysis

When evaluating brain-model alignment, go beyond RSA/DSA:

Check subpopulation contribution: Does alignment hold when excluding top-K neurons?
Analyze encoding topology: How is function distributed across the population?
Use encoding manifolds: Characterize neuron response organization globally
Causal intervention: Manipulate encoding topology and check whether decoding metrics respond
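
The first check in this list, subpopulation ablation, might be sketched as follows. This is a minimal illustration on synthetic data, not the paper's procedure: ranking neurons by response variance is an assumption (the source does not say how the top-K neurons are chosen), and the names `rdm`, `rsa_score`, and `ablation_curve` are hypothetical.

```python
import numpy as np

def rdm(responses):
    """Euclidean RDM over stimuli (rows = stimuli, columns = neurons)."""
    diff = responses[:, None, :] - responses[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def rsa_score(resp_a, resp_b):
    """Spearman correlation between RDM upper triangles (ties not handled)."""
    iu = np.triu_indices(resp_a.shape[0], k=1)
    ranks_a = np.argsort(np.argsort(rdm(resp_a)[iu]))
    ranks_b = np.argsort(np.argsort(rdm(resp_b)[iu]))
    return np.corrcoef(ranks_a, ranks_b)[0, 1]

def ablation_curve(population, reference, ks):
    """RSA score against `reference` after removing the K highest-variance neurons."""
    order = np.argsort(population.var(axis=0))[::-1]  # most variable first
    return {k: rsa_score(population[:, order[k:]], reference) for k in ks}

rng = np.random.default_rng(0)
n_stimuli = 40
signal = rng.normal(size=(n_stimuli, 5))                    # stands in for model features
driven = 10.0 * signal + rng.normal(size=(n_stimuli, 5))    # 5 high-gain neurons
background = rng.normal(size=(n_stimuli, 95))               # 95 unrelated neurons
population = np.hstack([driven, background])

curve = ablation_curve(population, reference=signal, ks=[0, 5])
print(curve)
```

In this toy population the alignment score is high with all 100 neurons and collapses once the five high-gain neurons are removed, which is the signature of subpopulation dominance. On real data, a score that stays stable across a range of K values is better evidence that alignment reflects the population as a whole.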

Practical Guidelines

Situation	Recommended Action
RSA/DSA shows high alignment	Verify with subpopulation ablation
Comparing brain-DNN representations	Add encoding manifold analysis
Publishing alignment results	Include encoding topology metrics
Evaluating model-brain similarity	Use complementary encoding paradigm

Gradient-Level Alignment Analysis (2026-06-01 Addition)

Standard brain-model alignment only tests forward activations. arXiv:2605.28693 extends encoding analysis to backpropagated gradients:

Traditional encoding: neural_response = W * forward_activation + b
Gradient encoding: neural_response = W * backprop_gradient + b
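
Both encoding models are the same linear map fit to different features. A minimal closed-form ridge sketch in NumPy is shown here on synthetic data. It assumes features are stored one row per stimulus, so the fit is written as `features @ W + b` rather than `W * feature + b`. The function names and the choice of Pearson correlation as the score are illustrative assumptions, not taken from the cited paper.

```python
import numpy as np

def fit_ridge_encoder(features, responses, alpha=1.0):
    """Fit responses ~ features @ W + b with an L2 penalty on W (b is unpenalized)."""
    x_mean, y_mean = features.mean(axis=0), responses.mean(axis=0)
    xc, yc = features - x_mean, responses - y_mean
    gram = xc.T @ xc + alpha * np.eye(xc.shape[1])
    weights = np.linalg.solve(gram, xc.T @ yc)
    bias = y_mean - x_mean @ weights
    return weights, bias

def encoding_score(features, responses, weights, bias):
    """Mean Pearson correlation between predicted and measured responses per channel."""
    pred = features @ weights + bias
    pc = pred - pred.mean(axis=0)
    rc = responses - responses.mean(axis=0)
    r = (pc * rc).sum(axis=0) / np.sqrt((pc ** 2).sum(axis=0) * (rc ** 2).sum(axis=0))
    return r.mean()

rng = np.random.default_rng(0)
n_train, n_test, n_features, n_channels = 200, 50, 10, 4
true_w = rng.normal(size=(n_features, n_channels))

# `feats` stands in for either forward activations or backpropagated gradients.
feats = rng.normal(size=(n_train + n_test, n_features))
resp = feats @ true_w + 0.5 + 0.1 * rng.normal(size=(n_train + n_test, n_channels))

w, b = fit_ridge_encoder(feats[:n_train], resp[:n_train], alpha=1.0)
score = encoding_score(feats[n_train:], resp[n_train:], w, b)
print(f"held-out encoding score: {score:.3f}")
```

Scoring on held-out stimuli matters because high-dimensional features can fit training responses almost perfectly regardless of whether they carry relevant structure.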

Key finding: DINOv3 gradients CAN predict fMRI/MEG signals (higher visual cortex, later latencies), but their spatial/temporal organization diverges from biologically plausible backpropagation. Forward activations show strong hierarchical alignment; gradients do not.

Implication: Alignment studies should test multiple computational levels (activations, gradients, optimization dynamics) — representation similarity alone is insufficient to claim mechanistic alignment. This extends the subpopulation critique: even when decoding metrics agree, the learning mechanism may be fundamentally different.

Reusable Pattern: Gradient Encoding Pipeline

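from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

def gradient_encoding_analysis(model, images, neural_data, objective, alpha=1.0):
    """Cross-validated R^2 of ridge encoders fit on each layer's backprop gradients."""
    gradients, handles = {}, []
    for name, layer in model.named_children():
        # Full backward hooks are called as hook(module, grad_input, grad_output).
        def hook(module, grad_input, grad_output, name=name):
            gradients[name] = grad_output[0].detach().flatten(start_dim=1)
        handles.append(layer.register_full_backward_hook(hook))
    objective(model(images)).backward()  # objective: caller-supplied scalar loss
    for handle in handles:
        handle.remove()
    # Assumption: scikit-learn ridge with default 5-fold CV is one possible scorer.
    # One feature row per image; neural_data has shape (n_images, n_channels).
    return {name: cross_val_score(Ridge(alpha=alpha), grad.cpu().numpy(), neural_data).mean()
            for name, grad in gradients.items()}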

When This Matters

Brain-DNN comparison studies
NeuroAI model validation
Cross-species representation comparison
Model interpretability in neuroscience
Evaluating whether AI models "think like brains"

References

Paper: arXiv:2605.05907 (40 pages, 27 figures)
Authors: Bertram, Dyballa, Keller, Kinger, Zucker