eeg-foundation-lrp-interpretability - SKILL.md Agent Skill

name: eeg-foundation-lrp-interpretability
description: "Layer-wise Relevance Propagation (LRP) methodology for interpreting EEG foundation models. Extends LRP from CNN-based to Transformer-based EEG models, enabling verification and hypothesis discovery. Activation: EEG interpretability, LRP, EEG foundation model, transformer attribution, Clever Hans EEG, post-hoc attribution."

EEG Foundation Model Interpretability via LRP

Methodology for applying Layer-wise Relevance Propagation (LRP) to EEG foundation models for post-hoc attribution, based on Meyer zu Bexten et al. (2026), arXiv:2605.11885.

Core Concept

EEG foundation models (FMs) promise scalable deep learning for diagnostics and brain-computer interfaces despite data scarcity, but their opaque Transformer architectures make their decisions hard to inspect. This methodology extends LRP, a well-established post-hoc attribution method for CNNs, to Transformer-based EEG models.

Key Contributions

LRP for EEG Transformers: First systematic application of attention-aware LRP to Transformer-based EEG FMs
Verification: LRP can verify that FM decisions align with expected neurophysiological patterns
Discovery: LRP surfaces novel, biologically plausible hypotheses from FM behavior

Application to LaBraM

Motor Imagery (PhysioMI)

Finding: "Clever Hans" behavior, in which the model prioritizes task-correlated ocular signals (EOG artifacts) over the intended motor cortex correlates
Implication: FMs may learn spurious correlations that work in-distribution but fail to capture intended neural mechanisms
LRP heatmap pattern: Attribution concentrated on frontal channels (eye movement artifacts) rather than sensorimotor regions
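One way to put a number on such a heatmap pattern is to compare the share of absolute relevance that falls on frontal versus sensorimotor channels. The sketch below is a minimal illustration, not the procedure from the paper: the channel groups (`FRONTAL`, `SENSORIMOTOR`), the helper `relevance_share`, and the toy relevance values are all assumptions made for the example.

```python
import numpy as np

# Illustrative channel groups (10-20 names); adapt to the montage in use.
FRONTAL = {"Fp1", "Fp2", "AF3", "AF4", "F7", "F8"}
SENSORIMOTOR = {"C3", "Cz", "C4", "CP3", "CP4"}

def relevance_share(relevance_map, channel_names, group):
    """Fraction of total absolute relevance carried by channels in `group`.

    relevance_map: array of shape [channels, time] for one trial.
    channel_names: list of channel labels, one per row of relevance_map.
    """
    per_channel = np.abs(np.asarray(relevance_map, dtype=float)).sum(axis=1)
    total = per_channel.sum()
    if total == 0:
        return 0.0
    mask = np.array([name in group for name in channel_names])
    return float(per_channel[mask].sum() / total)

# Toy relevance map: 4 channels x 3 time points, concentrated on Fp1/Fp2.
channels = ["Fp1", "Fp2", "C3", "C4"]
toy_map = np.array([
    [0.4, -0.3, 0.3],   # Fp1
    [0.2,  0.2, 0.1],   # Fp2
    [0.1,  0.0, 0.1],   # C3
    [0.0, -0.2, 0.1],   # C4
])
frontal_share = relevance_share(toy_map, channels, FRONTAL)
motor_share = relevance_share(toy_map, channels, SENSORIMOTOR)
```

A frontal share well above the sensorimotor share on a motor imagery task is a cue to inspect EOG contamination more closely, not proof of it.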

Affective Prediction (Naturalistic Paradigm)

Finding: Recurring reliance on central electrode cluster for arousal prediction
Hypothesis: Central electrodes may capture a sensorimotor signature of arousal state
Valence prediction: Less consistent attribution patterns, suggesting valence is harder to decode from EEG alone

R-Peak Detection (Verification Task)

Purpose: Well-understood physiological signal used to verify LRP correctness
Result: ~75% balanced accuracy, attribution patterns consistent with expected ECG morphology
Significance: Validates that LRP produces meaningful attributions before it is applied to less-understood tasks
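Balanced accuracy is the mean of per-class recall, so it stays informative when one class is much rarer than the other. A minimal sketch with made-up labels follows; the function name and the data are illustrative and do not reproduce the reported result.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

# Toy labels: 1 = window contains an R-peak, 0 = no R-peak.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]   # degenerate "never a peak" model

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)
```

On these toy labels the degenerate model reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, which is chance level for two classes.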

Implementation

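import torch

# LRP for Transformer-based EEG models (schematic outline).
# lrp_feedforward_layer, lrp_attention_layer, and lrp_input_projection are
# placeholder helpers that must be supplied per architecture; they are not
# defined in this skill.
def apply_lrp_to_eeg_transformer(model, eeg_input, target_class=None, lrp_rule="epsilon"):
    """Apply Layer-wise Relevance Propagation to an EEG Transformer.

    Args:
        model: Trained EEG foundation model (Transformer-based)
        eeg_input: EEG tensor [batch, channels, time]
        target_class: LongTensor [batch] of class indices to explain
            (defaults to the predicted class)
        lrp_rule: Propagation rule ("epsilon", "gamma", "z-plus")

    Returns:
        relevance_map: Attribution per electrode and time point
    """
    # Forward pass (a full implementation also caches each layer's activations)
    model.eval()
    output = model(eeg_input)  # logits, [batch, num_classes]

    # Initialize relevance at the output layer:
    # target class relevance = 1, others = 0
    if target_class is None:
        target_class = output.argmax(dim=-1)
    relevance = torch.nn.functional.one_hot(
        target_class, num_classes=output.shape[-1]
    ).to(output.dtype)

    # Backward propagation through the Transformer layers. Within a block the
    # forward order is attention -> feed-forward, so relevance passes through
    # the feed-forward sublayer first, then through attention.
    for layer in reversed(model.transformer_layers):
        relevance = lrp_feedforward_layer(layer, relevance, rule=lrp_rule)
        # Attention-aware LRP
        relevance = lrp_attention_layer(layer, relevance, rule=lrp_rule)

    # Map back to input electrode-time space
    relevance_map = lrp_input_projection(relevance)

    return relevance_map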

LRP Rules for EEG Transformers

Rule	Description	Best For
ε-rule	Stabilizes division with small epsilon	General purpose, default choice
γ-rule	Emphasizes positive contributions	Excitatory pattern detection
z-plus	Propagates relevance only along positive contributions (positive weights, non-negative activations)	Binary classification
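The three rules differ only in how the weights are modified before relevance is redistributed. The sketch below applies them to a single bias-free dense layer with non-negative inputs, using the generic formulation from the LRP literature. It is not attention-aware and is not taken from the paper; the function name `lrp_dense` and the values of `eps` and `gamma` are illustrative assumptions.

```python
import numpy as np

def lrp_dense(a, W, R_out, rule="epsilon", eps=1e-6, gamma=0.25):
    """Redistribute relevance R_out of a bias-free dense layer z = a @ W.

    a: input activations, shape [n_in]
    W: weights, shape [n_in, n_out]
    R_out: relevance of the layer outputs, shape [n_out]
    Returns the relevance of the inputs, shape [n_in].
    """
    if rule == "epsilon":
        rho_W = W
    elif rule == "gamma":
        rho_W = W + gamma * np.clip(W, 0, None)   # boost positive weights
    elif rule == "z-plus":
        rho_W = np.clip(W, 0, None)               # keep positive weights only
    else:
        raise ValueError(f"unknown rule: {rule}")
    z = a @ rho_W                                  # (modified) pre-activations
    z = z + eps * np.where(z >= 0, 1.0, -1.0)      # stabilize the division
    s = R_out / z                                  # relevance per unit of z
    return a * (rho_W @ s)                         # share assigned to each input

a = np.array([1.0, 2.0, 0.5])                      # non-negative activations
W = np.array([[0.5, -1.0],
              [1.0,  0.5],
              [-2.0, 1.0]])
R_out = np.array([1.0, 0.0])                       # all relevance on output 0

R_eps = lrp_dense(a, W, R_out, rule="epsilon")
R_gamma = lrp_dense(a, W, R_out, rule="gamma")
R_zplus = lrp_dense(a, W, R_out, rule="z-plus")
```

With these toy values every rule approximately conserves the total relevance (the input relevances sum to about 1). z-plus assigns no relevance to the third input, which reaches output 0 through a negative weight, and the γ-rule shrinks that input's negative share relative to the ε-rule.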

Pitfalls

Heatmap ambiguity: In complex domains like EEG, heatmaps are suggestive but not definitive. Always cross-validate with domain knowledge.
Clever Hans detection is crucial: FMs may exploit artifacts (EOG, EMG) rather than neural signals. LRP helps identify this but requires careful interpretation.
Transformer-specific LRP: Standard LRP rules designed for CNNs need adaptation for attention mechanisms. Use attention-aware propagation.
Performance vs. attribution: Fine-tuned and from-scratch models show minimal performance difference, but their attribution patterns may differ significantly, so similar accuracy does not imply a similar decision strategy.
Baseline comparison: Always compare FM attribution patterns with established methods (e.g., CSP-LDA) to validate biological plausibility.
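For the Transformer-specific pitfall, one commonly described adaptation is to treat the softmax attention weights as constants during the backward pass, so that the attention output is linear in the token values and relevance flows only through the value path. The sketch below illustrates that idea for a single head without biases, LayerNorm, or residual connections. Whether the paper uses this exact variant is an assumption, and all function names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_forward(x, Wq, Wk, Wv):
    """Single-head self-attention without biases. x: [n_tokens, d]."""
    A = softmax((x @ Wq) @ (x @ Wk).T / np.sqrt(Wk.shape[1]))
    return A, A @ (x @ Wv)

def lrp_attention_value_path(x, A, Wv, y, R_out, eps=1e-9):
    """Epsilon-style LRP with the attention weights A held fixed.

    With A constant, y = A @ x @ Wv is linear in x, so each input entry
    receives relevance in proportion to its contribution to each output.
    """
    s = R_out / (y + eps * np.where(y >= 0, 1.0, -1.0))  # relevance per unit output
    c = A.T @ s @ Wv.T                                   # route back via A and Wv
    return x * c

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                              # 4 tokens, 3 features
Wq, Wk, Wv = (rng.normal(size=(3, 3)) for _ in range(3))

A, y = attention_forward(x, Wq, Wk, Wv)
R_out = np.zeros_like(y)
R_out[0] = y[0]                                          # explain token 0's output
R_in = lrp_attention_value_path(x, A, Wv, y, R_out)
```

In this setup the input relevances sum to the explained output, `y[0].sum()`, up to the stabilizer. That conservation property is a useful sanity check when adapting CNN-oriented LRP code to attention layers.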

Verification Strategy

Known signal validation: Apply LRP to tasks with well-understood neurophysiology (R-Peak detection, P300)
Artifact detection: Use LRP to identify if model relies on non-neural signals (EOG, ECG, muscle artifacts)
Cross-paradigm comparison: Compare attribution patterns across different experimental paradigms
Expert validation: Have neuroscientists/clinicians review LRP heatmaps for biological plausibility
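For the cross-paradigm comparison step, a simple starting point is to reduce each set of relevance maps to a per-channel profile and correlate the profiles. This sketch assumes both paradigms share the same channel order. The helper names and the toy data are illustrative, and Pearson correlation is only one possible similarity measure.

```python
import numpy as np

def channel_profile(relevance_maps):
    """Mean absolute relevance per channel, normalized to sum to 1.

    relevance_maps: array of shape [trials, channels, time].
    """
    profile = np.abs(np.asarray(relevance_maps, dtype=float)).mean(axis=(0, 2))
    return profile / profile.sum()

def profile_similarity(maps_a, maps_b):
    """Pearson correlation between two per-channel relevance profiles."""
    pa, pb = channel_profile(maps_a), channel_profile(maps_b)
    return float(np.corrcoef(pa, pb)[0, 1])

# Toy data: 1 trial, 3 channels, 2 time points.
maps_a = np.array([[[3.0, 3.0], [1.0, 1.0], [0.5, 0.5]]])
maps_b = -2.0 * maps_a            # same spatial pattern, different scale and sign
maps_c = maps_a[:, ::-1, :]       # spatial pattern reversed across channels

same_pattern = profile_similarity(maps_a, maps_b)
different_pattern = profile_similarity(maps_a, maps_c)
```

Here `same_pattern` is close to 1 because taking absolute values and normalizing removes scale and sign, while `different_pattern` is negative. On real data, a low similarity is a prompt for expert review and not a verdict on either model.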

Related Skills

eeg-foundation-model-adapters: EEG foundation model domain adaptation
eeg-preprocessing-reliability: EEG decoding reliability and preprocessing assessment
neural-encoding-evaluation-ground-truth: Ground-truth approximation for neural encoding evaluation