eeg-staged-representation-learning - SKILL.md Agent Skill

name: eeg-staged-representation-learning
description: "Neuroscience-inspired staged representation learning framework for EEG visual decoding. Organizes EEG representation learning into three complementary phases: low-level visual, high-level semantic, and integrative fusion, with disentangled coarse/fine-grained semantics."
source: arXiv 2605.16923
tags:
- eeg
- visual-decoding
- representation-learning
- brain-computer-interface
- staged-learning
- neuro-inspired
authors: Xiang Gao, Hui Tian, Yanming Zhu, Xuefei Yin, Alan Wee-Chung Liew
published: 2026-05-21

Neuroscience-inspired Staged Representation Learning for EEG Visual Decoding

Overview

The paper proposes a neuroscience-inspired staged representation learning framework that reformulates EEG visual decoding as a stage-specific representation decomposition problem. Instead of learning a single global EEG embedding for cross-modal alignment, the framework explicitly models the staged, hierarchical character of human visual processing.

Core Innovation

Decomposes EEG visual decoding into three complementary representation stages: low-level visual, high-level semantic, and integrative fusion
Introduces multimodal dual-level semantic learning, which separates coarse label-level semantics from fine image-level visual-semantic information
Introduces semantic latent channels: computational representation channels generated from observed visual EEG signals, which expand the channel-level semantic representation space

Framework Architecture

Stage 1: Low-level Visual Representation Learning

Extracts basic visual features from EEG signals (edges, textures, shapes)
Corresponds to early visual cortex (V1-V2) processing
Uses convolutional encoders to capture spatiotemporal patterns
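
The source only states that Stage 1 uses convolutional encoders over spatiotemporal patterns; it does not give the layer layout. As a rough illustration, the sketch below assumes a common EEG design: a temporal filter bank applied per electrode, followed by a spatial mixing step across electrodes. The function names, filter counts, and pooling choice are hypothetical and are not taken from the paper.

```python
import numpy as np

def temporal_conv(x, kernels):
    """Filter each EEG channel along time (cross-correlation, no kernel flip).

    x:       (channels, time)
    kernels: (n_filters, kernel_len)
    returns: (n_filters, channels, time - kernel_len + 1)
    """
    # windows has shape (channels, T_out, kernel_len)
    windows = np.lib.stride_tricks.sliding_window_view(x, kernels.shape[1], axis=1)
    return np.einsum("ctk,fk->fct", windows, kernels)

def spatial_conv(h, weights):
    """Mix information across electrodes, one weight vector per filter.

    h:       (n_filters, channels, T_out)
    weights: (n_filters, channels)
    returns: (n_filters, T_out)
    """
    return np.einsum("fct,fc->ft", h, weights)

def low_level_encoder(x, kernels, weights):
    """Temporal filtering -> spatial mixing -> ReLU -> average pooling over time."""
    h = temporal_conv(x, kernels)
    h = spatial_conv(h, weights)
    h = np.maximum(h, 0.0)      # ReLU non-linearity
    return h.mean(axis=1)       # one feature per filter: (n_filters,)

# Illustrative shapes only: 63 electrodes, 250 time samples, 8 filters of length 25.
rng = np.random.default_rng(0)
eeg_trial = rng.standard_normal((63, 250))
features = low_level_encoder(
    eeg_trial,
    kernels=rng.standard_normal((8, 25)),
    weights=rng.standard_normal((8, 63)),
)
```

In a trained model the kernels and spatial weights would be learned parameters; random values are used here only to show the shapes involved.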

Stage 2: High-level Semantic Representation Learning

Extracts abstract semantic concepts from EEG
Corresponds to processing in higher-order areas: inferotemporal cortex (IT) and prefrontal cortex (PFC)
Leverages dual-level semantic learning:
- Coarse label-level: Category-level semantics (e.g., "face", "animal")
- Fine image-level: Instance-specific visual-semantic information

Stage 3: Integrative Information Fusion

Fuses low-level visual and high-level semantic representations
Produces unified EEG embedding for cross-modal alignment
Uses cross-attention mechanisms for integration
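
The fusion step above can be sketched with single-head cross-attention. The source does not say which stream supplies the queries, so this example assumes the semantic tokens query the low-level visual tokens, adds a residual connection, and mean-pools the result into one embedding. The projection matrices `w_q`, `w_k`, `w_v` and the function name are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(semantic, visual, w_q, w_k, w_v):
    """Fuse two token sets with single-head cross-attention.

    semantic: (n_sem, d) high-level tokens, used as queries (an assumption)
    visual:   (n_vis, d) low-level tokens, used as keys and values
    w_q, w_k, w_v: (d, d) projection matrices
    returns:  (fused_embedding of shape (d,), attention map of shape (n_sem, n_vis))
    """
    q = semantic @ w_q
    k = visual @ w_k
    v = visual @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    fused_tokens = semantic + attn @ v      # residual connection
    return fused_tokens.mean(axis=0), attn  # pool tokens into one embedding

rng = np.random.default_rng(0)
d = 16
embedding, attention = cross_attention_fuse(
    semantic=rng.standard_normal((4, d)),
    visual=rng.standard_normal((10, d)),
    w_q=rng.standard_normal((d, d)) / np.sqrt(d),
    w_k=rng.standard_normal((d, d)) / np.sqrt(d),
    w_v=rng.standard_normal((d, d)) / np.sqrt(d),
)
```

The pooled `embedding` plays the role of the unified EEG embedding that is later aligned with image or text features.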

Semantic Latent Channels

Generated from observed visual EEG signals
Expand the channel-level semantic representation space
Enable structured semantic abstraction and cross-modal alignment
Different from standard EEG channels — they are learned computational channels
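
The source describes semantic latent channels only at a high level, so the following is one possible reading rather than the authors' construction. It assumes the latent channels are a learned non-linear mixture of the recorded electrodes, computed per time step and appended along the channel axis. The `mixing` and `bias` parameters and the `tanh` non-linearity are hypothetical choices.

```python
import numpy as np

def add_semantic_latent_channels(x, mixing, bias):
    """Append learned latent channels to the observed EEG channels.

    x:      (channels, time) observed EEG
    mixing: (n_latent, channels) learned mixing weights (assumed form)
    bias:   (n_latent,) learned bias
    returns: (channels + n_latent, time)
    """
    latent = np.tanh(mixing @ x + bias[:, None])  # (n_latent, time)
    return np.concatenate([x, latent], axis=0)

rng = np.random.default_rng(0)
eeg = rng.standard_normal((63, 250))              # illustrative shape
expanded = add_semantic_latent_channels(
    eeg,
    mixing=rng.standard_normal((16, 63)) * 0.1,
    bias=np.zeros(16),
)
# expanded.shape == (79, 250): 63 observed + 16 latent channels
```

Keeping the observed channels untouched and appending the latent ones means downstream layers see both the raw recording and the expanded representation space.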

Technical Details

Multimodal Dual-Level Semantic Learning

Coarse semantics: Aligns EEG embedding with class-level semantic labels (e.g., WordNet categories)
Fine semantics: Aligns EEG embedding with image-level visual features from vision models (e.g., CLIP)
Both levels are trained simultaneously with contrastive objectives
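
A minimal sketch of training both levels together follows. The source says only that contrastive objectives are used; this example assumes a symmetric InfoNCE (CLIP-style) loss for each level and a hypothetical `coarse_weight` that balances them. It also assumes every item in a batch belongs to a different class, since otherwise same-class label embeddings would be treated as negatives in the coarse term.

```python
import numpy as np

def l2_normalize(z):
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def log_softmax(logits, axis):
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def info_nce(eeg, target, temperature=0.07):
    """Symmetric InfoNCE: row i of `eeg` is the positive pair of row i of `target`."""
    logits = l2_normalize(eeg) @ l2_normalize(target).T / temperature
    idx = np.arange(len(eeg))
    loss_eeg_to_target = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_target_to_eeg = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_eeg_to_target + loss_target_to_eeg)

def dual_level_loss(eeg_coarse, label_emb, eeg_fine, image_emb, coarse_weight=0.5):
    """Weighted sum of the coarse (label-level) and fine (image-level) losses."""
    coarse = info_nce(eeg_coarse, label_emb)   # EEG vs. class-label embeddings
    fine = info_nce(eeg_fine, image_emb)       # EEG vs. image embeddings
    return coarse_weight * coarse + (1.0 - coarse_weight) * fine

rng = np.random.default_rng(0)
batch, dim = 8, 32
loss = dual_level_loss(
    eeg_coarse=rng.standard_normal((batch, dim)),
    label_emb=rng.standard_normal((batch, dim)),
    eeg_fine=rng.standard_normal((batch, dim)),
    image_emb=rng.standard_normal((batch, dim)),
)
```

Using separate EEG projections (`eeg_coarse`, `eeg_fine`) for the two targets is one way to keep the coarse and fine semantics disentangled; a shared projection would collapse them into a single space.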

Benchmark Performance (THINGS-EEG)

Subject-dependent zero-shot: Reported performance exceeds that of the compared methods
Subject-independent zero-shot: Improved exact retrieval
Comprehensive ablations support the staged decomposition approach
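
For readers unfamiliar with how zero-shot retrieval is usually scored, the sketch below computes top-k retrieval accuracy from cosine similarity between EEG and image embeddings. It assumes that row i of each matrix refers to the same stimulus; the function name is hypothetical and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def top_k_retrieval_accuracy(eeg_emb, image_emb, k=1):
    """Fraction of EEG trials whose true image is among the k most similar images.

    eeg_emb, image_emb: (n, d); row i of each corresponds to the same stimulus.
    """
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    similarity = eeg @ img.T                           # cosine similarity, (n, n)
    top_k = np.argsort(-similarity, axis=1)[:, :k]     # best k image indices per trial
    targets = np.arange(len(eeg))[:, None]
    return float((top_k == targets).any(axis=1).mean())

rng = np.random.default_rng(0)
image_emb = rng.standard_normal((200, 64))
eeg_emb = image_emb + 0.5 * rng.standard_normal((200, 64))  # noisy stand-in for EEG
top1 = top_k_retrieval_accuracy(eeg_emb, image_emb, k=1)
top5 = top_k_retrieval_accuracy(eeg_emb, image_emb, k=5)
```

Top-1 accuracy corresponds to exact retrieval; larger k gives a more forgiving measure and can never be lower than top-1.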

Analysis

Layer-wise retrieval: Deeper stages capture more semantic information
Temporal accumulation: Later temporal windows contribute more to semantic decoding
Expanded multi-image retrieval: Framework scales with additional images

Key Insights

Hierarchical processing matters: Explicitly modeling staged perception → semantic → integrative representations outperforms monolithic embedding approaches
Disentanglement helps: Separating coarse and fine semantics improves both classification and retrieval
Neuro-inspired design: The staged framework mirrors the ventral visual stream's hierarchical organization (V1 → V2 → V4 → IT)

Applications

EEG-based visual decoding: Zero-shot classification and retrieval
BCI communication: More accurate visual prosthetics
Cognitive neuroscience: Probing hierarchical visual processing through EEG
Medical rehabilitation: Visual assessment for locked-in patients

Related Skills

eeg-visual-attention-decoding
eeg-structure-guided-diffusion
meta-learning-in-context-brain-decoding
eeg2vision-multimodal-eeg-framework-2d-visual

Activation

staged eeg representation, EEG visual decoding, coarse-to-fine semantics, semantic latent channels, neuro-inspired EEG, staged representation learning, THINGS-EEG benchmark, EEG cross-modal alignment