eeg-biomarker-robustness-cross-population - SKILL.md Agent Skill

name: eeg-biomarker-robustness-cross-population
description: "Cross-population framework for evaluating robustness and generalizability of EEG biomarkers in multi-site clinical settings. Addresses cross-subject and cross-platform variation for reliable Parkinson's disease detection. Keywords: EEG biomarkers, cross-population, generalization, multi-site, Parkinson's disease, clinical reliability."

Robust and Clinically Reliable EEG Biomarkers: A Cross Population Framework

Framework for developing EEG biomarkers that generalize across populations, sites, and recording platforms for reliable clinical deployment.

Metadata

Source: arXiv:2604.23933v1
Authors: Nicholas R. Rasmussen, Longwei Wang, Rodrigue Rizk, et al.
Published: 2026-04-27

Core Methodology

The Cross-Population Challenge

EEG biomarkers often fail when deployed across different:

Subjects: Individual physiological differences
Sites: Different hospitals/clinics with varying protocols
Platforms: Different EEG hardware and software
Populations: Different demographics, disease stages
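
One practical consequence of the list above is that evaluation splits should keep each subject (or site, or platform) entirely on one side of the train/test boundary. The following is a minimal sketch on synthetic data; the array shapes and the `subject_ids` variable are illustrative assumptions, and `GroupShuffleSplit` is scikit-learn's group-aware splitter, not something prescribed by the source paper.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in: 12 subjects, 5 recordings each, 8 features per recording
n_subjects, n_recordings, n_features = 12, 5, 8
X = rng.normal(size=(n_subjects * n_recordings, n_features))
subject_ids = np.repeat(np.arange(n_subjects), n_recordings)

# Hold out 25% of the subjects (not 25% of the recordings)
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, groups=subject_ids))

train_subjects = set(subject_ids[train_idx])
test_subjects = set(subject_ids[test_idx])
print("shared subjects:", train_subjects & test_subjects)
```

The same pattern applies with site or platform labels passed as `groups`.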

Three-Pillar Evaluation Framework

┌────────────────────────────────────────────────────────────────┐
│                 ROBUST EEG BIOMARKER FRAMEWORK                 │
├─────────────────┬────────────────────┬─────────────────────────┤
│   INTERNAL      │   EXTERNAL         │   CLINICAL              │
│   RELIABILITY   │   RELIABILITY      │   UTILITY               │
├─────────────────┼────────────────────┼─────────────────────────┤
│ • Test-retest   │ • Cross-site       │ • Diagnostic            │
│   stability     │   generalization   │   accuracy              │
│ • Split-half    │ • Cross-platform   │ • Prognostic            │
│   consistency   │   robustness       │   value                 │
│ • Intra-subject │ • Cross-population │ • Treatment             │
│   variance      │   transfer         │   monitoring            │
└─────────────────┴────────────────────┴─────────────────────────┘

Implementation Guide

Step 1: Biomarker Feature Extraction

import numpy as np
from scipy import signal
from sklearn.preprocessing import StandardScaler

class EEGBiomarkerExtractor:
    """
    Extract robust EEG biomarkers for clinical applications
    """
    def __init__(self, fs=500):
        self.fs = fs
        self.bands = {
            'delta': (0.5, 4),
            'theta': (4, 8),
            'alpha': (8, 13),
            'beta': (13, 30),
            'gamma': (30, 100)
        }
    
    def extract_spectral_features(self, eeg_data):
        """
        Extract band power and spectral features
        
        Args:
            eeg_data: (channels, time) array
        
        Returns:
            Dictionary of spectral features
        """
        features = {}
        
        for band_name, (low, high) in self.bands.items():
            # Bandpass filter
            sos = signal.butter(4, [low, high], btype='band', fs=self.fs, output='sos')
            filtered = signal.sosfilt(sos, eeg_data, axis=-1)
            
            # Compute power
            power = np.mean(filtered ** 2, axis=-1)
            features[f'{band_name}_power'] = power
            
        # Relative power ratios
        total_power = sum(features[f'{b}_power'] for b in self.bands.keys())
        for band in self.bands.keys():
            features[f'{band}_relative'] = features[f'{band}_power'] / (total_power + 1e-10)
        
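        # Alpha peak frequency (individualized)
        # nperseg must be an integer and no longer than the recording
        nperseg = min(int(self.fs * 2), eeg_data.shape[-1])
        freqs, psd = signal.welch(eeg_data, fs=self.fs, nperseg=nperseg, axis=-1)
        alpha_mask = (freqs >= 8) & (freqs <= 13)
        features['alpha_peak'] = freqs[alpha_mask][np.argmax(psd[:, alpha_mask], axis=1)]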
        
        return features
    
    def extract_connectivity_features(self, eeg_data):
        """
        Extract functional connectivity features (phase locking value)
        
        Instantaneous phase is only well defined for narrow-band signals,
        so pass band-limited data (e.g. alpha-filtered) to this method.
        """
        n_channels = eeg_data.shape[0]
        
        # Instantaneous phase per channel, computed once via the Hilbert transform
        phases = np.angle(signal.hilbert(eeg_data, axis=-1))
        
        # Diagonal is left at 0 (self-connections are not estimated)
        connectivity = np.zeros((n_channels, n_channels))
        
        for i in range(n_channels):
            for j in range(i+1, n_channels):
                # Phase locking value (PLV)
                plv = np.abs(np.mean(np.exp(1j * (phases[i] - phases[j]))))
                connectivity[i, j] = connectivity[j, i] = plv
        
        return {'plv_connectivity': connectivity}
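
As a complement to the filter-and-square band power in `extract_spectral_features`, band power can also be estimated by integrating a Welch power spectral density over each band. The sketch below is one possible variant, not the source paper's method; the function name `welch_band_power`, the 2-second window, and the synthetic test signal are assumptions made for illustration.

```python
import numpy as np
from scipy import signal

def welch_band_power(eeg_data, fs, bands, window_sec=2.0):
    """Integrate the Welch PSD over each band.

    eeg_data: (channels, time) array; returns {band: (channels,) array}.
    """
    nperseg = min(int(window_sec * fs), eeg_data.shape[-1])
    freqs, psd = signal.welch(eeg_data, fs=fs, nperseg=nperseg, axis=-1)
    df = freqs[1] - freqs[0]  # frequency resolution in Hz
    powers = {}
    for name, (low, high) in bands.items():
        # Half-open interval so adjacent bands do not share a frequency bin
        mask = (freqs >= low) & (freqs < high)
        powers[name] = psd[..., mask].sum(axis=-1) * df  # rectangle-rule integral
    return powers

# Synthetic check: channel 0 carries a 10 Hz tone, channel 1 a 20 Hz tone
fs = 250
t = np.arange(10 * fs) / fs
rng = np.random.default_rng(0)
eeg = np.stack([np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 20 * t)])
eeg = eeg + 0.05 * rng.normal(size=eeg.shape)

bands = {'theta': (4, 8), 'alpha': (8, 13), 'beta': (13, 30)}
powers = welch_band_power(eeg, fs, bands)
dominant = [max(bands, key=lambda b: powers[b][ch]) for ch in range(eeg.shape[0])]
print(dominant)
```

Averaging over windowed segments tends to make the estimate less sensitive to short artifacts than a single filtered trace, at the cost of frequency resolution set by the window length.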

Step 2: Robustness Evaluation

from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.ensemble import RandomForestClassifier

class BiomarkerRobustnessEvaluator:
    """
    Evaluate biomarker robustness across populations
    """
    def __init__(self, biomarker_extractor):
        self.extractor = biomarker_extractor
        self.classifier = RandomForestClassifier(n_estimators=100)
    
    def evaluate_internal_reliability(self, X, y, subject_ids):
        """
        Test-retest reliability and split-half consistency
        """
        results = {}
        
        # Test-retest (repeated recordings from same subjects)
        unique_subjects = np.unique(subject_ids)
        test_retest_scores = []
        
        for subject in unique_subjects:
            mask = subject_ids == subject
            if np.sum(mask) >= 2:
                subject_features = X[mask]
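                # Pearson correlation between the feature vectors of the
                # first two recordings (rows are assumed to be in session order)
                corr = np.corrcoef(subject_features[0], subject_features[1])[0, 1]
                test_retest_scores.append(corr)
        
        # Mean Pearson correlation, used here as a simple proxy; it is not
        # a true intraclass correlation coefficient (ICC)
        results['test_retest_icc'] = (
            np.mean(test_retest_scores) if test_retest_scores else np.nan
        )
        
        # Split-half evaluation: train on a random half of the subjects and
        # test on the other half, so no subject appears on both sides
        from sklearn.model_selection import GroupShuffleSplit
        cv = GroupShuffleSplit(n_splits=10, test_size=0.5, random_state=0)
        split_scores = cross_val_score(
            self.classifier, X, y, cv=cv, groups=subject_ids
        )
        results['split_half_accuracy'] = np.mean(split_scores)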
        
        return results
    
    def evaluate_external_reliability(self, X, y, site_ids, platform_ids):
        """
        Cross-site and cross-platform generalization
        """
        results = {}
        
        # Cross-site validation: each fold holds out one whole site
        logo = LeaveOneGroupOut()
        site_scores = cross_val_score(
            self.classifier, X, y,
            cv=logo, groups=site_ids
        )
        results['cross_site_accuracy'] = np.mean(site_scores)
        
        # Cross-platform validation: each fold holds out one whole platform
        platform_scores = cross_val_score(
            self.classifier, X, y,
            cv=logo, groups=platform_ids
        )
        results['cross_platform_accuracy'] = np.mean(platform_scores)
        
        return results
    
    def evaluate_clinical_utility(self, X_train, y_train, X_test, y_test):
        """
        Diagnostic accuracy on a held-out test set
        (prognostic value is not assessed by this method)
        """
        from sklearn.metrics import roc_auc_score, precision_recall_fscore_support
        
        self.classifier.fit(X_train, y_train)
        y_pred = self.classifier.predict(X_test)
        # Probability of the positive class (binary labels assumed)
        y_prob = self.classifier.predict_proba(X_test)[:, 1]
        
        # Compute precision, recall and F1 in a single call
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_test, y_pred, average='binary'
        )
        
        results = {
            'auc_roc': roc_auc_score(y_test, y_prob),
            'accuracy': np.mean(y_pred == y_test),
            'precision': precision,
            'recall': recall,
            'f1': f1
        }
        
        return results
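
The test-retest score in `evaluate_internal_reliability` is a mean Pearson correlation stored under the key `test_retest_icc`. Pearson correlation ignores systematic offsets between sessions, whereas an absolute-agreement ICC penalizes them. The sketch below computes ICC(2,1) in the Shrout and Fleiss convention for a single feature; the function name `icc_2_1` and the toy numbers are illustrative assumptions, and the source does not state which ICC form the paper uses.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: (n_subjects, k_sessions) array with one feature value per cell.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)  # per-subject means
    col_means = ratings.mean(axis=0)  # per-session means

    # Two-way ANOVA sums of squares
    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_total = np.sum((ratings - grand) ** 2)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-session mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Toy example: session 2 equals session 1 plus a constant offset of 2
session_1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
session_2 = session_1 + 2.0
ratings = np.column_stack([session_1, session_2])

pearson_r = np.corrcoef(session_1, session_2)[0, 1]
icc_value = icc_2_1(ratings)
print(pearson_r, icc_value)
```

In this toy case the Pearson correlation is 1 while ICC(2,1) is 5/9, which shows how a pure session offset can hide behind a perfect correlation.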

Step 3: Harmonization Pipeline

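class EEGHarmonization:
    """
    Simplified ComBat-style harmonization for multi-site EEG features
    
    Only the additive (mean) site effect is removed. Full ComBat also
    rescales per-site variance and applies empirical Bayes shrinkage.
    """
    def fit(self, X, site_ids):
        """
        Learn site-specific effects from X: (samples, features) array
        """
        X = np.asarray(X, dtype=float)
        site_ids = np.asarray(site_ids)
        self.site_effects = {}
        overall_mean = np.mean(X, axis=0)
        
        for site in np.unique(site_ids):
            site_mask = site_ids == site
            self.site_effects[site] = {
                'mean': np.mean(X[site_mask], axis=0) - overall_mean,
                'std': np.std(X[site_mask], axis=0)  # stored, not used by transform()
            }
        
        self.overall_mean = overall_mean
        return self
    
    def transform(self, X, site_ids):
        """Remove the additive site effects learned in fit()"""
        # Float copy so the in-place subtraction also works for integer input
        X_harmonized = np.array(X, dtype=float)
        site_ids = np.asarray(site_ids)
        
        for site in np.unique(site_ids):
            if site not in self.site_effects:
                raise ValueError(f"Site {site!r} was not seen during fit()")
            site_mask = site_ids == site
            X_harmonized[site_mask] -= self.site_effects[site]['mean']
        
        return X_harmonized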
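
The class above stores a per-site standard deviation but only removes the mean shift. A location-and-scale adjustment is one way to use both statistics. The sketch below is a simplified illustration, not full ComBat: it has no empirical Bayes shrinkage and does not preserve covariates, and the function name `harmonize_location_scale` and the synthetic two-site data are assumptions.

```python
import numpy as np

def harmonize_location_scale(X, site_ids, eps=1e-10):
    """Map every site's features to the pooled mean and standard deviation.

    X: (samples, features) array; site_ids: (samples,) array of site labels.
    """
    X = np.asarray(X, dtype=float)
    site_ids = np.asarray(site_ids)
    pooled_mean = X.mean(axis=0)
    pooled_std = X.std(axis=0)

    X_h = np.empty_like(X)
    for site in np.unique(site_ids):
        mask = site_ids == site
        # Standardize within the site, then rescale to the pooled statistics
        z = (X[mask] - X[mask].mean(axis=0)) / (X[mask].std(axis=0) + eps)
        X_h[mask] = z * pooled_std + pooled_mean
    return X_h

# Synthetic example: site B has a shifted mean and a larger spread than site A
rng = np.random.default_rng(0)
site_a = rng.normal(loc=0.0, scale=1.0, size=(50, 3))
site_b = rng.normal(loc=5.0, scale=3.0, size=(40, 3))
X = np.vstack([site_a, site_b])
site_ids = np.array(['A'] * 50 + ['B'] * 40)

X_h = harmonize_location_scale(X, site_ids)
for site in ('A', 'B'):
    print(site, X_h[site_ids == site].mean(axis=0).round(3))
```

Because the adjustment is unconditional, it also removes genuine between-site differences, for example when sites differ in case mix. Methods such as ComBat address this by modeling covariates of interest explicitly.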

Applications

Parkinson's Disease: Reliable EEG biomarkers for early detection
Alzheimer's Disease: Cross-site validation of diagnostic markers
Depression: Objective biomarkers for treatment monitoring
Clinical Trials: Standardized biomarkers for drug development

Pitfalls

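Overfitting to Training Site: A model tuned on one site may not generalize to new sites; validate with leave-one-site-out splits
Platform Differences: Sampling rates and electrode positions vary across platforms
Population Bias: Training data may not represent the target population (demographics, disease stage)
Temporal Drift: Biomarker performance may degrade over time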
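
For the sampling-rate part of the platform pitfall, recordings can be brought to a common rate before feature extraction. The following is a minimal sketch assuming integer sampling rates; the helper name `resample_to`, the example rates (500 Hz, 256 Hz, 250 Hz), and the synthetic signals are illustrative choices, while `scipy.signal.resample_poly` is an existing SciPy function that applies an anti-aliasing filter during polyphase resampling.

```python
from fractions import Fraction

import numpy as np
from scipy import signal

def resample_to(eeg_data, fs_in, fs_out):
    """Polyphase resampling of (channels, time) data from fs_in to fs_out (Hz).

    Both rates are assumed to be integers.
    """
    ratio = Fraction(fs_out, fs_in)  # reduced up/down factors
    return signal.resample_poly(
        eeg_data, ratio.numerator, ratio.denominator, axis=-1
    )

# Two hypothetical platforms recording the same 4-second, 10 Hz oscillation
fs_a, fs_b, fs_common = 500, 256, 250
t_a = np.arange(4 * fs_a) / fs_a
t_b = np.arange(4 * fs_b) / fs_b
rec_a = np.sin(2 * np.pi * 10 * t_a)[np.newaxis, :]
rec_b = np.sin(2 * np.pi * 10 * t_b)[np.newaxis, :]

out_a = resample_to(rec_a, fs_a, fs_common)
out_b = resample_to(rec_b, fs_b, fs_common)
print(out_a.shape, out_b.shape)
```

Resampling does not address differing electrode positions; that would need a separate step such as selecting a shared channel subset.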

Related Skills

eeg-tinnitus-biomarker-robustness
eeg-hopfield-emotion-energy
tms-eeg-biomarkers
explainable-gnn-eeg-neurological

References

Rasmussen et al. (2026) Robust and Clinically Reliable EEG Biomarkers: A Cross Population Framework, arXiv:2604.23933
Fortin et al. (2017) Harmonization of multi-site diffusion tensor imaging data
Combrisson & Jerbi (2015) Exceeding chance level by chance