scalable-on-hardware-qnn-training


Scalable on-hardware QNN training methodology for clinical data. Butterfly circuit architecture with O(n log n) parameters, layer-wise training strategy, and parallelised parameter-shift rule reducing gradient estimation from O(n^2) to O(log n). Validated on MIMIC-III clinical data on IonQ hardware at 16-32 qubits. arXiv: 2606.03517

By hiyenwong. Updated 6/12/2026

name: scalable-on-hardware-qnn-training
description: "Scalable on-hardware QNN training methodology for clinical data. Butterfly circuit architecture with O(n log n) parameters, layer-wise training strategy, and parallelised parameter-shift rule reducing gradient estimation from O(n^2) to O(log n). Validated on MIMIC-III clinical data on IonQ hardware at 16-32 qubits. arXiv: 2606.03517"
category: quantum-medical
authors: Natansh Mathur, Panagiotis Kl. Barkoutsos, Masako Yamada, Martin Roetteler, Iordanis Kerenidis
source: "arxiv:2606.03517"
created: "2026-06-10"
version: "1.0.0"
tags: ["quantum-neural-network", "clinical-data", "hardware-training", "butterfly-circuit", "gradient-estimation"]

Scalable On-Hardware QNN Training

Overview

Methodology for training Quantum Neural Networks (QNNs) directly on quantum hardware at scale, validated on clinical data imputation using the MIMIC-III electronic health record dataset.

Paper: Scalable On-Hardware Training of Quantum Neural Networks and Application to Clinical Data Imputation
arXiv: 2606.03517
Authors: Mathur, Barkoutsos, Yamada, Roetteler, Kerenidis

Core Problem

Training QNNs on quantum hardware is bottlenecked by the cost of gradient estimation. The standard parameter-shift rule runs separately shifted circuits for every trainable parameter, a cost quoted as O(n^2) distinct circuit evaluations per optimization step, which makes hardware-based optimization impractical beyond small systems.
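The per-parameter cost is easy to see in a minimal sketch of the standard two-term parameter-shift rule. The toy expectation function (a sum of cos(theta) terms, as for independent RY rotations measured in Z) and the names `expectation` and `parameter_shift_gradient` are illustrative assumptions, not taken from the paper.

```python
import math

def expectation(thetas):
    """Toy model: sum of <Z> over qubits, each prepared by RY(theta) on |0>."""
    return sum(math.cos(t) for t in thetas)

def parameter_shift_gradient(f, thetas, shift=math.pi / 2):
    """Two-term rule: one +shift and one -shift evaluation per parameter."""
    grads = []
    n_evals = 0
    for i in range(len(thetas)):
        plus = list(thetas)
        minus = list(thetas)
        plus[i] += shift
        minus[i] -= shift
        # Exact for f = A + B*cos(theta_i) + C*sin(theta_i)
        grads.append((f(plus) - f(minus)) / (2 * math.sin(shift)))
        n_evals += 2
    return grads, n_evals

thetas = [0.1, 0.7, 1.3, 2.0]
grads, n_evals = parameter_shift_gradient(expectation, thetas)
print(grads, n_evals)
```

The evaluation count grows with the number of parameters because every parameter gets its own pair of shifted circuits; this is the cost the parallelised rule is meant to avoid.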

Solution Architecture

Three Co-Designed Components

  1. Butterfly Circuit Architecture

    • Structured, subspace-preserving circuit with O(n log n) parameters
    • Logarithmic circuit depth
    • Exploits commuting structure within layers
  2. Layer-Wise Training Strategy

    • Confines on-hardware optimization to one small, well-structured layer at a time
    • Avoids global parameter optimization instability
    • Enables scaling to larger qubit counts
  3. Parallelised Parameter-Shift Rule

    • Exploits commuting structure within each Butterfly layer
    • Extracts all gradients of a layer in a constant number of circuit executions
    • With logarithmically many layers, reduces distinct circuit evaluations per step from O(n^2) to O(log n)
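The commuting structure that the Butterfly layers rely on can be checked numerically. The sketch below assumes each two-qubit gate acts as a plane (Givens) rotation on the amplitudes of its two qubits, which holds for RBS-type gates restricted to the Hamming-weight-1 subspace; the source only states that the circuit is subspace-preserving, so this gate model and all function names are assumptions.

```python
import math

def givens(n, i, j, theta):
    """n x n rotation by theta in the (i, j) plane; identity elsewhere."""
    m = [[float(r == c) for c in range(n)] for r in range(n)]
    m[i][i] = m[j][j] = math.cos(theta)
    m[i][j] = math.sin(theta)
    m[j][i] = -math.sin(theta)
    return m

def matmul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def max_abs_diff(a, b):
    """Largest element-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

n = 4
g01 = givens(n, 0, 1, 0.3)
g23 = givens(n, 2, 3, 1.1)
g12 = givens(n, 1, 2, 0.5)

# Gates on disjoint pairs (same layer) commute; gates sharing an index do not.
disjoint_gap = max_abs_diff(matmul(g01, g23), matmul(g23, g01))
overlap_gap = max_abs_diff(matmul(g01, g12), matmul(g12, g01))
print(disjoint_gap, overlap_gap)
```

Because gates within one layer touch disjoint qubit pairs, their order does not matter, which is what allows a whole layer to be treated as a single unit during gradient estimation.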

Clinical Application: MIMIC-III Data Imputation

Dataset

  • MIMIC-III: Electronic health record dataset
  • Task: Clinical data imputation for downstream patient survival prediction
  • Sensitive to optimization instability and model variance

Results

  • Trained directly on IonQ Forte Enterprise trapped-ion hardware at 16 qubits
  • No performance degradation vs ideal/noisy simulation
  • Tensor-network simulation validated at 32 qubits
  • 32-qubit inference executed on hardware
  • Matches or exceeds strong classical neural baselines in patient survival prediction
  • Reduced variance across training runs

Implementation Guide

Step 1: Design Butterfly Circuit

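# Sketch of Butterfly layer construction (the FFT-style pairing is an assumed layout)
def butterfly_layer(n_qubits, stride=1):
    """Return the qubit pairs coupled by one subspace-preserving Butterfly layer.
    Stacking strides 1, 2, 4, ... gives log2(n) layers of n/2 gates each:
    O(n log n) parameters, logarithmic depth. Assumes n_qubits is a power of two.
    Pairs within a layer are disjoint, so their gates commute."""
    # Each returned pair carries one 2-qubit gate parameterized with rotation angles
    return [(q, q + stride) for q in range(n_qubits) if not q & stride]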
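To make the layer structure concrete, here is a small classical simulation of a 4-qubit Butterfly acting on a data vector. It assumes the data is loaded as amplitudes in the unary (Hamming-weight-1) basis and that each gate rotates the two amplitudes of its pair; the hand-written layout, the angles, and `apply_layer` are hypothetical and not taken from the paper.

```python
import math

def apply_layer(state, pairs, thetas):
    """Rotate amplitudes (i, j) by theta for each disjoint pair; returns a new list."""
    out = list(state)
    for (i, j), theta in zip(pairs, thetas):
        c, s = math.cos(theta), math.sin(theta)
        out[i] = c * state[i] + s * state[j]
        out[j] = -s * state[i] + c * state[j]
    return out

# Hand-written 4-qubit butterfly: stride 1, then stride 2 (hypothetical layout).
layers = [[(0, 1), (2, 3)], [(0, 2), (1, 3)]]
angles = [[0.3, 1.1], [0.5, -0.4]]

state = [0.5, 0.5, 0.5, 0.5]  # unit-norm data vector in the unary basis
for pairs, thetas in zip(layers, angles):
    state = apply_layer(state, pairs, thetas)

norm_sq = sum(a * a for a in state)
print(state, norm_sq)
```

The norm stays at 1 after every layer, which is the classical counterpart of the circuit never leaving its subspace.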

Step 2: Layer-Wise Training Loop

# Pseudocode for layer-wise training
# (the qnn methods and evaluate() are placeholders, not a real library API)
def train_layerwise(qnn, data, val_data, n_layers):
    """Train QNN one layer at a time on hardware."""
    history = []
    for layer_idx in range(n_layers):
        # Freeze all layers except current
        qnn.freeze_except(layer_idx)

        # Optimize current layer on hardware
        gradients = parallel_parameter_shift(qnn, layer_idx)
        qnn.update_layer(layer_idx, gradients)

        # Evaluate on validation set
        history.append(evaluate(qnn, val_data))
    return history
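The loop above can be exercised end to end on a classical toy model. In this sketch the circuit is simulated as pairwise rotations on a 4-entry amplitude vector, the objective (1 minus the probability of reading out index 0) is made up for illustration, and the gradient comes from central finite differences as a plain stand-in for on-hardware estimation; none of these choices come from the paper.

```python
import math

def forward(x, layers, angles):
    """Apply each layer of pairwise rotations to the amplitude vector x."""
    state = list(x)
    for pairs, thetas in zip(layers, angles):
        nxt = list(state)
        for (i, j), t in zip(pairs, thetas):
            c, s = math.cos(t), math.sin(t)
            nxt[i] = c * state[i] + s * state[j]
            nxt[j] = -s * state[i] + c * state[j]
        state = nxt
    return state

def loss(x, layers, angles):
    """Toy objective: 1 minus the probability of reading out index 0."""
    return 1.0 - forward(x, layers, angles)[0] ** 2

def layer_gradient(x, layers, angles, layer_idx, eps=1e-4):
    """Finite-difference stand-in for on-hardware gradient estimation."""
    grads = []
    for k in range(len(angles[layer_idx])):
        shifted = [list(a) for a in angles]
        shifted[layer_idx][k] += eps
        up = loss(x, layers, shifted)
        shifted[layer_idx][k] -= 2 * eps
        down = loss(x, layers, shifted)
        grads.append((up - down) / (2 * eps))
    return grads

def train_layerwise(x, layers, angles, steps=50, lr=0.5):
    """Optimise one layer at a time; all other layers stay frozen."""
    history = [loss(x, layers, angles)]
    for layer_idx in range(len(layers)):
        for _ in range(steps):
            grads = layer_gradient(x, layers, angles, layer_idx)
            angles[layer_idx] = [t - lr * g
                                 for t, g in zip(angles[layer_idx], grads)]
        history.append(loss(x, layers, angles))
    return history

layers = [[(0, 1), (2, 3)], [(0, 2), (1, 3)]]  # hypothetical 4-qubit butterfly
angles = [[0.0, 0.0], [0.0, 0.0]]              # identity initialisation
history = train_layerwise([0.5, 0.5, 0.5, 0.5], layers, angles)
print(history)
```

Each optimisation phase only ever touches the angles of one layer, so the search space per phase stays small. The trade-off is that a single greedy pass need not reach the global optimum of the full circuit.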

Step 3: Parallel Parameter-Shift

# Pseudocode for parallel gradient extraction
# (find_commuting_groups, measure_in_parallel and compute_shift_gradients are placeholder helpers)
def parallel_parameter_shift(qnn, layer_idx):
    """Extract all gradients for a layer in a constant number of circuit executions.
    Exploits the commuting structure of the Butterfly layer's gates."""
    # Group the layer's gates into mutually commuting sets
    commutation_groups = find_commuting_groups(qnn, layer_idx)

    # Gates in one group share the same shifted circuit executions
    all_gradients = []
    for group in commutation_groups:
        results = measure_in_parallel(qnn, group)
        gradients = compute_shift_gradients(results)
        all_gradients.extend(gradients)

    return all_gradients
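A restricted case shows why simultaneous shifts can work. The sketch below assumes a final layer of rotations on disjoint amplitude pairs, read out in the computational basis, with a weighted sum of output probabilities as the objective. Each output then depends on one gate only, so shifting every angle at once yields the same gradients as shifting them one by one. This is an illustration under those assumptions, not the paper's general scheme; all names and numbers are hypothetical.

```python
import math

SHIFT = math.pi / 4  # outputs vary as cos(2*theta), so a pi/4 shift is exact

def output_probs(amps, pairs, thetas):
    """Readout probabilities after one layer of rotations on disjoint pairs."""
    out = list(amps)
    for (i, j), t in zip(pairs, thetas):
        c, s = math.cos(t), math.sin(t)
        out[i] = c * amps[i] + s * amps[j]
        out[j] = -s * amps[i] + c * amps[j]
    return [a * a for a in out]

def objective(amps, pairs, thetas, weights):
    """Weighted sum of readout probabilities (a diagonal observable)."""
    return sum(w * p for w, p in zip(weights, output_probs(amps, pairs, thetas)))

def sequential_shift(amps, pairs, thetas, weights):
    """Baseline: shift one angle at a time, two circuit evaluations per gate."""
    grads, n_circuits = [], 0
    for g in range(len(pairs)):
        plus, minus = list(thetas), list(thetas)
        plus[g] += SHIFT
        minus[g] -= SHIFT
        grads.append(objective(amps, pairs, plus, weights)
                     - objective(amps, pairs, minus, weights))
        n_circuits += 2
    return grads, n_circuits

def parallel_shift(amps, pairs, thetas, weights):
    """Shift every angle at once; read each gate's gradient from its own outputs."""
    p_plus = output_probs(amps, pairs, [t + SHIFT for t in thetas])
    p_minus = output_probs(amps, pairs, [t - SHIFT for t in thetas])
    grads = [sum(weights[k] * (p_plus[k] - p_minus[k]) for k in pair)
             for pair in pairs]
    return grads, 2

amps = [0.1, 0.5, 0.7, 0.5]  # unit norm: 0.01 + 0.25 + 0.49 + 0.25
pairs = [(0, 1), (2, 3)]
thetas = [0.3, -0.8]
weights = [1.0, -0.5, 0.25, 2.0]
seq_grads, seq_circuits = sequential_shift(amps, pairs, thetas, weights)
par_grads, par_circuits = parallel_shift(amps, pairs, thetas, weights)
print(seq_grads, par_grads, seq_circuits, par_circuits)
```

The sequential baseline needs two circuits per gate, while the simultaneous version always needs two circuits regardless of how many gates the layer contains.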

Scalability

Qubit Count | Training Method | Hardware
16 | Direct hardware training | IonQ Forte Enterprise
32 | Tensor-network simulation + hardware inference | IonQ Forte Enterprise

When to Use

  • Designing quantum neural networks for clinical/healthcare data
  • Training QNNs directly on NISQ hardware
  • Facing gradient estimation bottlenecks in parameter-shift methods
  • Building hybrid classical-quantum models for EHR/imputation tasks
  • Scaling QNN training beyond 10-15 qubits

Key Insights

  1. Structured circuits matter: Butterfly architecture provides both expressivity and trainability
  2. Layer-wise > global: Training layers sequentially avoids the instability of global parameter optimization
  3. Commuting = parallel: Commuting gate structure enables constant-cost gradient extraction
  4. Clinical data is demanding: MIMIC-III is sensitive to optimization instability - good benchmark

Activation Keywords

scalable qnn training, butterfly circuit, layer-wise training, clinical data imputation, MIMIC-III, parallel parameter-shift, hardware optimization, IonQ, trapped-ion, gradient estimation, hybrid quantum-classical, electronic health records, patient survival prediction, O(log n) gradient, subspace-preserving circuit

Install via CLI
npx skills add https://github.com/hiyenwong/ai_collection --skill scalable-on-hardware-qnn-training