quantum-on-hardware-qnn-training - SKILL.md Agent Skill

name: quantum-on-hardware-qnn-training
description: Scalable on-hardware quantum neural network training methodology using Butterfly circuits, layer-wise optimization, and parallelized parameter-shift rules. Reduces gradient estimation cost from O(n²) to O(log n).
category: quantum-medical
created: 2026-06-11
tags: [quantum, qnn, hardware, clinical, butterfly-circuit, parameter-shift, medical]
activation: quantum neural network training, on-hardware QNN, butterfly circuit, clinical data imputation, gradient estimation, parameter-shift, MIMIC, trapped-ion
source_paper: "arXiv:2606.03517 - Scalable On-Hardware Training of QNNs and Clinical Data Imputation"

Scalable On-Hardware QNN Training

Context

Training quantum neural networks (QNNs) on quantum hardware is bottlenecked by the cost of gradient estimation: standard parameter-shift methods evaluate shifted circuits separately for each trainable parameter, requiring O(n²) circuit evaluations per optimization step, which makes hardware-based optimization impractical beyond small system sizes.

Core Methodology

1. Butterfly Circuit Architecture

Use a structured, subspace-preserving Butterfly circuit with O(n log n) parameters
Keep circuit depth logarithmic, O(log n)
Exploit the commuting structure within each layer for parallel gradient extraction
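The parameter and depth counts above can be checked with a small sketch of the butterfly connectivity. It assumes n is a power of two, one trainable angle per two-qubit gate, and strides that double from layer to layer; the source does not state the gate set or the stride order, so the function name `butterfly_layers` and these conventions are illustrative only.

```python
def butterfly_layers(n):
    """Return the qubit pairs of each butterfly layer for n qubits.

    Assumptions (not taken from the source): n is a power of two,
    layer k couples qubits i and i + 2**k, and each pair carries
    exactly one trainable parameter.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    layers = []
    stride = 1
    while stride < n:
        # i is the lower index of a pair when its stride-block index is even
        pairs = [(i, i + stride) for i in range(n) if (i // stride) % 2 == 0]
        layers.append(pairs)
        stride *= 2
    return layers


def parameter_count(n):
    """Total number of two-qubit gates, i.e. (n / 2) * log2(n) under the assumptions above."""
    return sum(len(pairs) for pairs in butterfly_layers(n))
```

Within a layer every qubit appears in exactly one pair, so the gates of that layer act on disjoint qubits and commute with each other. This is the structural property the parallel gradient extraction relies on.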

2. Layer-Wise Training Strategy

Confine on-hardware optimization to one small, well-structured layer at a time
Avoid optimizing all parameters simultaneously
Build network incrementally layer by layer
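A minimal classical sketch of this layer-wise loop follows. It is a toy stand-in, not the source's implementation: each butterfly layer is emulated as plane rotations on pairs of entries of a real vector, the loss is a squared distance to a target vector, and central finite differences replace on-hardware gradient estimation. All function names and hyperparameters are hypothetical.

```python
import numpy as np


def apply_layer(x, angles, stride):
    """Rotate each pair (i, i + stride) of x by its own angle."""
    y = x.copy()
    k = 0
    for i in range(len(x)):
        if (i // stride) % 2 == 0:  # i is the lower index of a pair
            j = i + stride
            c, s = np.cos(angles[k]), np.sin(angles[k])
            y[i], y[j] = c * x[i] - s * x[j], s * x[i] + c * x[j]
            k += 1
    return y


def forward(x, layers):
    """Apply the layers in order; layer d uses stride 2**d."""
    for depth, angles in enumerate(layers):
        x = apply_layer(x, angles, 2 ** depth)
    return x


def loss(layers, x, target):
    return float(np.sum((forward(x, layers) - target) ** 2))


def train_layerwise(x, target, n_layers, steps=200, lr=0.1, eps=1e-6):
    """Add one layer at a time and optimize only the newest layer."""
    layers, history = [], []
    for _ in range(n_layers):
        layers.append(np.zeros(len(x) // 2))  # new layer starts as the identity
        active = layers[-1]                   # earlier layers stay frozen
        for _ in range(steps):
            grad = np.zeros_like(active)
            for k in range(len(active)):
                active[k] += eps
                up = loss(layers, x, target)
                active[k] -= 2 * eps
                down = loss(layers, x, target)
                active[k] += eps              # restore the parameter
                grad[k] = (up - down) / (2 * eps)
            active -= lr * grad               # in-place update of the newest layer
        history.append(loss(layers, x, target))
    return layers, history
```

Initializing each new layer to the identity means that adding it never increases the loss before training starts, so the recorded loss after each layer is non-increasing for a sufficiently small learning rate.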

3. Parallelized Parameter-Shift Rule

Exploit the commuting structure within each Butterfly layer
Extract all gradients of a layer in a constant number of circuit executions
Reduce the number of distinct circuit evaluations per optimization step from O(n²) to O(log n): a constant number per layer across O(log n) layers
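For reference, the sketch below shows the standard two-term parameter-shift rule that serves as the baseline, checked against central finite differences. It is not the parallelized variant, whose construction the source does not spell out. The setup is an assumption chosen for simplicity: one layer of single-qubit RY rotations acting on |0...0⟩, simulated with NumPy, and a random real symmetric matrix as the observable.

```python
import numpy as np


def ry(theta):
    """Single-qubit RY rotation, exp(-i * theta * Y / 2), as a real matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def layer_state(thetas):
    """State after applying RY(theta_i) to qubit i of |0...0>."""
    state = np.array([1.0])
    for t in thetas:
        state = np.kron(state, ry(t) @ np.array([1.0, 0.0]))
    return state


def cost(thetas, H):
    """Expectation value <psi(thetas)| H |psi(thetas)> for real symmetric H."""
    psi = layer_state(thetas)
    return float(psi @ H @ psi)


def parameter_shift_grad(thetas, H):
    """Two-term rule: df/dtheta_i = [f(theta_i + pi/2) - f(theta_i - pi/2)] / 2."""
    grad = np.zeros_like(thetas)
    for i in range(len(thetas)):
        shift = np.zeros_like(thetas)
        shift[i] = np.pi / 2
        grad[i] = 0.5 * (cost(thetas + shift, H) - cost(thetas - shift, H))
    return grad


def finite_diff_grad(thetas, H, eps=1e-5):
    """Central finite differences, used only as a numerical cross-check."""
    grad = np.zeros_like(thetas)
    for i in range(len(thetas)):
        step = np.zeros_like(thetas)
        step[i] = eps
        grad[i] = (cost(thetas + step, H) - cost(thetas - step, H)) / (2 * eps)
    return grad
```

This baseline costs two circuit evaluations per parameter, so its cost grows with the parameter count. The two-term rule is exact here because the RY generator has two eigenvalues; gates with richer generator spectra need a rule with more shift terms.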

Application: Clinical Data Imputation

MIMIC-III Electronic Health Records

Use MIMIC-III as a demanding benchmark that is sensitive to optimization instability and model variance
Build hybrid classical-quantum models for clinical data
Train directly on trapped-ion hardware (e.g., IonQ Forte Enterprise) at 16+ qubits
Validate with tensor-network simulation at 32 qubits

Validation Metrics

Match or exceed classical neural baselines in downstream prediction tasks
Demonstrate reduced variance across runs
Execute inference on real hardware without performance degradation relative to simulation

Implementation Steps

1. Design Butterfly Circuit
   - Create structured ansatz with O(n log n) parameters
   - Ensure subspace-preserving property
   - Verify logarithmic circuit depth
2. Implement Parallelized Parameter-Shift
   - Identify commuting parameter groups within each layer
   - Design circuit evaluations to extract all gradients simultaneously
   - Validate gradient accuracy against numerical differentiation
3. Layer-Wise Training Loop
   - Initialize first layer, train to convergence
   - Freeze trained layer, add next layer
   - Repeat until full network depth achieved
   - Optional: fine-tune all layers with reduced learning rate
4. Hardware Deployment
   - Compile circuits for target hardware (trapped-ion preferred)
   - Validate on 16+ qubit systems
   - Scale to 32+ qubits via tensor-network simulation
   - Execute inference on real hardware

Key Benefits

Scalability: O(log n) distinct circuit evaluations per optimization step for gradient estimation, versus O(n²) for standard approaches
Hardware Feasibility: Makes QNN training practical on NISQ-era devices
Clinical Relevance: Direct application to medical data imputation and prediction
Reduced Variance: More stable optimization across training runs

Pitfalls

Butterfly architecture restricts expressivity; verify that the task is compatible with it
Layer-wise training may get stuck in local optima; consider a final fine-tuning phase over all layers
Hardware noise still affects results; use error mitigation techniques
Identifying commuting parameter groups requires careful circuit analysis

Verification

Compare training loss curves against standard parameter-shift baseline
Validate gradient correctness via numerical differentiation on small circuits
Check that 16-qubit hardware results match noisy simulation predictions
Ensure downstream task performance matches classical baselines
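One way to make the hardware-versus-simulation check concrete is to compare estimated outcome probabilities against their shot-noise uncertainty. The helper below is a hypothetical sketch, not a procedure from the source: it assumes both estimates are binomial proportions obtained from independent shots, and the 3-sigma threshold is an arbitrary default.

```python
from math import sqrt


def estimates_agree(p_hw, shots_hw, p_sim, shots_sim, n_sigma=3.0):
    """Return True if two estimated probabilities agree within
    n_sigma combined binomial standard errors.

    p_hw, p_sim: estimated probabilities from hardware and noisy simulation
    shots_hw, shots_sim: number of shots behind each estimate
    """
    variance = p_hw * (1 - p_hw) / shots_hw + p_sim * (1 - p_sim) / shots_sim
    return abs(p_hw - p_sim) <= n_sigma * sqrt(variance)
```

A disagreement under this test points to a mismatch between the noise model and the device rather than to shot noise alone; agreement only means the difference is not resolvable at the given shot count.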