name: quanforge-qnn-testing
description: "Mutation testing framework for Quantum Neural Networks (QNNs) based on the QuanForge methodology (arXiv:2604.20706). Use this skill when testing QNN robustness, analyzing quantum circuit vulnerabilities, performing mutation testing on quantum ML models, localizing weak regions in quantum circuits, or comparing QNN test suites. Also triggered by keywords: quantum testing, QNN testing, mutation testing, 量子测试, 量子神经网络测试."
QuanForge: Mutation Testing for Quantum Neural Networks
A comprehensive skill for testing Quantum Neural Networks (QNNs) using mutation testing methodology, based on the QuanForge framework (arXiv:2604.20706).
When to Use
Activate this skill when the user needs to:
- Test QNN robustness — evaluate how well a trained quantum neural network resists circuit-level faults
- Perform mutation testing on quantum circuits — systematically inject faults and measure detection rates
- Compare QNN test suites — determine which test data better exposes circuit vulnerabilities
- Localize vulnerable circuit regions — identify which layers or gates are most prone to failure
- Assess QNN structural quality — guide data enhancement and circuit redesign decisions
- Evaluate noise resilience — assess mutation killing under simulated noisy quantum conditions
Activation keywords: quantum testing, QNN testing, mutation testing, 量子测试, 量子神经网络测试
Core Concepts
Mutation Testing for QNNs
Mutation testing is a fault-based testing technique where small, deliberate changes (mutations) are introduced into a program to evaluate test quality. For QNNs, mutations are applied to the quantum circuit after training to simulate potential errors. A mutant is "killed" if the test suite detects a statistically significant behavioral change.
Statistical Mutation Killing
Unlike classical mutation testing with deterministic outcomes, QNN measurements are inherently probabilistic. QuanForge uses statistical hypothesis testing to determine if a mutant is killed:
- Run both the original QNN and the mutated QNN on the same test input
- Collect measurement outcome distributions from both
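- Apply a two-sample statistical test (e.g., chi-squared or Kolmogorov-Smirnov) to compare the distributions; a distance measure such as the Wasserstein distance yields a p-value only when paired with a resampling procedure such as a permutation test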
- If the distributions differ significantly (p-value < significance level α, typically 0.05), the mutant is killed
- Otherwise, the mutant survives
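This approach accounts for quantum measurement stochasticity: the significance level α bounds, per comparison, the chance that shot noise alone kills a behaviorally unchanged mutant, which keeps the killing criterion repeatable.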
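The Wasserstein distance mentioned above is a distance, not a test, so it needs a reference distribution before it can produce a p-value. The sketch below shows one common way to get one, a permutation test; it is an illustration under stated assumptions, not the procedure confirmed by the QuanForge paper. The function name, the `n_perm` default, and the use of per-shot numeric outcomes as samples are all assumptions.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def wasserstein_permutation_pvalue(a, b, n_perm=500, seed=0):
    """Permutation p-value for the Wasserstein distance between two samples.

    a, b: 1-D arrays of per-shot numeric outcomes (assumed encoding).
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = wasserstein_distance(a, b)
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel shots at random under the null hypothesis
        d = wasserstein_distance(pooled[:len(a)], pooled[len(a):])
        hits += int(d >= observed)
    # Add-one correction keeps the p-value strictly positive
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: a clearly shifted "mutant" sample versus the original
data_rng = np.random.default_rng(1)
original_shots = data_rng.normal(0.0, 1.0, 200)
mutant_shots = data_rng.normal(2.0, 1.0, 200)
p_shifted = wasserstein_permutation_pvalue(original_shots, mutant_shots)
p_identical = wasserstein_permutation_pvalue(original_shots, original_shots)
```

A mutant would be killed when the returned p-value falls below α. Comparing a sample with itself gives a distance of zero, so every permutation matches or exceeds it and the p-value is 1.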
Key formula: The Mutation Score (MS) is calculated as:
MS = (Number of killed mutants) / (Total number of mutants) × 100%
A higher mutation score indicates a more effective test suite and a more thoroughly validated QNN.
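As a small worked instance of the formula, the helper below computes the score for hypothetical counts. The function name and the input validation are illustrative additions.

```python
def mutation_score(killed: int, total: int) -> float:
    """Return the mutation score as a percentage of killed mutants."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= killed <= total:
        raise ValueError("killed must be between 0 and total")
    return 100.0 * killed / total

# Hypothetical run: 37 of 50 mutants killed
score = mutation_score(37, 50)
print(f"MS = {score:.1f}%")  # prints "MS = 74.0%"
```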
The 9 Mutation Operators
QuanForge defines nine post-training mutation operators at three levels: gate, parameter, and measurement. These operators simulate realistic errors that may occur during quantum circuit execution.
Gate-Level Operators (5 operators)
| # | Operator | Description | Effect |
|---|---|---|---|
| G1 | Gate Omission | Remove a randomly selected gate from the circuit | Simulates gate execution failure |
| G2 | Gate Insertion | Insert a random gate at a random position | Simulates spurious gate operations |
| G3 | Gate Replacement | Replace one gate type with another (e.g., H → I, CNOT → CZ) | Simulates gate misconfiguration |
| G4 | Gate Duplication | Duplicate an existing gate (e.g., two consecutive H gates) | Simulates redundant operations |
| G5 | Gate Reordering | Swap the positions of two adjacent gates | Simulates timing/ordering errors |
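To make the gate-level operators concrete, here is a minimal sketch on a deliberately simple circuit representation: an ordered list of `(gate_name, qubits)` tuples. That representation and the function names are assumptions for illustration. Rotation angles are left out, and a real framework would operate on its own circuit objects.

```python
# Hypothetical circuit: ordered list of (gate_name, qubits) tuples
circuit = [("H", (0,)), ("CNOT", (0, 1)), ("RZ", (1,)), ("H", (1,))]

def gate_omission(gates, i):
    """G1: drop the gate at position i."""
    return gates[:i] + gates[i + 1:]

def gate_insertion(gates, i, new_gate):
    """G2: insert new_gate before position i."""
    return gates[:i] + [new_gate] + gates[i:]

def gate_replacement(gates, i, new_name):
    """G3: change the gate type at position i, keeping its qubits."""
    _, qubits = gates[i]
    return gates[:i] + [(new_name, qubits)] + gates[i + 1:]

def gate_duplication(gates, i):
    """G4: repeat the gate at position i immediately after itself."""
    return gates[:i + 1] + [gates[i]] + gates[i + 1:]

def gate_reordering(gates, i):
    """G5: swap the gates at positions i and i + 1."""
    mutated = list(gates)
    mutated[i], mutated[i + 1] = mutated[i + 1], mutated[i]
    return mutated

omitted = gate_omission(circuit, 1)
inserted = gate_insertion(circuit, 2, ("X", (0,)))
replaced = gate_replacement(circuit, 1, "CZ")
duplicated = gate_duplication(circuit, 0)
reordered = gate_reordering(circuit, 0)
```

Each function returns a new list and leaves the original circuit untouched, which keeps mutants independent of one another.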
Parameter-Level Operators (3 operators)
| # | Operator | Description | Effect |
|---|---|---|---|
| P1 | Parameter Perturbation | Add random noise to rotation angles (θ → θ + δ) | Simulates calibration errors |
| P2 | Parameter Dropout | Set a parameter to zero or a fixed value | Simulates parameter freeze/failure |
| P3 | Parameter Scaling | Multiply a parameter by a random factor | Simulates systematic scaling errors |
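The parameter-level operators can be sketched on a plain NumPy array of rotation angles. The array shape `(n_layers, n_qubits)`, the sample values, and the function names are assumptions for illustration.

```python
import numpy as np

# Hypothetical trained angles, shape (n_layers, n_qubits), in radians
params = np.array([[0.3, -1.2], [2.0, 0.7]])

def parameter_perturbation(params, index, delta):
    """P1: add delta to one angle."""
    mutated = params.copy()
    mutated[index] += delta
    return mutated

def parameter_dropout(params, index, value=0.0):
    """P2: force one angle to a fixed value (zero by default)."""
    mutated = params.copy()
    mutated[index] = value
    return mutated

def parameter_scaling(params, index, factor):
    """P3: multiply one angle by a factor."""
    mutated = params.copy()
    mutated[index] *= factor
    return mutated

perturbed = parameter_perturbation(params, (0, 1), 0.1)
dropped = parameter_dropout(params, (1, 1))
scaled = parameter_scaling(params, (1, 0), 1.5)
```

Note that dropping an angle that is already zero, or scaling by a factor of 1, yields an equivalent mutant and should be rejected.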
Measurement-Level Operator (1 operator)
| # | Operator | Description | Effect |
|---|---|---|---|
| M1 | Measurement Alteration | Change measurement basis or insert additional measurements | Simulates measurement errors |
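A change of measurement basis can be illustrated on a single qubit with plain NumPy: measuring in the X basis is equivalent to applying a Hadamard and then measuring in the Z basis. The function below is a hypothetical helper for this illustration, not part of any quantum library.

```python
import numpy as np

HADAMARD = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def measurement_probs(state, basis="Z"):
    """Outcome probabilities for one qubit measured in the Z or X basis."""
    state = np.asarray(state, dtype=complex)
    if basis == "X":
        state = HADAMARD @ state  # rotate the X basis onto the Z basis
    elif basis != "Z":
        raise ValueError("basis must be 'Z' or 'X'")
    return np.abs(state) ** 2

ket0 = np.array([1.0, 0.0])
p_z = measurement_probs(ket0, "Z")  # outcome 0 with certainty
p_x = measurement_probs(ket0, "X")  # both outcomes equally likely
```

For the state |0⟩ the basis change turns a deterministic outcome into a uniform one, which is the kind of distribution shift a statistical killing test would pick up easily.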
Mutant Generation Algorithm
QuanForge uses a systematic algorithm to generate effective mutants:
```
Algorithm: QuanForge Mutant Generation
Input: Trained QNN circuit C, operator set O, mutants per operator N
Output: Set of mutants M

M ← ∅
For each operator op in O:
    For i = 1 to N:
        C_mutant ← Apply(C, op, parameters)
        If C_mutant ≠ C:                      # Reject mutants identical to C
            M ← M ∪ {C_mutant}
        Else:
            Retry with different parameters
Return M
```
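The generation loop can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the operator signature `op(circuit, rng)`, the `max_retries` cap, and the gate-list circuit are assumptions. The inequality check only rejects syntactically identical mutants; behavioral equivalence needs a separate check.

```python
import random

def generate_mutants(circuit, operators, n_per_operator, max_retries=20, seed=0):
    """Return a list of (operator_name, mutated_circuit) pairs.

    operators: dict mapping a name to a callable op(circuit, rng) that
    returns a mutated copy (assumed interface).
    """
    rng = random.Random(seed)
    mutants = []
    for name, op in operators.items():
        for _ in range(n_per_operator):
            for _ in range(max_retries):  # retry with different random choices
                candidate = op(circuit, rng)
                if candidate != circuit:
                    mutants.append((name, candidate))
                    break
    return mutants

def omit_random_gate(gates, rng):
    i = rng.randrange(len(gates))
    return gates[:i] + gates[i + 1:]

def swap_adjacent_gates(gates, rng):
    i = rng.randrange(len(gates) - 1)
    mutated = list(gates)
    mutated[i], mutated[i + 1] = mutated[i + 1], mutated[i]
    return mutated

# Swapping the two leading H gates reproduces the circuit and triggers a retry
circuit = [("H", (0,)), ("H", (0,)), ("CNOT", (0, 1))]
operators = {"omission": omit_random_gate, "reordering": swap_adjacent_gates}
mutants = generate_mutants(circuit, operators, n_per_operator=5)
```

The retry cap is a design choice that prevents an infinite loop when an operator cannot produce a different circuit. The result is a list, so duplicate mutants are possible and may need to be removed.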
Guidelines for effective mutant generation:
- Generate mutants for each operator independently (single-mutant strategy)
- Ensure mutants are non-equivalent — the mutation must change circuit behavior
- Use stratified sampling across circuit layers and qubits
- Adjust perturbation magnitude based on the circuit's parameter ranges
- Generate sufficient mutants per operator (typically 10-50) for statistical reliability
Workflow: QNN Mutation Testing
Step 1: Prepare the Trained QNN
Ensure the QNN is fully trained and its parameters are fixed. The mutation testing framework operates on the post-training circuit.
```python
# Example: Load a trained QNN (PennyLane / Qiskit style)
import pennylane as qml
from pennylane import numpy as np

# Load trained parameters; assumed array shape is (n_layers, n_qubits)
params = np.load("trained_params.npy")
n_qubits = params.shape[1]
n_layers = params.shape[0]
```
Step 2: Select Mutation Operators
Choose which operators to apply based on the analysis goals:
- Comprehensive analysis: Apply all 9 operators
- Gate-level focus: Use G1-G5 to assess circuit topology robustness
- Parameter-level focus: Use P1-P3 to assess parameter sensitivity
- Hardware-aware: Prioritize operators that map to realistic hardware errors
Step 3: Generate Mutants
Apply each operator to produce mutants. Track which operator generated each mutant for later analysis.
```python
# `circuit` is assumed to expose copy(), remove_gate(), and replace_gate();
# these are placeholder methods, not the API of a specific library.

def apply_gate_omission(circuit, gate_index):
    """G1: Remove gate at specified index."""
    mutated = circuit.copy()
    mutated.remove_gate(gate_index)
    return mutated

def apply_parameter_perturbation(params, param_index, delta):
    """P1: Add noise to a parameter."""
    perturbed = params.copy()
    perturbed[param_index] += delta
    return perturbed

def apply_gate_replacement(circuit, gate_index, new_gate):
    """G3: Replace gate type (e.g., Hadamard → Identity)."""
    mutated = circuit.copy()
    mutated.replace_gate(gate_index, new_gate)
    return mutated
```
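The workflow code in this skill reads attributes such as `mutant.id`, `mutant.operator_type`, `mutant.target_layer`, `mutant.circuit`, `mutant.params`, and `mutant.killed`, and calls `copy()`, `remove_gate()`, and `replace_gate()` on circuits. One possible way to provide them is sketched below. Both classes are hypothetical stand-ins for whatever the chosen framework offers.

```python
from dataclasses import dataclass
from typing import Any, Optional

class GateListCircuit:
    """Hypothetical circuit wrapper around a list of (gate_name, qubits) tuples."""

    def __init__(self, gates):
        self.gates = list(gates)

    def copy(self):
        return GateListCircuit(self.gates)

    def remove_gate(self, index):
        del self.gates[index]

    def replace_gate(self, index, new_gate):
        _, qubits = self.gates[index]
        self.gates[index] = (new_gate, qubits)

@dataclass
class Mutant:
    """Record that keeps a mutant together with its provenance."""
    id: int
    operator_type: str
    target_layer: int
    circuit: GateListCircuit
    params: Any
    killed: Optional[bool] = None  # filled in by the killing step

original = GateListCircuit([("H", (0,)), ("CNOT", (0, 1)), ("H", (1,))])

# Build a G1 (gate omission) mutant by hand
mutated_circuit = original.copy()
mutated_circuit.remove_gate(1)
mutant = Mutant(id=0, operator_type="G1", target_layer=0,
                circuit=mutated_circuit, params=[0.1, 0.2])
```

Recording the operator and target layer at creation time is what makes the per-operator and per-layer breakdowns possible later.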
Step 4: Execute Tests on Original and Mutants
Run both the original QNN and each mutant on the test suite, collecting output distributions.
```python
def run_qnn_test(circuit, params, test_data, n_shots=1024):
    """Run QNN on test data and collect one measurement distribution per sample."""
    results = []
    for sample in test_data:
        # execute_circuit is a placeholder for the backend call; it should
        # return the outcome counts for `sample` over n_shots measurements
        outcome = execute_circuit(circuit, params, sample, shots=n_shots)
        results.append(outcome)
    return results

# Test original
original_results = run_qnn_test(original_circuit, original_params, test_data)

# Test each mutant, keyed by mutant ID so Step 5 can look the results up
mutant_results = {}
for mutant in mutants:
    mutant_results[mutant.id] = run_qnn_test(mutant.circuit, mutant.params, test_data)
```
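`execute_circuit` is left undefined above because it depends on the backend. For experimenting with the rest of the pipeline without a quantum framework, a toy stand-in can be useful. The sketch below is an assumption, not a real QNN: it models a single qubit where the input is encoded as RY(x), every parameter adds a further RY rotation, and the qubit is measured in the Z basis, so P(1) = sin²((x + Σθ)/2). The `circuit` argument is ignored.

```python
import numpy as np

_rng = np.random.default_rng(0)

def execute_circuit(circuit, params, sample, shots=1024):
    """Toy single-qubit model returning outcome counts [n_zeros, n_ones].

    Consecutive RY rotations add their angles, so the total rotation is
    sample + sum(params). `circuit` is unused in this stand-in.
    """
    angle = float(sample) + float(np.sum(params))
    p_one = np.sin(angle / 2) ** 2
    ones = _rng.binomial(shots, p_one)
    return np.array([shots - ones, ones])

counts = execute_circuit(None, [0.4, 0.3], 0.5)
counts_zero_angle = execute_circuit(None, [0.0], 0.0)
counts_pi_angle = execute_circuit(None, [np.pi], 0.0)
```

Returning counts over a fixed set of outcomes gives the killing step histograms with matching bins for the original and the mutant.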
Step 5: Statistical Mutation Killing
Apply statistical tests to determine if each mutant is killed.
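```python
import numpy as np
from scipy import stats

def is_mutant_killed(original_counts, mutant_counts, alpha=0.05):
    """Determine if a mutant is killed on one test input.

    Uses a chi-squared test of homogeneity (scipy's chi2_contingency) on the
    two outcome histograms, because both are finite-shot samples rather than
    exact expected frequencies.
    Returns (killed: bool, p_value: float, test_statistic: float)
    """
    # Both histograms must use the same bins (one count per outcome)
    table = np.array([original_counts, mutant_counts], dtype=float)
    table = table[:, table.sum(axis=0) > 0]  # drop outcomes neither circuit produced
    if table.shape[1] < 2:
        return False, 1.0, 0.0  # both circuits always returned the same outcome
    stat, p_value, _, _ = stats.chi2_contingency(table)
    killed = p_value < alpha
    return killed, p_value, stat

# Evaluate all mutants
killed_count = 0
results_by_operator = {}
for mutant in mutants:
    # Assumption: a mutant is killed if any test input separates it from the
    # original; with many inputs, consider correcting alpha (e.g., Bonferroni)
    killed = any(
        is_mutant_killed(orig, mut)[0]
        for orig, mut in zip(original_results, mutant_results[mutant.id])
    )
    mutant.killed = killed  # stored for the layer-level analysis in Step 6
    if killed:
        killed_count += 1
    op_type = mutant.operator_type
    if op_type not in results_by_operator:
        results_by_operator[op_type] = {"killed": 0, "total": 0}
    results_by_operator[op_type]["total"] += 1
    if killed:
        results_by_operator[op_type]["killed"] += 1
```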
Step 6: Analyze Results
Calculate mutation scores and analyze operator-level performance.
```python
# Overall mutation score
mutation_score = killed_count / len(mutants) * 100

# Per-operator analysis
for op, data in results_by_operator.items():
    op_score = data["killed"] / data["total"] * 100
    print(f"{op}: {op_score:.1f}% killing rate ({data['killed']}/{data['total']})")

# Layer-level analysis: count surviving mutants per targeted layer
surviving_by_layer = {}
for mutant in mutants:
    if not mutant.killed:  # assumes Step 5 recorded a `killed` flag on each mutant
        layer = mutant.target_layer
        surviving_by_layer[layer] = surviving_by_layer.get(layer, 0) + 1

print("Layers with the most surviving mutants (robust, redundant, or under-tested):")
for layer, count in sorted(surviving_by_layer.items(), key=lambda x: -x[1]):
    print(f"  Layer {layer}: {count} surviving mutants")
```
Key Patterns and Techniques
Pattern 1: Gate Replacement Strategies
When replacing gates, use semantically meaningful substitutions:
- Hadamard → Identity (H → I): Removes superposition capability
- CNOT → CZ: Changes entanglement mechanism
- RX → RZ: Changes rotation axis
- Pauli-X → Pauli-Z: Turns a bit flip into a phase flip
- Toffoli → CNOT: Reduces multi-qubit gate complexity
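A substitution is only useful if it changes the circuit's behavior. The sketch below checks this for the same-size pairs listed above by comparing unitaries up to a global phase, using the fact that two d×d unitaries U and V satisfy |tr(U†V)| = d exactly when they differ only by a phase. The helper names are assumptions. The Toffoli → CNOT pair is left out because it changes the number of qubits, so the matrices are not directly comparable.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
CZ = np.diag([1, 1, 1, -1])

def rx(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def rz(theta):
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

def equivalent_up_to_phase(u, v, tol=1e-9):
    """True if u equals v times a global phase factor."""
    overlap = np.trace(np.conj(u).T @ v)
    return bool(abs(abs(overlap) - u.shape[0]) < tol)

substitutions = {
    "H -> I": (H, I2),
    "CNOT -> CZ": (CNOT, CZ),
    "X -> Z": (X, Z),
    "RX(0.5) -> RZ(0.5)": (rx(0.5), rz(0.5)),
    "RX(0) -> RZ(0)": (rx(0.0), rz(0.0)),
}
equivalent = {name: equivalent_up_to_phase(u, v) for name, (u, v) in substitutions.items()}
```

Only the last entry is flagged as equivalent: at a rotation angle of zero both RX and RZ reduce to the identity, so that replacement would produce an equivalent mutant. Gate-level inequivalence does not guarantee that the full circuit's output changes, so this is a filter rather than a proof.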
Pattern 2: Parameter Perturbation Magnitudes
Choose the perturbation δ (in radians) based on the context:
- Small perturbations (δ ≈ 0.01-0.1): Simulate calibration noise
- Medium perturbations (δ ≈ 0.1-0.5): Simulate drift errors
- Large perturbations (δ ≈ 0.5-π): Simulate gross misconfiguration
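The three magnitude bands can be turned into a small sampler. This is a sketch under assumptions: the band names, the uniform draw within each band, and the random sign are illustrative choices that the source does not specify.

```python
import math
import random

# Magnitude bands in radians, taken from the list above
PERTURBATION_RANGES = {
    "small": (0.01, 0.1),
    "medium": (0.1, 0.5),
    "large": (0.5, math.pi),
}

def sample_delta(level, rng):
    """Draw a signed perturbation whose magnitude lies in the chosen band."""
    low, high = PERTURBATION_RANGES[level]
    magnitude = rng.uniform(low, high)
    return magnitude if rng.random() < 0.5 else -magnitude

rng = random.Random(0)
deltas = {level: [sample_delta(level, rng) for _ in range(100)]
          for level in PERTURBATION_RANGES}
```

Because rotation gates are periodic in their angle, magnitudes are capped at π here; a shift by a full period would only reproduce the original gate up to a global phase.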
Pattern 3: Multi-Shot Strategy
Use sufficient shots (n_shots ≥ 1024) for reliable statistical killing:
- Fewer shots → higher variance → more false negatives (surviving mutants)
- More shots → better statistical power → more accurate killing
- Balance shots vs. computational cost based on available resources
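The trade-off can be quantified with the standard error of an estimated outcome probability, √(p(1−p)/n). The helper below is an illustrative sketch; the function name and the chosen shot counts are assumptions.

```python
import math

def shot_standard_error(p, n_shots):
    """Standard error of an outcome probability estimated from n_shots."""
    return math.sqrt(p * (1 - p) / n_shots)

# Worst case p = 0.5 for several shot budgets
errors = {n: shot_standard_error(0.5, n) for n in (64, 256, 1024, 4096)}
for n, err in errors.items():
    print(f"{n:5d} shots: standard error {err:.4f}")
```

At 1024 shots the worst-case standard error is 0.015625, and quadrupling the shots halves it. When two estimated distributions are compared, the uncertainty of their difference is larger by a factor of about √2, so a mutant that shifts an outcome probability by much less than a few standard errors is unlikely to be killed.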
Pattern 4: Layer-by-Layer Vulnerability Mapping
To localize circuit weaknesses:
- Group mutants by which layer they target
- Calculate per-layer killing rates
- Layers with low killing rates are more robust (or tests are insufficient)
- Layers with high killing rates indicate structural vulnerability
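The grouping and rate calculation described above might look like the following sketch. The `(target_layer, killed)` record format and the sample data are assumptions for illustration.

```python
from collections import defaultdict

def per_layer_killing_rates(records):
    """Return {layer: killing rate in percent} from (target_layer, killed) pairs."""
    totals = defaultdict(int)
    kills = defaultdict(int)
    for layer, killed in records:
        totals[layer] += 1
        kills[layer] += int(bool(killed))
    return {layer: 100.0 * kills[layer] / totals[layer] for layer in sorted(totals)}

# Hypothetical outcomes for six mutants across three layers
records = [(0, True), (0, True), (0, False), (1, False), (1, False), (2, True)]
rates = per_layer_killing_rates(records)
for layer, rate in rates.items():
    print(f"Layer {layer}: {rate:.1f}% killed")
```

In this made-up data, layer 1 has a 0% killing rate, which by the reading above points to either a robust region or a gap in the test data, while layer 2 at 100% is the most sensitive to mutation.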
Pattern 5: Test Suite Comparison
To compare two test suites A and B:
- Generate the same mutant set for both
- Calculate MS_A and MS_B separately
- Higher mutation score indicates a more effective test suite
- Use per-operator breakdown to identify what each suite catches
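A comparison along these lines can be sketched as follows. The input format (a mapping from mutant ID to operator, plus one set of killed IDs per suite) and the sample data are assumptions.

```python
def compare_suites(mutant_ops, killed_a, killed_b):
    """Compare two test suites evaluated on the same mutant set.

    mutant_ops: dict mapping mutant ID to operator name.
    killed_a, killed_b: sets of mutant IDs killed by suite A and suite B.
    """
    total = len(mutant_ops)
    summary = {
        "MS_A": 100.0 * len(killed_a) / total,
        "MS_B": 100.0 * len(killed_b) / total,
        "only_A": sorted(killed_a - killed_b),
        "only_B": sorted(killed_b - killed_a),
    }
    per_operator = {}
    for mutant_id, op in mutant_ops.items():
        row = per_operator.setdefault(op, {"A": 0, "B": 0, "total": 0})
        row["total"] += 1
        row["A"] += int(mutant_id in killed_a)
        row["B"] += int(mutant_id in killed_b)
    return summary, per_operator

# Hypothetical results for five mutants
mutant_ops = {1: "G1", 2: "G1", 3: "P1", 4: "P1", 5: "M1"}
killed_a = {1, 2, 3}
killed_b = {1, 3, 4, 5}
summary, per_operator = compare_suites(mutant_ops, killed_a, killed_b)
```

Here suite B has the higher score (80% versus 60%), yet suite A still kills one gate-omission mutant that B misses. The `only_A` and `only_B` lists make such complementary strengths visible.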
Interpreting Results
Mutation Score Benchmarks
| Mutation Score | Interpretation |
|---|---|
| < 30% | Test suite is weak; many mutants survive undetected |
| 30-60% | Moderate test coverage; room for improvement |
| 60-80% | Good test coverage; most faults are detected |
| > 80% | Strong test coverage; consider harder mutants or tougher operators |
Operator Effectiveness
- High killing rate operators simulate errors easily detected by current tests
- Low killing rate operators indicate blind spots in the test suite
- Operators with 0% killing rate may be equivalent mutants or indicate the test data lacks sensitivity to that error type
Surviving Mutants Analysis
Surviving mutants reveal:
- Redundant circuit regions where errors don't affect outputs
- Insufficient test data that doesn't exercise certain circuit behaviors
- Over-parameterized circuits where parameter changes are absorbed
Noisy Simulation Mode
To assess practical feasibility on real quantum hardware:
- Add a noise model (depolarizing, amplitude damping, readout error)
- Re-run mutation testing under noise
- Compare mutation scores with and without noise
- Noise typically reduces mutation scores by masking mutation effects
- A large drop indicates that hardware noise would hide many of the faults the test suite detects in noiseless simulation
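The masking effect can be seen in a deliberately simplified model: a single depolarizing channel applied just before measurement, ρ → (1−p)ρ + p·I/d, which maps outcome probabilities q to (1−p)q + p/d. The sketch below uses that model with made-up distributions. Real noise models act per gate and are not captured here.

```python
import numpy as np

def depolarize(probs, p):
    """Outcome probabilities after a global depolarizing channel of strength p."""
    probs = np.asarray(probs, dtype=float)
    return (1 - p) * probs + p / probs.size

def total_variation(a, b):
    """Total variation distance between two discrete distributions."""
    return float(np.abs(np.asarray(a) - np.asarray(b)).sum() / 2)

# Hypothetical ideal distributions for the original circuit and a mutant
original = np.array([0.9, 0.1])
mutant = np.array([0.6, 0.4])

noise = 0.2
gap_ideal = total_variation(original, mutant)
gap_noisy = total_variation(depolarize(original, noise), depolarize(mutant, noise))
```

Under this model the gap between the original and the mutant shrinks by exactly the factor (1−p), so more shots are needed to kill the same mutant, and with a fixed shot budget some mutants that are killed in noiseless simulation will survive.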
Common Pitfalls
- Equivalent mutants: Some mutations produce functionally identical circuits. Always verify non-equivalence.
- Insufficient shots: Too few measurement shots lead to unreliable statistical killing.
- Single-operator bias: Relying on one operator type gives incomplete coverage. Use all 9.
- Ignoring circuit topology: Mutations should cover all layers and qubits, not just one region.
- Over-interpreting low scores: A low mutation score may indicate a robust circuit, not necessarily a bad test suite.
References
- Paper: "QuanForge: A Mutation Testing Framework for Quantum Neural Networks" (arXiv:2604.20706)
- Core techniques: Statistical hypothesis testing, mutation analysis, quantum circuit simulation
- Recommended frameworks: PennyLane, Qiskit, Cirq for circuit manipulation and execution