name: qcbm
description: Skill for understanding, using, and implementing the Quantum Circuit Born Machine (QCBM) for learning discrete probability distributions (Bars-and-Stripes) via the QCBMAlgorithm class.
Quantum Circuit Born Machine (QCBM)
Purpose
QCBM is an unsupervised generative model: a parameterized quantum circuit prepares a state, and the Born rule turns its measurement outcomes into a probability distribution over bitstrings. Training adjusts the circuit parameters so that this distribution matches a target distribution, by minimizing the KL divergence.
Use this skill when you need to:
- Learn and generate samples from a discrete probability distribution using a quantum circuit.
- Demonstrate generative quantum machine learning with Born-rule probabilities.
Overview
- Target: 2×2 Bars-and-Stripes (BAS) distribution over 4 bits (6 valid patterns out of 16).
- Initialize variational parameters $\theta \in \mathbb{R}^{\text{layers} \times n}$.
- For each epoch: compute Born-rule probabilities $p_\theta(x)$, evaluate the KL divergence against the target, estimate gradients with the Parameter Shift Rule, then update $\theta$ with the Adam optimizer.
Prerequisites
- Born rule: measurement probabilities $p_\theta(x) = |\langle x|\psi(\theta)\rangle|^2$.
- KL divergence; Parameter Shift Rule.
- Adam optimizer.
- Libraries: torch, numpy, and the Circuit class (imported from unitarylab in the skeleton below).
Using the Provided Implementation
```python
from unitarylab_algorithms import QCBMAlgorithm

algo = QCBMAlgorithm(text_mode="plain")
result = algo.run(
    n=4,
    layers=4,
    epochs=40,
    lr=0.1,
    backend='torch'
)
print(f"Final KL Loss: {result['Final KL Loss']:.4f}")
print(f"Circuit path: {result['circuit_path']}")
for f in result['plot']:
    print(f"Output file: {f['filename']} ({f['format']})")
```
Core Parameters Explained
Constructor
| Parameter | Type | Default | Description |
|---|---|---|---|
| text_mode | str | 'plain' | Output text rendering mode. |
| algo_dir | str or None | None | Output directory for plots and circuit. Auto-derived from CWD if None. |
run() Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| n | int | 4 | Number of qubits. BAS target is fixed at 4 qubits. |
| layers | int | 4 | Number of variational layers. |
| epochs | int | 40 | Training epochs. |
| lr | float | 0.1 | Adam optimizer learning rate. |
| backend | str | 'torch' | Simulation backend. |
| device | str | 'cpu' | Compute device. |
| dtype | type | np.complex128 | State vector dtype. |
Return Fields
| Key | Type | Description |
|---|---|---|
| status | str | 'ok' on success, 'failed' otherwise. |
| circuit_path | str | Path to saved circuit diagram (SVG). |
| circuit | Circuit | Ansatz Circuit object (visualization only). |
| plot | List[Dict] | List of output file dicts, each with 'format' (file extension) and 'filename' (absolute path). Contains 3 files: KL loss curve SVG, distribution comparison SVG, BAS samples SVG. |
| Final KL Loss | float | KL divergence value at the last training epoch. |
| Quantum Computation Time | float | Wall-clock seconds for the training loop. |
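The return fields can be read defensively by checking status first. The sketch below assumes only the keys listed in the table; the helper name summarize_result and the stand-in dict with made-up values are hypothetical and not part of unitarylab_algorithms.

```python
from typing import Any, Dict, List


def summarize_result(result: Dict[str, Any]) -> List[str]:
    """Return printable lines for a run() result dict (hypothetical helper)."""
    # Only read the other fields when the run reports success.
    if result.get("status") != "ok":
        return [f"run failed (status={result.get('status')!r})"]
    lines = [
        f"Final KL Loss: {result['Final KL Loss']:.4f}",
        f"Training time: {result['Quantum Computation Time']:.2f} s",
        f"Circuit diagram: {result['circuit_path']}",
    ]
    for entry in result.get("plot", []):
        lines.append(f"Output file: {entry['filename']} ({entry['format']})")
    return lines


# Stand-in dict with made-up values, shaped like the documented return fields.
fake_result = {
    "status": "ok",
    "circuit_path": "qcbm_circuit.svg",
    "plot": [{"format": "svg", "filename": "kl_loss.svg"}],
    "Final KL Loss": 0.0123,
    "Quantum Computation Time": 1.5,
}
for line in summarize_result(fake_result):
    print(line)
```

In real use, the dict returned by algo.run(...) would be passed in place of fake_result.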
Implementation Architecture
QCBMAlgorithm in algorithm.py trains a variational quantum circuit to learn a discrete probability distribution (2×2 Bars and Stripes) using KL divergence minimization and the Parameter Shift Rule.
run(n, layers, epochs, lr, backend, device, dtype) — Five Stages:
| Stage | Code Action | Algorithmic Role |
|---|---|---|
| 1. Initialization | _get_bas_dist(n) → target_probs; theta = nn.Parameter(torch.rand((layers, n))*2π); Adam optimizer | BAS target distribution and random parameter init |
| 2. Circuit Preview | _build_circuit(theta.detach(), n) | Visualization only; not used for training |
| 3. Training Loop | Per epoch: curr_probs = _get_probs(theta, ...); manual KL loss; parameter shift per (l,q): _get_probs(th_p/th_m, ...); grad = 0.5*(p_p - p_m); optimizer.step() | Full quantum gradient-based KL minimization |
| 4. Evaluation | _get_probs(theta, n) with final params → final_probs | Final distribution capture |
| 5. Export | _generate_all_outputs(n, target, final, history, algo_dir) | Saves 3 SVG plots: KL loss curve, distribution comparison bar chart, BAS sample grid |
Helper Methods:
- _get_bas_dist(n): Hardcoded BAS distribution with valid = [0, 3, 5, 10, 12, 15]; returns (tensor, valid) with uniform probability 1/6 over these 6 states and 0 elsewhere.
- _build_circuit(theta, n): Creates Circuit(n); for each layer $l$, applies ry(theta[l,q], q) to every qubit (RY gates), then a ring of cx(q, (q+1)%n) over all qubits (skipped in the last layer). Returns a Circuit object.
- _get_probs(theta, n): Calls _build_circuit, executes with initial_state=|0⟩ using self.backend, self.device, self.dtype, reads .state from the result, and converts |ψ|² to a torch tensor of length 2^n.
Data flow: _get_bas_dist() → target_probs → training loop → _get_probs() (current + shift±) → KL gradient per (l,q) → Adam step → final_probs → _generate_all_outputs() → result dict.
Understanding the Key Quantum Components
1. Born Rule Probability
The probability of measuring basis state $|x\rangle$ is
$$p_\theta(x) = |\langle x|U(\theta)|0^{\otimes n}\rangle|^2$$
that is, the squared magnitude of the amplitude of $|x\rangle$ in the output state. Because $U(\theta)$ is unitary and the input state is normalized, the $2^n$ probabilities sum to 1.
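As a small numerical illustration of the Born rule, the sketch below uses a hand-written 2-qubit state vector (a Bell state chosen purely as an example, not taken from the QCBM code) and converts amplitudes to probabilities.

```python
import numpy as np

# Example 2-qubit state (|00> + |11>) / sqrt(2), chosen only for illustration.
psi = np.array([1, 0, 0, 1], dtype=np.complex128) / np.sqrt(2)

# Born rule: p(x) = |<x|psi>|^2 for every computational basis state x.
probs = np.abs(psi) ** 2

for x, p in enumerate(probs):
    print(f"p({x:02b}) = {p:.3f}")

# A normalized state gives probabilities that sum to 1.
print("sum =", probs.sum())
```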
2. Variational Circuit Ansatz
Each layer applies:
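- Per-qubit rotation: Ry(θ[l,q]) on every qubit q.
- Entanglement: CNOT(q, (q+1) mod n) for every q, forming a ring; the ring is skipped after the last layer.
The CNOT ring entangles the qubits, which lets the circuit represent the correlations between pixels in BAS patterns. Independent RY rotations alone produce a product state, whose measurement distribution factorizes across qubits and therefore cannot match BAS.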
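The layer structure can be checked without the quantum backend. The following is a minimal NumPy state-vector sketch of an RY layer followed by a CNOT ring. It is not the library's implementation: the function names are hypothetical, and it assumes qubit 0 is the most significant bit of the basis-state index, which may differ from the Circuit class's convention.

```python
import numpy as np


def ry(angle: float) -> np.ndarray:
    """Matrix of the single-qubit RY(angle) rotation."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])


def apply_1q(state: np.ndarray, gate: np.ndarray, q: int) -> np.ndarray:
    """Apply a 2x2 gate to qubit q of a state stored with shape (2,) * n."""
    state = np.tensordot(gate, state, axes=([1], [q]))  # new axis lands at position 0
    return np.moveaxis(state, 0, q)


def apply_cx(state: np.ndarray, control: int, target: int) -> np.ndarray:
    """Apply CNOT by flipping the target axis on the control = 1 slice."""
    state = state.copy()
    idx = [slice(None)] * state.ndim
    idx[control] = 1
    # Fixing the control axis removes it, so later axes shift down by one.
    t_axis = target - 1 if target > control else target
    state[tuple(idx)] = np.flip(state[tuple(idx)], axis=t_axis).copy()
    return state


def qcbm_probs(theta: np.ndarray) -> np.ndarray:
    """Born-rule probabilities of the RY + CNOT-ring ansatz applied to |0...0>."""
    layers, n = theta.shape
    state = np.zeros((2,) * n)
    state[(0,) * n] = 1.0
    for l in range(layers):
        for q in range(n):
            state = apply_1q(state, ry(theta[l, q]), q)
        if l < layers - 1:  # no ring after the last layer
            for q in range(n):
                state = apply_cx(state, q, (q + 1) % n)
    return np.abs(state.reshape(-1)) ** 2


rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, size=(4, 4))
p = qcbm_probs(theta)
print(p.shape, p.sum())
```

Because RY matrices are real, the state stays real here; a general simulator would use a complex dtype.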
3. 2×2 Bars-and-Stripes Target Distribution
The BAS dataset encodes 2×2 binary images where columns or rows are uniformly ON or OFF:
- All off / all on: 0000, 1111
- Row patterns: 0011, 1100
- Column patterns: 0101, 1010
The target is uniform over these 6 patterns: $p_{\text{target}}(x) = 1/6$ for valid patterns, 0 otherwise.
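The six valid states can also be derived rather than hard-coded. The sketch below enumerates row and column patterns for a small grid and builds the uniform target. The function names are hypothetical, and it assumes a row-major pixel order with the first pixel as the most significant bit, which reproduces the bitstrings listed above.

```python
from itertools import product
from typing import List

import numpy as np


def bas_states(rows: int = 2, cols: int = 2) -> List[int]:
    """Enumerate Bars-and-Stripes images as integers (row-major bit order)."""
    patterns = set()
    # Row patterns: every row is uniformly on or off.
    for row_bits in product([0, 1], repeat=rows):
        patterns.add(tuple(b for b in row_bits for _ in range(cols)))
    # Column patterns: every column is uniformly on or off.
    for col_bits in product([0, 1], repeat=cols):
        patterns.add(tuple(col_bits) * rows)
    # The set removes the all-off and all-on images, which appear in both families.
    return sorted(int("".join(map(str, p)), 2) for p in patterns)


def bas_distribution(rows: int = 2, cols: int = 2) -> np.ndarray:
    """Uniform distribution over valid BAS images, zero elsewhere."""
    valid = bas_states(rows, cols)
    probs = np.zeros(2 ** (rows * cols))
    probs[valid] = 1.0 / len(valid)
    return probs


print(bas_states())  # [0, 3, 5, 10, 12, 15]
print(bas_distribution().sum())
```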
4. KL Divergence Loss
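$$\mathcal{L} = D_{\text{KL}}(p_{\text{target}} \,\|\, p_\theta) = \sum_x p_{\text{target}}(x) \log\frac{p_{\text{target}}(x) + \epsilon}{p_\theta(x) + \epsilon}$$
The small constant $\epsilon$ (1e-12 in the code) guards against $\log 0$; states with $p_{\text{target}}(x) = 0$ contribute nothing to the sum.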
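A minimal NumPy sketch of this loss, assuming eps is added to both numerator and denominator as in the training-loop expression sum(target * log((target+ε)/(curr+ε))). The function name kl_divergence is illustrative.

```python
import numpy as np


def kl_divergence(target: np.ndarray, model: np.ndarray, eps: float = 1e-12) -> float:
    """D_KL(target || model) with eps inside the log to avoid log(0)."""
    return float(np.sum(target * np.log((target + eps) / (model + eps))))


# 2x2 BAS target: uniform over the six valid states.
target = np.zeros(16)
target[[0, 3, 5, 10, 12, 15]] = 1 / 6
uniform = np.full(16, 1 / 16)

print(kl_divergence(target, target))   # zero for a perfect match
print(kl_divergence(target, uniform))  # log(16/6), about 0.981
```

The second value is a useful reference point: a circuit that outputs the uniform distribution over all 16 states starts at a loss of about 0.98.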
5. Parameter Shift Gradient
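The Parameter Shift Rule is applied to each Born-rule probability, which is an expectation value of the circuit:
$$\frac{\partial p_\theta(x)}{\partial \theta_{l,q}} = \frac{1}{2}\left[p_{\theta + \frac{\pi}{2} e_{l,q}}(x) - p_{\theta - \frac{\pi}{2} e_{l,q}}(x)\right]$$
where $e_{l,q}$ is the unit vector for parameter $(l,q)$. The loss gradient then follows from the chain rule:
$$\frac{\partial \mathcal{L}}{\partial \theta_{l,q}} = -\sum_x \frac{p_{\text{target}}(x)}{p_\theta(x) + \epsilon}\,\frac{\partial p_\theta(x)}{\partial \theta_{l,q}}$$
The shift is not applied to $\mathcal{L}$ directly, because the KL divergence is a nonlinear function of the probabilities.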
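The gradient computation used in the training loop (shift each parameter by ±π/2, difference the probability vectors, then weight by -target/(curr+ε)) can be sketched generically. In the code below, probs_fn stands for any function mapping parameters to a probability vector; the one-qubit toy model and all function names are assumptions made for illustration, and the result is cross-checked against a finite difference of the loss.

```python
import numpy as np


def kl(target, model, eps=1e-12):
    """KL divergence with eps inside the log."""
    return float(np.sum(target * np.log((target + eps) / (model + eps))))


def kl_grad_parameter_shift(theta, target, probs_fn, eps=1e-12):
    """Gradient of the KL loss via shifted probabilities and the chain rule."""
    curr = probs_fn(theta)
    grad = np.zeros_like(theta)
    for idx in np.ndindex(theta.shape):
        th_p, th_m = theta.copy(), theta.copy()
        th_p[idx] += np.pi / 2
        th_m[idx] -= np.pi / 2
        dp = 0.5 * (probs_fn(th_p) - probs_fn(th_m))  # d p(x) / d theta[idx]
        grad[idx] = np.sum(-(target / (curr + eps)) * dp)
    return grad


def one_qubit_probs(theta):
    """Toy model: RY(theta) on |0> gives [cos^2(theta/2), sin^2(theta/2)]."""
    t = theta[0, 0]
    return np.array([np.cos(t / 2) ** 2, np.sin(t / 2) ** 2])


theta = np.array([[0.7]])
target = np.array([0.25, 0.75])
analytic = kl_grad_parameter_shift(theta, target, one_qubit_probs)

# Cross-check against a central finite difference of the loss itself.
h = 1e-6
numeric = (kl(target, one_qubit_probs(theta + h))
           - kl(target, one_qubit_probs(theta - h))) / (2 * h)
print(analytic[0, 0], numeric)
```

The two printed values should agree closely, since the shift rule is exact for RY-generated parameters while the finite difference carries a small truncation error.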
Theory-to-Code Mapping
| README / Theory Concept | Code Object or Location |
|---|---|
| Target distribution $\pi$ (BAS) | _get_bas_dist() — states [0,3,5,10,12,15], each prob 1/6 |
| Variational circuit $U(\theta)$ | _build_circuit(theta, n) — RY layers + ring CNOT entanglement |
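| Born rule probabilities $p_k = \lvert\langle k \vert \psi(\theta)\rangle\rvert^2$ | _get_probs(theta, n): squared magnitudes of the executed circuit's .state |
| KL divergence $D_{\text{KL}}(\pi \,\Vert\, p_\theta)$ | sum(target * log((target+ε)/(curr+ε))) in training loop |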
| Parameter shift rule $\partial_\theta \mathcal{L}$ | grad = 0.5*(p_plus - p_minus) for each (l,q) index |
| KL gradient wrt $\theta_{l,q}$ | grad_theta[l,q] = sum(-(target/(curr+ε)) * grad_p) |
| Adam optimizer update | torch.optim.Adam([theta], lr=lr) |
| BAS valid states | [0(0000), 3(0011), 5(0101), 10(1010), 12(1100), 15(1111)] |
| Note — gate type | Circuit uses ry(theta[l,q], q) (RY gates) |
Mathematical Deep Dive
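State: $|\psi(\theta)\rangle = U(\theta)|0^n\rangle$ with $U(\theta) = V_L V_{L-1} \cdots V_1$, where $V_l = \text{CX-ring} \cdot \bigotimes_q R_y(\theta_{l,q})$ for $l < L$ and $V_L = \bigotimes_q R_y(\theta_{L,q})$ (the CX ring is omitted after the final layer).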
Probabilities: $\mathbf{p}_\theta = (p_\theta(0), \ldots, p_\theta(2^n-1))$ where $\sum_x p_\theta(x) = 1$.
Information-theoretic convergence: by Pinsker's inequality, $\mathrm{TV}(p_{\text{target}}, p_\theta) \le \sqrt{\mathcal{L}/2}$, so as $\mathcal{L} \to 0$, $p_\theta \to p_{\text{target}}$ in total variation distance.
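The link between a small KL loss and a small total variation distance is Pinsker's inequality, TV ≤ sqrt(KL/2) for KL measured in nats. The sketch below checks it numerically on random model distributions; the function name and the use of Dirichlet samples are illustrative choices, not part of the QCBM code.

```python
import numpy as np


def tv_and_pinsker_bound(target: np.ndarray, model: np.ndarray):
    """Return (total variation distance, sqrt(KL/2)) for target vs. model."""
    support = target > 0  # terms with target(x) = 0 contribute nothing to KL
    kl = np.sum(target[support] * np.log(target[support] / model[support]))
    tv = 0.5 * np.abs(target - model).sum()
    return tv, np.sqrt(kl / 2)


# 2x2 BAS target: uniform over the six valid states.
target = np.zeros(16)
target[[0, 3, 5, 10, 12, 15]] = 1 / 6

rng = np.random.default_rng(0)
for _ in range(5):
    model = rng.dirichlet(np.ones(16))  # random full-support distribution
    tv, bound = tv_and_pinsker_bound(target, model)
    print(f"TV = {tv:.3f} <= sqrt(KL/2) = {bound:.3f}")
```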
Hands-On Example
```python
from unitarylab_algorithms import QCBMAlgorithm

for text_mode in ["plain", "legacy"]:
    algo = QCBMAlgorithm(text_mode=text_mode)
    result = algo.run(n=4, layers=6, epochs=60, lr=0.08)
    final_loss = result['Final KL Loss']
    print(f"text_mode={text_mode}: final KL = {final_loss:.4f}")
    print(f"Outputs: {[f['filename'] for f in result['plot']]}")
```
Implementing Your Own Version
The following skeleton reconstructs the QCBM circuit builder, Born-rule probability extraction, and Parameter Shift gradient loop.
```python
# Simplified reconstruction: mirrors QCBMAlgorithm._build_circuit(), _get_probs(), training loop
import numpy as np
import torch
from unitarylab import Circuit


def build_circuit(theta: torch.Tensor, n: int) -> Circuit:
    """Ry-layer + CNOT-ring architecture, layers = theta.shape[0]."""
    qc = Circuit(n)
    for l in range(theta.shape[0]):
        for q in range(n):
            qc.ry(float(theta[l, q]), q)
        if l < theta.shape[0] - 1:
            for q in range(n):
                qc.cx(q, (q + 1) % n)  # ring entanglement
    return qc


def get_probs(theta: torch.Tensor, n: int,
              backend: str = 'torch', device: str = 'cpu',
              dtype=np.complex128) -> torch.Tensor:
    """Execute circuit and return Born-rule probability vector (length 2^n)."""
    qc = build_circuit(theta, n)
    psi0 = np.zeros((2**n, 1), dtype=np.complex128)
    psi0[0, 0] = 1.0
    result = qc.execute(initial_state=psi0, backend=backend, device=device, dtype=dtype)
    amplitudes = np.asarray(result.state).flatten()
    return torch.as_tensor(np.abs(amplitudes)**2)


def train_qcbm(target_probs: torch.Tensor, n: int = 4,
               layers: int = 4, epochs: int = 40, lr: float = 0.1,
               backend: str = 'torch') -> torch.Tensor:
    """Full KL-divergence training loop with Parameter Shift gradients."""
    theta = torch.nn.Parameter(torch.rand((layers, n)) * 2 * np.pi)
    optimizer = torch.optim.Adam([theta], lr=lr)
    shift = np.pi / 2
    eps = 1e-12
    for ep in range(1, epochs + 1):
        curr_probs = get_probs(theta.detach(), n, backend)
        kl_loss = torch.sum(target_probs * torch.log((target_probs + eps) / (curr_probs + eps)))
        # Parameter Shift gradients
        grad = torch.zeros_like(theta)
        for l in range(layers):
            for q in range(n):
                th_p = theta.detach().clone()
                th_p[l, q] += shift
                th_m = theta.detach().clone()
                th_m[l, q] -= shift
                p_p = get_probs(th_p, n, backend)
                p_m = get_probs(th_m, n, backend)
                dp = 0.5 * (p_p - p_m)  # gradient of prob w.r.t. theta[l,q]
                grad[l, q] = torch.sum(-(target_probs / (curr_probs + eps)) * dp)
        # Hand the manually computed gradient to Adam
        optimizer.zero_grad()
        theta.grad = grad
        optimizer.step()
    return theta.detach()
```
Component roles:
- build_circuit: faithfully mirrors _build_circuit(theta, n): per-layer RY rotations on all qubits, followed by a CNOT ring (except on the last layer).
- get_probs: mirrors _get_probs(theta, n): executes from $|0\rangle^{\otimes n}$, reads .state from the execution result, and returns $|\langle x|\psi(\theta)\rangle|^2$ for all $x$.
- train_qcbm: mirrors the training loop: KL divergence as the loss, Parameter Shift Rule for exact analytical gradients, Adam optimizer.
Debugging Tips
- KL divergence not decreasing: Increase layers (6+) or epochs (80+). For BAS with 4 qubits, at least 4 layers are needed.
- n != 4: The BAS target distribution is hard-coded for 4 qubits. Changing n changes the circuit but the target distribution remains 4-qubit BAS.
- Slow training: Parameter Shift requires $2 \times \text{layers} \times n$ shifted circuit evaluations per epoch, plus one unshifted evaluation for the current distribution. For layers=4, n=4, this is 32 shifted circuits per epoch.
- lr too large: May cause KL to oscillate or diverge. Use lr=0.05–0.15.
- Numerical stability: The KL divergence adds eps=1e-12 to avoid log(0). If probabilities collapse to 0 for some states, consider entropy regularization.
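To gauge how training cost grows when following the tips above, the shifted-circuit count can be tabulated before launching a run. This is plain arithmetic on the 2 × layers × n formula; the helper name is hypothetical.

```python
def shifted_circuit_evals(layers: int, n: int, epochs: int = 1) -> int:
    """Shifted circuits needed: two (plus and minus) per parameter per epoch."""
    return 2 * layers * n * epochs


print(shifted_circuit_evals(4, 4))             # 32 per epoch for the defaults
print(shifted_circuit_evals(4, 4, epochs=40))  # 1280 for a default 40-epoch run
print(shifted_circuit_evals(6, 4, epochs=80))  # 3840 after raising layers and epochs
```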