name: digital-twin-error-propagation-mdp
description: "Optimal sequential decision-making for error propagation mitigation in modular digital twins using MDP/POMDP framework. Combines HMM-based latent regime inference with MDP/POMDP intervention selection. Features: data-driven transition model, Point-Based Value Iteration, Value of Information quantification, Gillespie simulation validation. Use for: digital twin maintenance optimization, modular system error propagation, surrogate model degradation, sequential decision under uncertainty in CPS."
Digital Twin Error Propagation MDP/POMDP Framework
Sequential decision-making framework for mitigating error propagation in modular digital twins using HMM-based regime inference and MDP/POMDP optimal intervention selection.
Source Paper
Title: Optimal Sequential Decision-Making for Error Propagation Mitigation in Digital Twins
Authors: Annice Najafi, Shokoufeh Mirzaei
arXiv: 2604.22168v1 [cs.LG]
Date: April 24, 2026
Institution: California State Polytechnic University, Pomona
Core Problem
Modular digital twins built from surrogate models (ARX, neural networks) suffer from error propagation: small errors in early modules cascade downstream and degrade system-level estimates and control recommendations.
Framework Architecture
Physical System → Surrogate Modules → Residuals → HMM Regime Inference → MDP/POMDP Controller → Optimal Intervention Selection
Mathematical Formulation
MDP: (S, A, T, R, γ)
States S: {Nominal, SensorNoisy, PlantMismatch, Drift}
Actions A: {NoAction, Recalibrate, UpdateModel, RepairHardware}
Transition Model:
T(s,a,s') = (1-ρ_{a,s}) * T_noaction(s,s') + ρ_{a,s} * I[s'=Nominal]
Where ρ_{a,s} is the probability that action a restores state s to Nominal, estimated from empirical data.
Reward: R(a,s) = Benefit(a,s) - Cost(a) - FidelityPenalty(s)
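The transition formula can be checked numerically. The sketch below applies it to a small example; the no-action matrix `T_NOACTION` and the repair probabilities `RHO_RECALIBRATE` are made-up illustrative values, not figures from the paper, and plain lists are used so the snippet runs without dependencies.

```python
STATES = ["Nominal", "SensorNoisy", "PlantMismatch", "Drift"]

# Hypothetical no-action transition matrix (each row sums to 1); not from the paper.
T_NOACTION = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.05],
    [0.02, 0.03, 0.05, 0.90],
]


def action_transition(t_noaction, rho_by_state, states=STATES):
    """Apply T(s,a,s') = (1 - rho) * T_noaction(s,s') + rho * I[s' = Nominal]."""
    nominal = states.index("Nominal")
    t_a = []
    for s_idx, s in enumerate(states):
        rho = rho_by_state.get(s, 0.0)
        row = [(1.0 - rho) * p for p in t_noaction[s_idx]]
        row[nominal] += rho  # the repaired probability mass moves to Nominal
        t_a.append(row)
    return t_a


# Hypothetical repair probabilities for the Recalibrate action.
RHO_RECALIBRATE = {"SensorNoisy": 0.7, "Drift": 0.2}
T_RECALIBRATE = action_transition(T_NOACTION, RHO_RECALIBRATE)
```

Because the formula is a convex mixture of a stochastic row and a point mass on Nominal, every row of the result still sums to 1, and states with ρ = 0 keep their no-action row unchanged.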
POMDP Extension
Challenge: True regime unobservable; only HMM classifications available.
Belief Update (Bayesian):
b'(s') = η * O(s',z) * Σ_s T(s,a,s') * b(s)
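Where O(s',z) = P(z | s') is the probability that the HMM reports regime z when the true regime is s', taken from the row-normalized confusion matrix from HMM validation, and η is the normalizing constant that makes b' sum to 1.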
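A minimal sketch of the belief update follows. It assumes `T_a[s][s2]` holds the transition probabilities for the chosen action and `O[s2][z]` holds a confusion matrix whose rows are indexed by the true regime and already sum to 1; the function name and argument layout are illustrative choices.

```python
def belief_update(b, T_a, O, z):
    """Bayesian update: b'(s') = eta * O[s'][z] * sum_s T_a[s][s'] * b[s]."""
    n = len(b)
    # Prediction step: push the belief through the transition model.
    predicted = [sum(T_a[s][s2] * b[s] for s in range(n)) for s2 in range(n)]
    # Correction step: weight by the likelihood of the observed classification z.
    unnormalized = [O[s2][z] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]  # eta = 1 / total
```

For example, with an identity transition matrix, `O = [[0.9, 0.1], [0.2, 0.8]]`, a uniform prior `[0.5, 0.5]` and observation `z = 0`, the unnormalized weights are 0.45 and 0.10, so the posterior on the first state is 0.45 / 0.55.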
Solution Methods
Value Iteration (MDP)
V*(s) = max_a [R(a,s) + γ Σ_{s'} T(s,a,s') V*(s')]
Point-Based Value Iteration (POMDP)
- Sample belief points from simplex
- Represent value function as max over α-vectors
- Warm start with MDP solution
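The three steps above can be sketched as a generic point-based backup. This is an illustration under stated assumptions, not the paper's implementation: the belief set is fixed up front, the value function is initialised with a simple lower bound rather than the MDP warm start, and the helper names (`sample_beliefs`, `pbvi`, `pomdp_value`) are hypothetical.

```python
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def sample_beliefs(n_states, n_random, seed=0):
    """Simplex corners plus n_random uniformly sampled interior beliefs."""
    rng = random.Random(seed)
    points = [[1.0 if i == j else 0.0 for j in range(n_states)]
              for i in range(n_states)]
    for _ in range(n_random):
        # Normalized exponentials are uniform on the simplex.
        w = [rng.expovariate(1.0) for _ in range(n_states)]
        total = sum(w)
        points.append([x / total for x in w])
    return points


def pbvi(T, O, R, gamma, beliefs, n_iters=50):
    """Point-based value iteration on a fixed belief set.

    T[a][s][s2] and O[s2][z] are probabilities, R[a][s] is the reward.
    Returns one (alpha_vector, action_index) pair per belief point.
    """
    n_a, n_s, n_z = len(T), len(T[0]), len(O[0])
    # Lower-bound initialisation (an assumption; a warm start from the MDP
    # solution would replace this single vector).
    r_min = min(min(row) for row in R)
    alphas = [([r_min / (1.0 - gamma)] * n_s, 0)]
    for _ in range(n_iters):
        # Project every alpha-vector back through each (action, observation).
        proj = [[[[sum(T[a][s][s2] * O[s2][z] * alpha[s2] for s2 in range(n_s))
                   for s in range(n_s)]
                  for alpha, _ in alphas]
                 for z in range(n_z)]
                for a in range(n_a)]
        new_alphas = []
        for b in beliefs:
            candidates = []
            for a in range(n_a):
                vec = list(R[a])
                for z in range(n_z):
                    # Keep the projected vector that is best at this belief.
                    best = max(proj[a][z], key=lambda g: dot(g, b))
                    vec = [v + gamma * g for v, g in zip(vec, best)]
                candidates.append((vec, a))
            new_alphas.append(max(candidates, key=lambda c: dot(c[0], b)))
        alphas = new_alphas
    return alphas


def pomdp_value(alphas, b):
    """V(b) = max over alpha-vectors of alpha . b."""
    return max(dot(alpha, b) for alpha, _ in alphas)
```

The action stored with the maximizing α-vector at a belief gives the intervention to apply there. Recomputing all projections each sweep keeps the sketch short; a production solver would prune dominated vectors and expand the belief set.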
Gillespie Simulation
- Continuous-time CTMC embedding
- Exact sample path generation
- Validation against theoretical solution
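A minimal Gillespie sampler for a continuous-time Markov chain is sketched below. It assumes the regime dynamics are already expressed as a generator matrix `Q` (off-diagonal rates non-negative, rows summing to 0); how the paper constructs that embedding is not specified here, and the function names are illustrative.

```python
import random


def gillespie_path(Q, s0, t_max, seed=0):
    """Sample one exact CTMC path from generator Q up to time t_max.

    Returns a list of (jump_time, state) pairs starting with (0.0, s0).
    """
    rng = random.Random(seed)
    t, s = 0.0, s0
    path = [(t, s)]
    while True:
        exit_rate = -Q[s][s]
        if exit_rate <= 0.0:
            break  # absorbing state: no further jumps
        t += rng.expovariate(exit_rate)  # exponential holding time
        if t >= t_max:
            break
        # Next state is chosen with probability Q[s][j] / exit_rate.
        targets = [j for j in range(len(Q)) if j != s]
        weights = [Q[s][j] for j in targets]
        s = rng.choices(targets, weights=weights)[0]
        path.append((t, s))
    return path


def occupancy(path, n_states, t_max):
    """Fraction of the interval [0, t_max] spent in each state."""
    time_in = [0.0] * n_states
    for (t0, s), (t1, _) in zip(path, path[1:] + [(t_max, None)]):
        time_in[s] += t1 - t0
    return [x / t_max for x in time_in]
```

Averaging `occupancy` over many long paths and comparing it with the chain's stationary distribution is one simple way to validate the simulation against theory.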
Key Results
| Method | Performance | Notes |
|---|---|---|
| MDP | 100% (upper bound) | Perfect regime knowledge |
| POMDP | ~95% | Realistic observation noise |
| Q-learning | ~85% | Model-free baseline |
| No intervention | ~50% | Worst case |
Value of Information: The gap between MDP and POMDP performance quantifies the benefit of improving regime classification.
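One common way to express this gap at a specific belief is the expected fully observable value minus the POMDP value. The sketch below assumes that definition, with `v_mdp` the MDP value per state and `alphas` a list of POMDP α-vectors; the paper's exact metric may differ.

```python
def value_of_information(b, v_mdp, alphas):
    """Expected MDP value under belief b minus the POMDP value at b."""
    v_full = sum(p * v for p, v in zip(b, v_mdp))
    v_partial = max(sum(p * a for p, a in zip(b, alpha)) for alpha in alphas)
    return v_full - v_partial
```

With illustrative numbers `b = [0.5, 0.5]`, `v_mdp = [10, 4]` and `alphas = [[9, 3], [8, 5]]`, the fully observable value is 7 and the best α-vector gives 6.5, so the gap is 0.5.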
Implementation
import numpy as np


class DigitalTwinErrorController:
    def __init__(self, states, actions, gamma=0.95):
        self.S = list(states)
        self.A = list(actions)
        self.gamma = gamma
        self.transition_matrices = {}
        # Reward matrix as a NumPy array indexed R[a_idx, s_idx];
        # it must be assigned before calling value_iteration.
        self.R = None

    def set_transition_model(self, A_noaction, repair_probs):
        """Build action-dependent matrices from HMM parameters"""
        A_noaction = np.asarray(A_noaction, dtype=float)
        nominal_idx = self.S.index('Nominal')
        self.transition_matrices = {}
        for a in self.A:
            A_a = np.zeros((len(self.S), len(self.S)))
            for s_idx, s in enumerate(self.S):
                rho = repair_probs.get((a, s), 0.0)
                # T(s,a,s') = (1 - rho) * T_noaction(s,s') + rho * I[s' = Nominal]
                A_a[s_idx, :] = (1 - rho) * A_noaction[s_idx, :]
                A_a[s_idx, nominal_idx] += rho
            self.transition_matrices[a] = A_a

    def _q_values(self, s_idx, V):
        """Q(s, a) for every action, given the value estimate V"""
        return [self.R[a_idx, s_idx]
                + self.gamma * np.dot(self.transition_matrices[a][s_idx, :], V)
                for a_idx, a in enumerate(self.A)]

    def value_iteration(self, epsilon=1e-6, max_iter=1000):
        """Solve MDP via value iteration; returns (V, greedy policy)"""
        if self.R is None:
            raise ValueError("assign the reward matrix self.R[a_idx, s_idx] first")
        V = np.zeros(len(self.S))
        for iteration in range(max_iter):
            V_new = np.array([max(self._q_values(s_idx, V))
                              for s_idx in range(len(self.S))])
            converged = np.max(np.abs(V_new - V)) < epsilon
            V = V_new  # keep the latest iterate, including on convergence
            if converged:
                break
        # Extract policy
        pi = np.array([int(np.argmax(self._q_values(s_idx, V)))
                       for s_idx in range(len(self.S))], dtype=int)
        return V, pi
Activation Keywords
- digital twin error propagation
- MDP maintenance optimization
- POMDP digital twin
- HMM regime inference CPS
- sequential decision cyber-physical
- surrogate model degradation
- value of information maintenance
- belief state control
Reference
Najafi, A., & Mirzaei, S. (2026). Optimal Sequential Decision-Making for Error Propagation Mitigation in Digital Twins. arXiv:2604.22168v1 [cs.LG].