hmm-rl-regime-portfolio-allocation - SKILL.md Agent Skill

name: hmm-rl-regime-portfolio-allocation
description: "Regime-based portfolio allocation integrating Hidden Markov Models with Reinforcement Learning. Three-state HMM detects market regimes, RL enhances allocation. Outperforms SPY benchmark with lower drawdowns. arXiv:2605.27848"
tags: ["portfolio-management", "hidden-markov-model", "reinforcement-learning", "regime-detection"]
arxiv_id: "2605.27848"

HMM-RL Regime-Based Portfolio Allocation

Regime-aware portfolio allocation framework integrating Hidden Markov Models (HMM) with Reinforcement Learning for tactical asset allocation. Published 2026-05-27 (arXiv:2605.27848).

Core Methodology

Two-Stage Framework

Stage 1: Regime Detection via HMM

Characterize market behavior as a discrete Markov chain over latent regimes
Estimate a Gaussian HMM, with the number of states selected by the Bayesian Information Criterion (BIC)
Three regimes identified:
1. Low-volatility (stable): SPY dominates
2. Transitional: Mixed allocation
3. High-volatility (stressed): TLT and GLD provide protection
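The BIC-based choice of state count can be sketched as follows. This is a minimal illustration, assuming a Gaussian HMM with full covariance matrices and assuming the maximized log-likelihood of each candidate model has already been obtained from some fitting routine. The function names (`gaussian_hmm_param_count`, `bic`, `select_n_states`) are hypothetical and not taken from the paper.

```python
import math
from typing import Dict


def gaussian_hmm_param_count(n_states: int, n_features: int = 1) -> int:
    """Count free parameters of a Gaussian HMM with full covariances."""
    start = n_states - 1                    # initial distribution sums to 1
    trans = n_states * (n_states - 1)       # each transition row sums to 1
    means = n_states * n_features
    covs = n_states * n_features * (n_features + 1) // 2  # symmetric matrices
    return start + trans + means + covs


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """BIC = -2 log L + k ln(n). Lower values are preferred."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_n_states(log_likelihoods: Dict[int, float], n_obs: int,
                    n_features: int = 1) -> int:
    """Pick the candidate state count with the lowest BIC.

    log_likelihoods maps a candidate number of states to the maximized
    log-likelihood of the HMM fitted with that many states (assumed given).
    """
    scores = {
        k: bic(ll, gaussian_hmm_param_count(k, n_features), n_obs)
        for k, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get)
```

The penalty term grows with the square of the state count through the transition matrix, which is why an extra regime has to buy a clear likelihood gain before BIC prefers it.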

Stage 2: RL-Driven Allocation

The RL policy is conditioned on the HMM-detected regime
Evaluation uses a 30% out-of-sample test window with a one-day execution lag to avoid look-ahead bias
Assets: SPY (equities), TLT (long-term Treasuries), GLD (gold)

Data & Validation

Daily ETF data from 2004 to 2025
Three-state specification validated via sensitivity analysis against two-state alternatives
Regimes exhibit strong persistence and state-dependent return dynamics
Consistent with Ardia et al. (2024) and Gupta & Pierdzioch (2023) findings on nonlinear market states
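Regime persistence can be read directly off the estimated transition matrix. In a first-order Markov chain, the time spent in state i before leaving is geometrically distributed, so the expected stay is 1 / (1 - p_ii) periods. The helper below is a small sketch of that calculation; the name `expected_durations` is hypothetical and the matrix is assumed to be row-stochastic.

```python
from typing import List, Sequence


def expected_durations(transition_matrix: Sequence[Sequence[float]]) -> List[float]:
    """Expected consecutive periods spent in each regime: 1 / (1 - p_ii)."""
    durations = []
    for i, row in enumerate(transition_matrix):
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError(f"row {i} of the transition matrix does not sum to 1")
        stay = row[i]  # probability of remaining in regime i for one more period
        durations.append(float("inf") if stay >= 1.0 else 1.0 / (1.0 - stay))
    return durations
```

With daily data, a diagonal entry of 0.98 corresponds to an expected stay of about 50 trading days, while 0.50 corresponds to only 2, which is too short for a lagged allocation rule to exploit.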

Key Results

Both HMM-based allocations outperform the passive SPY benchmark
The RL policy achieves the highest risk-adjusted performance: the strongest Sharpe ratio with materially lower drawdowns
Full interpretability: discrete regime-dependent actions are transparent and explainable
The three-state specification is robust: sensitivity analysis confirms its superiority over two-state alternatives
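The two metrics behind these comparisons, Sharpe ratio and maximum drawdown, can be computed from a daily return series as sketched below. The 252-day annualization factor and the zero default risk-free rate are assumptions of this sketch; the source does not state the conventions the paper used.

```python
import math
from typing import Sequence


def annualized_sharpe(daily_returns: Sequence[float],
                      risk_free_daily: float = 0.0,
                      periods_per_year: int = 252) -> float:
    """Mean excess return over its sample standard deviation, annualized."""
    excess = [r - risk_free_daily for r in daily_returns]
    n = len(excess)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(excess) / n
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    if variance == 0.0:
        raise ValueError("returns have zero variance")
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(daily_returns: Sequence[float]) -> float:
    """Largest peak-to-trough loss of the compounded equity curve (<= 0)."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst
```

Applying both functions to the strategy and to SPY over the same out-of-sample dates keeps the comparison like for like.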

Reusable Skill Patterns

Pattern 1: Regime-Conditioned Asset Allocation

For adaptive portfolio management:
1. Use HMM to detect discrete market regimes
2. Select the number of states via BIC (not arbitrarily)
3. Condition allocation rules on detected regime
4. Validate with out-of-sample testing and execution lag
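Step 1 can be illustrated with the forward (filtering) recursion of a univariate Gaussian HMM. The sketch assumes the HMM parameters have already been estimated on training data; `filter_regimes` and its argument names are hypothetical. Because each probability uses only returns observed up to that day, the resulting regime signal is compatible with the out-of-sample and execution-lag requirements in step 4.

```python
import math
from typing import List, Sequence


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def filter_regimes(returns: Sequence[float],
                   start_prob: Sequence[float],
                   trans: Sequence[Sequence[float]],
                   means: Sequence[float],
                   stds: Sequence[float]) -> List[List[float]]:
    """Filtered probabilities P(regime_t | r_1..r_t) for every day t."""
    n_states = len(start_prob)
    prior = list(start_prob)  # P(regime_t | data up to t-1)
    filtered = []
    for r in returns:
        # Bayes update: weight each regime's prior by its emission density.
        joint = [prior[k] * gaussian_pdf(r, means[k], stds[k])
                 for k in range(n_states)]
        total = sum(joint)
        if total == 0.0:
            raise ValueError("observation has zero density under every regime")
        posterior = [j / total for j in joint]
        filtered.append(posterior)
        # Predict step: propagate through the transition matrix.
        prior = [sum(posterior[i] * trans[i][k] for i in range(n_states))
                 for k in range(n_states)]
    return filtered
```

Taking the most probable regime on each day yields the discrete label that the allocation rule is conditioned on.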

Pattern 2: HMM-RL Hybrid Architecture

Combining statistical regime detection with RL:
1. HMM provides interpretable regime labels (statistical foundation)
2. RL optimizes allocation within each regime (adaptive learning)
3. Result: transparent + high-performing system
4. Both components are independently auditable
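A minimal sketch of the hybrid idea is given below, using tabular Q-learning with the HMM regime label as the state and a small set of discrete allocations as actions. The source does not specify which RL algorithm, action set, reward, or hyperparameters the paper uses, so everything here (`ACTIONS`, `train_regime_policy`, `greedy_policy`, the default values) is an assumption for illustration. The reward is the next day's portfolio return, which keeps the one-day execution lag. The discount factor defaults to 0 because, in this sketch, allocations do not affect regime transitions and no transaction costs are modeled.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical discrete action set: portfolio weights for (SPY, TLT, GLD).
ACTIONS: List[Tuple[float, float, float]] = [
    (1.0, 0.0, 0.0),   # all equities
    (0.6, 0.3, 0.1),   # balanced
    (0.2, 0.5, 0.3),   # defensive
]


def train_regime_policy(regimes: Sequence[int],
                        asset_returns: Sequence[Sequence[float]],
                        n_regimes: int,
                        epochs: int = 50,
                        alpha: float = 0.1,
                        gamma: float = 0.0,
                        epsilon: float = 0.1,
                        seed: int = 0) -> List[List[float]]:
    """Learn Q[regime][action] from regime labels and daily asset returns.

    regimes[t] is the label known at the close of day t;
    asset_returns[t + 1] holds the (SPY, TLT, GLD) returns earned on day t + 1.
    """
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(n_regimes)]
    for _ in range(epochs):
        for t in range(len(regimes) - 1):
            state, next_state = regimes[t], regimes[t + 1]
            if rng.random() < epsilon:          # explore
                action = rng.randrange(len(ACTIONS))
            else:                               # exploit current estimate
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            reward = sum(w * r for w, r in zip(ACTIONS[action], asset_returns[t + 1]))
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
    return q


def greedy_policy(q: Sequence[Sequence[float]]) -> List[int]:
    """Index of the highest-valued action for each regime."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]
```

The learned table stays small enough to inspect by hand: one row per regime, one column per allocation, which is what makes the RL component auditable separately from the HMM.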

Pattern 3: State-Conditional Asset Dominance

Key insight: different assets dominate in different regimes
- Stable regimes: equities (SPY) outperform
- Stressed regimes: bonds (TLT) and gold (GLD) provide protection
- Allocation rules should switch with the detected regime rather than stay static
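A regime-conditional rule table is the simplest way to encode this pattern. The weights below are placeholders chosen only to show the shape of such a table, not values from the paper. `blended_weights` is an optional variation, also an assumption of this sketch, that mixes the per-regime allocations by regime probability instead of switching abruptly.

```python
from typing import Dict

ASSETS = ("SPY", "TLT", "GLD")

# Hypothetical target weights per regime; each row sums to 1.
REGIME_WEIGHTS: Dict[str, Dict[str, float]] = {
    "stable":       {"SPY": 0.8, "TLT": 0.1, "GLD": 0.1},
    "transitional": {"SPY": 0.5, "TLT": 0.3, "GLD": 0.2},
    "stressed":     {"SPY": 0.1, "TLT": 0.5, "GLD": 0.4},
}


def target_weights(regime: str) -> Dict[str, float]:
    """Hard switch: return the allocation assigned to the detected regime."""
    return dict(REGIME_WEIGHTS[regime])


def blended_weights(regime_probs: Dict[str, float]) -> Dict[str, float]:
    """Soft switch: probability-weighted average of the per-regime weights."""
    if abs(sum(regime_probs.values()) - 1.0) > 1e-9:
        raise ValueError("regime probabilities must sum to 1")
    blended = {asset: 0.0 for asset in ASSETS}
    for regime, prob in regime_probs.items():
        for asset, weight in REGIME_WEIGHTS[regime].items():
            blended[asset] += prob * weight
    return blended
```

The hard switch matches the discrete, explainable actions described above; the blended version trades some of that transparency for smoother turnover when the regime signal is ambiguous.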

Pattern 4: Look-Ahead Bias Prevention

Critical for financial backtesting:
1. Use one-day execution lag (signal today → trade tomorrow)
2. 30% strict out-of-sample window
3. No parameter re-optimization on test data
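The first two safeguards can be sketched as below. The code assumes the 30% test window is the final, chronologically latest part of the sample, which the source does not state explicitly, and the function names are hypothetical. The signal observed at the close of day t-1 sets the weights that earn day t's returns, so no return is ever matched with information from the same day.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def chronological_split(n_obs: int, test_fraction: float = 0.30) -> Tuple[range, range]:
    """Index ranges for training and for the trailing out-of-sample window."""
    n_test = int(round(n_obs * test_fraction))
    split = n_obs - n_test
    return range(0, split), range(split, n_obs)


def lagged_strategy_returns(signals: Sequence[Hashable],
                            weights_by_signal: Dict[Hashable, Sequence[float]],
                            asset_returns: Sequence[Sequence[float]]) -> List[float]:
    """Portfolio returns when day t's weights come from day t-1's signal."""
    strategy = []
    for t in range(1, len(asset_returns)):
        weights = weights_by_signal[signals[t - 1]]   # signal today, trade tomorrow
        strategy.append(sum(w * r for w, r in zip(weights, asset_returns[t])))
    return strategy
```

For the third safeguard, the HMM parameters and the allocation policy would be fitted on the training range only and then held fixed while `lagged_strategy_returns` is evaluated on the test range.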

Application Areas

Tactical asset allocation
Market regime detection
Portfolio risk management
Multi-asset class optimization
Systematic trading strategies
Quantitative portfolio management

Pitfalls

BIC selection matters: An arbitrary number of states leads to over- or under-fitting
Execution lag is critical: Without it, look-ahead bias inflates results
Regime persistence assumption: Regimes must exhibit sufficient persistence for the approach to work
Three-asset simplicity: SPY/TLT/GLD is a simplified universe; real portfolios need more assets