copula-modeling


Dependency modeling beyond correlation matrices. Gaussian, Student-t, Clayton, and Gumbel copulas for correlated prediction market outcomes. Tail dependence quantification and vine copulas for high-dimensional portfolios. The reason Gaussian copulas failed in 2008 — and what to use instead.

By shreyas-lyzr · Updated 3/1/2026

name: copula-modeling
description: >
  Dependency modeling beyond correlation matrices. Gaussian, Student-t, Clayton, and Gumbel copulas for correlated prediction market outcomes. Tail dependence quantification and vine copulas for high-dimensional portfolios. The reason Gaussian copulas failed in 2008, and what to use instead.
license: MIT
allowed-tools: Read Edit Grep Glob Bash
metadata:
  author: lyzr
  version: "1.0.0"
  category: quantitative-simulation

Copula Modeling — What Correlation Matrices Can't Capture

When to Use

When modeling joint outcomes across multiple correlated prediction markets:

  • Swing state election portfolios (PA, MI, WI, GA, AZ)
  • Correlated policy markets (Fed rate + inflation + employment)
  • Any portfolio where extreme co-movement matters

The Problem: Tail Dependence

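Gaussian copula: Tail dependence lambda_U = lambda_L = 0 for any correlation below 1. The variables become asymptotically independent in the tails, so joint extremes get far less probability than fat-tailed data typically shows. For sweep or crash scenarios this is the wrong model.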

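Student-t copula: With nu=4 and rho=0.6, tail dependence is ~0.31, from lambda = 2 * t_{nu+1}(-sqrt((nu+1)(1-rho)/(1+rho))). Read it as a limiting conditional probability: given that one market sits at an extreme quantile, the other sits at the same extreme roughly 31% of the time. The Gaussian copula gives 0 in the same limit.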

Clayton copula: Lower tail dependence only (lambda_L = 2^{-1/theta}). When one market crashes, others follow.

Gumbel copula: Upper tail dependence only (lambda_U = 2 - 2^{1/theta}). Correlated positive resolutions.
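The coefficients quoted above all have closed forms, so any quoted figure can be checked directly. The following is a minimal sketch; the function names are illustrative and not part of the skill's implementation.

```python
import numpy as np
from scipy.stats import t as t_dist

def tail_dep_t(rho, nu):
    """Tail dependence (upper = lower) of a bivariate Student-t copula."""
    arg = -np.sqrt((nu + 1) * (1 - rho) / (1 + rho))
    return 2.0 * t_dist.cdf(arg, df=nu + 1)

def tail_dep_clayton_lower(theta):
    """Lower tail dependence of a Clayton copula (theta > 0)."""
    return 2.0 ** (-1.0 / theta)

def tail_dep_gumbel_upper(theta):
    """Upper tail dependence of a Gumbel copula (theta >= 1)."""
    return 2.0 - 2.0 ** (1.0 / theta)

print(f"Student-t (nu=4, rho=0.6): {tail_dep_t(0.6, 4):.3f}")
print(f"Clayton   (theta=2):       {tail_dep_clayton_lower(2.0):.3f}")
print(f"Gumbel    (theta=2):       {tail_dep_gumbel_upper(2.0):.3f}")
```

For the t copula the coefficient rises with rho and falls as nu grows, approaching the Gaussian value of 0 for large nu.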

Sklar's Theorem

F(x_1, ..., x_d) = C(F_1(x_1), ..., F_d(x_d))

where C is the copula (pure dependency structure) and F_i are the marginal CDFs. Model each market's marginal behavior separately, then glue together with a copula.
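A small sketch of this separation for continuous marginals: sample the dependence structure as uniforms, then push each column through its own inverse marginal CDF. The Beta marginals and the value rho = 0.7 are hypothetical stand-ins chosen only for illustration.

```python
import numpy as np
from scipy.stats import norm, beta, kendalltau

rng = np.random.default_rng(0)
rho, N = 0.7, 20_000

# Step 1: sample the copula C (dependence only) as correlated uniforms U
L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
Z = rng.standard_normal((N, 2)) @ L.T
U = norm.cdf(Z)

# Step 2: apply each market's inverse marginal CDF F_i^{-1}
# (hypothetical Beta laws standing in for two markets' price distributions)
x1 = beta.ppf(U[:, 0], 2, 5)
x2 = beta.ppf(U[:, 1], 8, 3)

# Marginals follow the chosen Beta laws; rank dependence comes from the copula alone
tau_u, _ = kendalltau(U[:, 0], U[:, 1])
tau_x, _ = kendalltau(x1, x2)
print(f"mean x1 = {x1.mean():.3f}  (Beta(2,5) mean = {2 / 7:.3f})")
print(f"Kendall tau on U = {tau_u:.3f}, on X = {tau_x:.3f}")
```

Kendall's tau is unchanged by the marginal transforms because they are increasing, which is why the copula can be fitted and swapped independently of the marginals.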

Full Implementation

import numpy as np
from scipy.stats import norm, t as t_dist

def simulate_correlated_outcomes_gaussian(probs, corr_matrix, N=100_000):
    """Gaussian copula — no tail dependence."""
    d = len(probs)
    L = np.linalg.cholesky(corr_matrix)
    Z = np.random.standard_normal((N, d))
    X = Z @ L.T
    U = norm.cdf(X)
    outcomes = (U < np.array(probs)).astype(int)
    return outcomes

def simulate_correlated_outcomes_t(probs, corr_matrix, nu=4, N=100_000):
    """Student-t copula — symmetric tail dependence."""
    d = len(probs)
    L = np.linalg.cholesky(corr_matrix)
    Z = np.random.standard_normal((N, d))
    X = Z @ L.T

    S = np.random.chisquare(nu, N) / nu
    T = X / np.sqrt(S[:, None])
    U = t_dist.cdf(T, nu)
    outcomes = (U < np.array(probs)).astype(int)
    return outcomes

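def simulate_correlated_outcomes_clayton(probs, theta=2.0, N=100_000):
    """Clayton copula: lower tail dependence (Marshall-Olkin algorithm), theta > 0."""
    # Shared gamma frailty V couples all markets: a small V pushes every U toward 0 together
    V = np.random.gamma(1 / theta, 1, N)
    E = np.random.exponential(1, (N, len(probs)))
    U = (1 + E / V[:, None])**(-1 / theta)
    # With the U < p convention, joint low U means joint YES;
    # use U > 1 - p instead to make the NO outcomes cluster
    outcomes = (U < np.array(probs)).astype(int)
    return outcomes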


# --- Compare tail behavior across copulas ---
np.random.seed(42)

probs = [0.52, 0.53, 0.51, 0.48, 0.50]  # 5 swing state probabilities
state_names = ['PA', 'MI', 'WI', 'GA', 'AZ']

corr = np.array([
    [1.0, 0.7, 0.7, 0.4, 0.3],
    [0.7, 1.0, 0.8, 0.3, 0.3],
    [0.7, 0.8, 1.0, 0.3, 0.3],
    [0.4, 0.3, 0.3, 1.0, 0.5],
    [0.3, 0.3, 0.3, 0.5, 1.0],
])

N = 500_000

gauss_outcomes = simulate_correlated_outcomes_gaussian(probs, corr, N)
t_outcomes = simulate_correlated_outcomes_t(probs, corr, nu=4, N=N)

# P(sweep all 5 states)
p_sweep_gauss = gauss_outcomes.all(axis=1).mean()
p_sweep_t = t_outcomes.all(axis=1).mean()

# P(lose all 5 states)
p_lose_gauss = (1 - gauss_outcomes).all(axis=1).mean()
p_lose_t = (1 - t_outcomes).all(axis=1).mean()

# If independent
p_sweep_indep = np.prod(probs)
p_lose_indep = np.prod([1 - p for p in probs])

print("Joint Outcome Probabilities:")
print(f"{'':>25}  {'Independent':>12}  {'Gaussian':>12}  {'t-copula':>12}")
print(f"{'P(sweep all 5)':>25}  {p_sweep_indep:>12.4f}  {p_sweep_gauss:>12.4f}  {p_sweep_t:>12.4f}")
print(f"{'P(lose all 5)':>25}  {p_lose_indep:>12.4f}  {p_lose_gauss:>12.4f}  {p_lose_t:>12.4f}")
print(f"\nSweep probability ratio, t-copula vs Gaussian: {p_sweep_t/p_sweep_gauss:.2f}x")

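How far the two copulas diverge depends on how extreme the joint event is. With marginal probabilities near 0.5, as in this example, the sweep probabilities are nearly identical, because Gaussian and t copulas with the same correlation matrix assign the same probability to sign patterns around the median. The gap opens as thresholds move into the tails: for joint events built from marginal probabilities of a few percent, the t-copula with nu=4 can show 2–5x higher probability than Gaussian. Trading correlated contracts without modeling tail dependence underestimates risk in exactly the scenarios that matter most.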
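The size of the Gaussian versus t gap depends on the threshold, and a sweep over marginal probabilities makes that visible. The sketch below reuses the swing-state correlation matrix and estimates P(all five U_i < p) for several values of p. `joint_low_prob` is a hypothetical helper, and the grid of p values is arbitrary.

```python
import numpy as np
from scipy.stats import norm, t as t_dist

corr = np.array([
    [1.0, 0.7, 0.7, 0.4, 0.3],
    [0.7, 1.0, 0.8, 0.3, 0.3],
    [0.7, 0.8, 1.0, 0.3, 0.3],
    [0.4, 0.3, 0.3, 1.0, 0.5],
    [0.3, 0.3, 0.3, 0.5, 1.0],
])

def joint_low_prob(p, corr, nu=None, N=500_000, seed=0):
    """Monte Carlo estimate of P(all U_i < p): Gaussian copula if nu is None, else t copula."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(corr)
    X = rng.standard_normal((N, corr.shape[0])) @ L.T
    if nu is None:
        threshold = norm.ppf(p)
    else:
        # Shared chi-square mixing variable turns the Gaussian draw into a multivariate t
        S = rng.chisquare(nu, N) / nu
        X = X / np.sqrt(S[:, None])
        threshold = t_dist.ppf(p, nu)
    # Comparing to the marginal quantile is equivalent to U < p, without evaluating the CDF
    return (X < threshold).all(axis=1).mean()

for p in (0.5, 0.2, 0.05, 0.02):
    g = joint_low_prob(p, corr)
    t4 = joint_low_prob(p, corr, nu=4)
    print(f"p={p:<5} Gaussian={g:.5f}  t(nu=4)={t4:.5f}  ratio={t4 / g:.2f}")
```

At p = 0.5 the ratio is 1 by construction, since dividing by a positive mixing variable does not change signs. The ratio grows as p shrinks, which is where tail dependence starts to matter.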

Vine Copulas for d > 5

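For high-dimensional portfolios, a single copula family is too rigid: one Archimedean parameter or one degrees-of-freedom value has to describe every pair. Vine copulas decompose the d-dimensional dependency into d(d-1)/2 bivariate (pair) copulas, most of them conditional, arranged in a sequence of d-1 trees: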

| Type | Structure | Use Case |
|---|---|---|
| C-vine (star) | One central event drives all | Presidential winner -> all policy markets |
| D-vine (path) | Sequential dependencies | Primary results -> general election |
| R-vine (general) | Maximum flexibility | Complex multi-market portfolios |

Construction: For each tree level, build a maximum spanning tree with edge weights |Kendall's tau|, select each pair-copula family via AIC, and estimate tree by tree, conditioning on the trees already fitted.

Libraries: pyvinecopulib (Python), VineCopula (R).
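The first step of the construction, selecting the first tree, can be sketched with SciPy alone. This is only an illustration of the spanning-tree step, not a vine fit: family selection and the conditional trees are left to the libraries above. `first_vine_tree` is a hypothetical helper, and the sample data is simulated from the swing-state correlation matrix.

```python
import numpy as np
from scipy.stats import kendalltau, norm
from scipy.sparse.csgraph import minimum_spanning_tree

def first_vine_tree(U):
    """Edges (i, j, tau) of the maximum spanning tree on |Kendall's tau|."""
    d = U.shape[1]
    tau = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            tau_ij, _ = kendalltau(U[:, i], U[:, j])
            tau[i, j] = tau[j, i] = tau_ij
    # SciPy minimizes, so maximize |tau| by minimizing 1 - |tau|.
    # Zero entries mean "no edge", so this assumes |tau| < 1 off the diagonal.
    weights = 1.0 - np.abs(tau)
    np.fill_diagonal(weights, 0.0)
    mst = minimum_spanning_tree(weights).tocoo()
    return [(int(i), int(j), float(tau[i, j])) for i, j in zip(mst.row, mst.col)]

# Simulated pseudo-observations (assumption: Gaussian dependence, for illustration only)
state_names = ['PA', 'MI', 'WI', 'GA', 'AZ']
corr = np.array([
    [1.0, 0.7, 0.7, 0.4, 0.3],
    [0.7, 1.0, 0.8, 0.3, 0.3],
    [0.7, 0.8, 1.0, 0.3, 0.3],
    [0.4, 0.3, 0.3, 1.0, 0.5],
    [0.3, 0.3, 0.3, 0.5, 1.0],
])
rng = np.random.default_rng(0)
U = norm.cdf(rng.standard_normal((5_000, 5)) @ np.linalg.cholesky(corr).T)

for i, j, tau_ij in first_vine_tree(U):
    print(f"{state_names[i]} - {state_names[j]}: tau = {tau_ij:.2f}")
```

Each selected edge becomes one pair copula in the first tree; later trees repeat the same selection on conditional pseudo-observations.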

Copula Selection Guide

| Scenario | Copula | Why |
|---|---|---|
| Symmetric moderate correlation | Gaussian | Simple, fast, no tail dependence |
| Symmetric with fat tails | Student-t (nu=3-6) | Captures joint extremes |
| Crash contagion | Clayton (theta=1-5) | Lower tail dependence only |
| Joint positive resolution | Gumbel (theta=1.5-4) | Upper tail dependence only |
| Complex asymmetric | Frank + rotation | Flexible, no tail dependence |
| d > 5 contracts | Vine copula | Pair-copula decomposition |
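
The guide lists Gumbel for joint positive resolution, but the implementation section only covers Gaussian, Student-t, and Clayton samplers. A possible Gumbel sampler, sketched here with the same Marshall-Olkin approach and a positive stable frailty drawn via Kanter's representation, could look like the following. The function names and the `U > 1 - p` resolution rule are assumptions of this sketch, chosen so that the upper-tail clustering shows up as joint YES outcomes.

```python
import numpy as np

def sample_gumbel_copula(theta, d, N, rng):
    """Draw N samples from a d-dimensional Gumbel copula (theta >= 1)."""
    alpha = 1.0 / theta
    # Positive stable frailty V with Laplace transform exp(-s**alpha) (Kanter's representation)
    Th = rng.uniform(0.0, np.pi, N)
    W = rng.exponential(1.0, N)
    V = (np.sin(alpha * Th) / np.sin(Th) ** (1.0 / alpha)
         * (np.sin((1.0 - alpha) * Th) / W) ** ((1.0 - alpha) / alpha))
    # Marshall-Olkin: U_i = psi(E_i / V) with generator psi(s) = exp(-s**alpha)
    E = rng.exponential(1.0, (N, d))
    return np.exp(-(E / V[:, None]) ** alpha)

def simulate_correlated_outcomes_gumbel(probs, theta=2.0, N=100_000, rng=None):
    """Binary outcomes with Gumbel dependence; YES outcomes cluster in the upper tail."""
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(probs)
    U = sample_gumbel_copula(theta, len(probs), N, rng)
    # P(U > 1 - p) = p, so marginals are preserved while joint high U means joint YES
    return (U > 1.0 - probs).astype(int)

rng = np.random.default_rng(42)
outcomes = simulate_correlated_outcomes_gumbel([0.52, 0.53, 0.51], theta=2.0, N=200_000, rng=rng)
print("marginal YES rates:", outcomes.mean(axis=0).round(3))
print("P(all YES):", outcomes.all(axis=1).mean().round(4))
```

For a Gumbel copula, Kendall's tau equals 1 - 1/theta, which gives a quick sanity check on the sampler and a way to pick theta from an observed rank correlation.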