pm-signal-s9-integration - SKILL.md Agent Skill

name: pm-signal-s9-integration
description: Integrate external quantitative models with PM Signal Engine as S9 Historical Edge signal. Pattern for comparing model predictions to Polymarket market prices to generate trading signals.
version: 1.0.0
author: Hermes Agent
tags: [pm-signal-engine, polymarket, prediction-markets, arbitrage, trading-signals, historical-edge]

PM Signal Engine S9 Integration

Integrate external quantitative models (simulations, ML predictions) with the PM Signal Engine's 8-signal framework as Signal S9: Historical Edge.

This pattern compares model-estimated probabilities to live Polymarket market prices, generating arbitrage signals with Kelly criterion sizing.

Use Cases

WC 2026 winner probabilities from tournament simulation
Election forecasts from polling models
Sports championship models vs market prices
Any domain where you have a quantitative model that predicts binary outcomes

Architecture

Model Predictions ──┐
                    ├──→ Signal Provider ──→ PM Signal Engine (S9)
Polymarket Prices ──┘

Signal Cluster: STRUCTURAL (Cross-market + Historical patterns)
Fire Condition: |Model P - Market P| ≥ threshold (typically 0.02, i.e. 2 percentage points)
Output Format: SignalResult compatible with PM Signal Engine

Implementation Pattern

1. Model Loading

Load model probabilities from JSON/simulation output with fallback strategies:

import json


class SignalProvider:
    def _load_model(self, path: str):
        """Load model predictions; std dev is a heuristic 10% of each probability"""
        try:
            # Try primary advancement model first (v3_advancement_probs.json)
            with open(path, 'r') as f:
                data = json.load(f)
            
            self.model_probs = {}
            self.model_stdev = {}  # For confidence weighting
            
            for team, val in data.items():
                if isinstance(val, dict):
                    self.model_probs[team] = val.get('win', 0)
                    self.model_stdev[team] = val.get('win', 0) * 0.1  # 10% relative std dev
                else:
                    self.model_probs[team] = val
                    self.model_stdev[team] = val * 0.1
            
            print(f"Loaded v3 advancement model with {len(self.model_probs)} teams")
            
        except Exception as e:
            print(f"Warning: Could not load v3 advancement model from {path}: {e}")
            self._load_fallback_model()
    
    def _load_fallback_model(self):
        """Fallback model estimates based on FIFA rankings"""
        self.model_probs = {
            'Brazil': 0.0564, 'Germany': 0.0192, 'Italy': 0.0203, 'Argentina': 0.0564,
            'France': 0.0204, 'England': 0.0339, 'Spain': 0.0346, 'Netherlands': 0.0254,
            'Uruguay': 0.0475, 'Portugal': 0.0464
        }
        self.model_stdev = {team: prob * 0.1 for team, prob in self.model_probs.items()}
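The loader above accepts two JSON shapes per team: a bare probability, or an object carrying a 'win' key. A minimal sketch of that normalization as a standalone function follows. The name `normalize_model_entries` and the sample input are hypothetical, and the 10% relative standard deviation is the same heuristic the loader uses, not a measured uncertainty.

```python
import json
from typing import Dict, Tuple


def normalize_model_entries(data: dict, rel_std: float = 0.1) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (probs, stdevs) from either supported JSON shape."""
    probs: Dict[str, float] = {}
    stdevs: Dict[str, float] = {}
    for team, val in data.items():
        # Nested shape: {"win": 0.05, ...}; flat shape: 0.05
        p = float(val.get("win", 0)) if isinstance(val, dict) else float(val)
        probs[team] = p
        stdevs[team] = p * rel_std  # heuristic, not a measured uncertainty
    return probs, stdevs


# Illustrative input only; these numbers are not model output
raw = json.loads('{"TeamA": {"win": 0.05}, "TeamB": 0.02}')
probs, stdevs = normalize_model_entries(raw)
print(probs)
```

Keeping the shape handling in one place makes it easier to add a real uncertainty field later without touching the file-loading and fallback logic.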

2. Market Data Fetching

Use the Polymarket skill to discover and fetch live prices:

from wc_polymarket_feed import WC2026RealTimeFeed

async def fetch_market_odds():
    """Fetch live odds from Polymarket Gamma API"""
    feed = WC2026RealTimeFeed()
    await feed.fetch_current_odds()
    
    # Returns Dict[str, MarketData]
    return feed.current_odds

Discovery Pattern: When markets aren't grouped under a single event (WC 2026 style), use keyword filtering on /markets:

for market in markets:
    if 'win the 2026 fifa world cup' in market['question'].lower():
        team = extract_team(market['question'])
        yes_prob = parse_outcome_price(market['outcomePrices'])
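The fragment above relies on an `extract_team` helper that is not shown. One possible sketch, assuming the question text follows the wording "Will <team> win the 2026 FIFA World Cup?" (the exact wording is an assumption, as are the sample market records):

```python
import re
from typing import Optional

# Assumed question wording: "Will <team> win the 2026 FIFA World Cup?"
_QUESTION_RE = re.compile(
    r"^will\s+(?P<team>.+?)\s+win the 2026 fifa world cup", re.IGNORECASE
)


def extract_team(question: str) -> Optional[str]:
    """Return the team name from a winner-market question, or None if it does not match."""
    match = _QUESTION_RE.match(question.strip())
    return match.group("team") if match else None


# Hypothetical market records, shaped like the fields used above
markets = [
    {"question": "Will France win the 2026 FIFA World Cup?", "outcomePrices": '["0.16", "0.84"]'},
    {"question": "Will it rain tomorrow?", "outcomePrices": '["0.5", "0.5"]'},
]
teams = [t for t in (extract_team(m["question"]) for m in markets) if t]
print(teams)
```

Returning None for non-matching questions lets the caller skip unrelated markets instead of recording a garbage team name.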

3. Signal Computation

Calculate edge and Kelly-optimal position sizing:

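from dataclasses import dataclass


@dataclass
class HistoricalEdgeSignal:
    # Field layout assumed from the values returned below
    signal_name: str
    fired: bool
    direction: str
    strength: float
    team: str
    edge: float
    kelly_fraction: float
    expected_value: float


def compute_signal(team: str, model_prob: float, market_prob: float) -> HistoricalEdgeSignal:
    """Compute S9 signal for a single market"""
    edge = model_prob - market_prob

    # Signal fire logic
    BUY_THRESHOLD = 0.02   # 2% edge
    SELL_THRESHOLD = -0.02

    fired = abs(edge) >= BUY_THRESHOLD
    direction = "BUY" if edge >= BUY_THRESHOLD else ("SELL" if edge <= SELL_THRESHOLD else "NEUTRAL")

    # Signal strength (0-1); an edge of 10 points or more saturates at 1.0
    signal_strength = min(abs(edge) / 0.10, 1.0)

    # Defaults cover degenerate prices (0 or 1), where the odds are undefined
    kelly = 0.0
    ev = 0.0

    if 0 < market_prob < 1:
        # Kelly criterion for buying YES at the market price
        odds = 1.0 / market_prob
        b = odds - 1  # net payout per dollar staked
        p = model_prob
        q = 1 - p
        kelly = (b * p - q) / b
        kelly = max(0.0, min(kelly, 0.25))  # Floor at 0 (SELL signals), cap at 25%

        # Expected value per dollar staked on YES
        ev = (model_prob * b) - ((1 - model_prob) * 1)

    return HistoricalEdgeSignal(
        signal_name='historical_edge',
        fired=fired,
        direction=direction,
        strength=signal_strength,
        team=team,
        edge=edge,
        kelly_fraction=kelly,
        expected_value=ev,
    )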
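For a YES share bought at price m (the market probability) with model probability p, the Kelly fraction and expected value in the code above reduce to closed forms. Substituting b = 1/m - 1 and q = 1 - p, and assuming 0 < m < 1:

```latex
\begin{aligned}
b &= \frac{1}{m} - 1 = \frac{1-m}{m}, \\
f^{*} &= \frac{bp - q}{b} = p - \frac{(1-p)\,m}{1-m} = \frac{p - m}{1 - m}, \\
\mathrm{EV} &= p\,b - (1-p) = \frac{p}{m} - 1 .
\end{aligned}
```

As an illustrative case, p = 0.10 and m = 0.05 give f* = 0.05 / 0.95 ≈ 0.053 and EV = 1.0 per dollar staked. The closed form also shows that f* is zero or negative whenever p ≤ m, which is why markets on the SELL side end up with a Kelly fraction of 0 once the result is floored at zero.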

4. PM Signal Engine Format

Convert individual signals to cluster-compatible format:

def get_pm_format(self, signals: List[HistoricalEdgeSignal]) -> Dict:
    """Convert to PM Signal Engine SignalResult format"""
    fired = [s for s in signals if s.fired]
    
    if not fired:
        return {'signal_name': 'historical_edge', 'fired': False, ...}
    
    # Aggregate direction
    buys = len([s for s in fired if s.direction == 'BUY'])
    sells = len([s for s in fired if s.direction == 'SELL'])
    
    if buys > sells:
        direction = 'YES'
    elif sells > buys:
        direction = 'NO'
    else:
        direction = 'NEUTRAL'
    
    # Weighted strength
    total_weight = sum(s.strength * s.kelly_fraction for s in fired)
    
    return {
        'signal_name': 'historical_edge',
        'fired': True,
        'strength': min(total_weight, 1.0),
        'direction': direction,
        'confidence': max(s.strength for s in fired),
        'metadata': {
            'signals_generated': len(fired),
            'buy_count': buys,
            'sell_count': sells,
            'avg_edge': sum(s.edge for s in fired) / len(fired)
        }
    }

Hypothesis Testing Framework

Track model predictions vs eventual outcomes for continuous improvement:

class HypothesisTester:
    def record_prediction(self, team: str, model_prob: float, 
                          market_prob: float, timestamp: str):
        """Store prediction for later outcome comparison"""
        
    def evaluate_accuracy(self, actual_outcomes: Dict[str, bool]) -> Dict:
        """Brier score and calibration analysis"""
        # Calculate Brier score
        # Measure calibration (bin predictions vs actual frequency)
        # Identify systematic biases
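The evaluation step is only outlined above. A minimal sketch of the two metrics it names, written as standalone functions rather than as the HypothesisTester methods themselves; the function names, the equal-width binning, and the sample values are assumptions for illustration:

```python
from typing import Dict, List, Tuple


def brier_score(predictions: Dict[str, float], outcomes: Dict[str, bool]) -> float:
    """Mean squared error between predicted probability and the 0/1 outcome."""
    keys = [k for k in predictions if k in outcomes]
    if not keys:
        raise ValueError("no overlapping predictions and outcomes")
    return sum((predictions[k] - float(outcomes[k])) ** 2 for k in keys) / len(keys)


def calibration_bins(predictions: Dict[str, float], outcomes: Dict[str, bool],
                     n_bins: int = 5) -> List[Tuple[float, float, int]]:
    """Per non-empty bin: (mean predicted probability, observed frequency, count)."""
    bins: List[List[Tuple[float, float]]] = [[] for _ in range(n_bins)]
    for key, p in predictions.items():
        if key in outcomes:
            idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
            bins[idx].append((p, float(outcomes[key])))
    return [
        (sum(p for p, _ in b) / len(b), sum(o for _, o in b) / len(b), len(b))
        for b in bins if b
    ]


# Illustrative values only
preds = {"A": 0.8, "B": 0.2, "C": 0.5}
actual = {"A": True, "B": False, "C": False}
score = brier_score(preds, actual)
table = calibration_bins(preds, actual)
print(round(score, 4), table)
```

A lower Brier score is better; a bin whose observed frequency sits well below its mean prediction suggests the model is overconfident in that probability range.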

Monitoring Pipeline

Set up cron jobs for continuous tracking:

| Job | Schedule | Purpose |
|-----|----------|---------|
| `signal-monitor` | Every 6h | Record market snapshots, track price history |
| `arbitrage-alert` | Every 2h | Alert on edges >5% |
| `weekly-report` | Weekly | Brier score, calibration, model refinement suggestions |

Key Insights from WC 2026 Implementation

Market Structure Discovery

WC 2026 uses ungrouped binary markets — each team has its own Yes/No market rather than a single market with multiple outcomes. This means:

48 separate markets to track
Prices don't sum to 100% (each market is priced on its own, even though only one team can win)
Discovery requires searching /markets by question keyword
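Because each team trades in its own market, it can help to check how far the YES prices drift from a coherent total. A small sketch with hypothetical function names and made-up prices; the rescaling is a rough comparison aid, not a fair-value estimate:

```python
from typing import Dict


def implied_total(yes_prices: Dict[str, float]) -> float:
    """Sum of YES prices across all team markets (1.0 would be a coherent book)."""
    return sum(yes_prices.values())


def normalized(yes_prices: Dict[str, float]) -> Dict[str, float]:
    """Rescale YES prices so they sum to 1."""
    total = implied_total(yes_prices)
    if total <= 0:
        raise ValueError("prices must sum to a positive number")
    return {team: p / total for team, p in yes_prices.items()}


# Illustrative prices only
prices = {"A": 0.30, "B": 0.45, "C": 0.35}
total = implied_total(prices)
shares = normalized(prices)
print(round(total, 2))
```

Comparing model probabilities against the rescaled shares as well as the raw prices gives a sense of how much of an apparent edge comes from the markets collectively overpricing or underpricing the field.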

Probability Parsing

The Gamma API returns outcomePrices as a JSON-encoded string rather than a list, so parse it defensively:

import json

def parse_outcome_price(prices_field):
    """Handle both JSON strings and pre-parsed lists"""
    if isinstance(prices_field, str):
        prices = json.loads(prices_field)
    else:
        prices = prices_field

    # First element is "Yes" probability
    return float(prices[0]) if prices else 0.0
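The parser above assumes the first price belongs to "Yes". A slightly more defensive sketch looks up the position of "Yes" in the market's outcomes field instead. That this field is present and encoded the same way as outcomePrices is an assumption here, and `yes_price` is a hypothetical helper name.

```python
import json
from typing import Optional


def _as_list(field) -> list:
    """Accept a JSON-encoded string or an already-parsed list."""
    if isinstance(field, str):
        return json.loads(field)
    return list(field or [])


def yes_price(market: dict) -> Optional[float]:
    """Return the price of the "Yes" outcome, or None if it cannot be located."""
    outcomes = [str(o).lower() for o in _as_list(market.get("outcomes"))]
    prices = _as_list(market.get("outcomePrices"))
    if "yes" not in outcomes:
        return None
    idx = outcomes.index("yes")
    if idx >= len(prices):
        return None  # outcomes and prices are out of step
    return float(prices[idx])


# Hypothetical record with the outcomes in reversed order
sample = {"outcomes": '["No", "Yes"]', "outcomePrices": '["0.84", "0.16"]'}
print(yes_price(sample))
```

Returning None rather than 0 keeps a missing price distinguishable from a market that really trades near zero.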

Kelly Caution

The Kelly formula can suggest extreme position sizes (>50%) for high-edge opportunities, and it treats the model probability as exact. Always cap the fraction, e.g. kelly = min(kelly, 0.25), to limit the damage when the model is wrong.
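To see the cap in action, the sketch below computes the Kelly fraction for a YES purchase with and without the 25% ceiling. The function name and the input probabilities are illustrative assumptions.

```python
def kelly_fraction(model_prob: float, market_prob: float, cap: float = 0.25) -> float:
    """Capped Kelly fraction for buying YES at market_prob."""
    if not 0 < market_prob < 1:
        return 0.0  # odds are undefined at a price of 0 or 1
    # Algebraically the same as (b*p - q) / b with b = 1/market_prob - 1
    raw = (model_prob - market_prob) / (1 - market_prob)
    return max(0.0, min(raw, cap))


# Model says 60%, market says 20%: uncapped Kelly would stake half the bankroll
uncapped = kelly_fraction(0.60, 0.20, cap=1.0)
capped = kelly_fraction(0.60, 0.20)
print(round(uncapped, 3), capped)
```

A gap this wide between model and market is more often a sign of a model error or a mismatched market than of a real opportunity, which is the practical reason for the cap.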

Example Output

TOP TRADE OPPORTUNITIES (≥3% edge)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Rank  Team            Action   Edge%   Model%   Market%   Kelly
──────────────────────────────────────────────────────────────
1     Turkiye         BUY      +6.6%     7.2%      0.7%    6.6%
2     Uruguay         BUY      +3.9%     4.8%      0.9%    3.9%
3     France          SELL    -14.2%     1.7%     16.0%    0.0%
4     Spain           SELL    -13.0%     3.2%     16.2%    0.0%

PM SIGNAL ENGINE OUTPUT (S9)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Signal: historical_edge
Fired: True
Direction: NEUTRAL (equal buy/sell signals)
Strength: 0.06
Confidence: 0.93
Metadata:
  signals_generated: 10
  buy_count: 5
  sell_count: 5
  max_edge_team: France (-14.2%)

Integration with Existing Skills

| Skill | Usage |
|-------|-------|
| `polymarket` | Fetch live market prices via Gamma API |
| `pm-signal-engine` | Consume S9 signal in cluster framework |
| `cronjob` | Schedule monitoring and alerting |

References

wc_polymarket_feed.py — Production implementation of Polymarket feed
wc2026_signal_engine_s9.py — S9 Signal Provider implementation
wc2026_monitor.py — Continuous monitoring pipeline