signal-postmortem - SKILL.md Agent Skill

name: signal-postmortem
description: Record and analyze post-trade outcomes for signals generated by the edge pipeline and other skills. Track false positives, missed opportunities, and regime mismatches. Feed results back into edge-signal-aggregator weights and the skill improvement backlog.

Signal Postmortem

Overview

Signal Postmortem records and analyzes the outcomes of trading signals generated by the edge pipeline, screeners, and other skills. It compares predicted edge direction against 5-day and 20-day realized returns, categorizes outcomes (true positive, false positive, missed opportunity, regime mismatch), and generates feedback for edge-signal-aggregator weight adjustments and skill improvement backlog entries.

When to Use

After a trade has been closed and you want to record the outcome
When reviewing a batch of signals that have reached their holding period (5 or 20 days)
To identify systematic false positive patterns from specific skills
To generate feedback for edge-signal-aggregator weight calibration
When building a skill improvement backlog from decision quality metrics
For periodic (weekly/monthly) signal quality audits

Prerequisites

Python 3.9+
FMP API key (optional, for fetching realized returns if not provided manually)
Python standard library, plus the requests package for API calls
Input: signal records in JSON format (from edge-signal-aggregator or screener outputs)

API Key Setup (Optional)

If you want to automatically fetch price data for return calculations, set up the FMP API key:

export FMP_API_KEY=your_api_key_here

Alternatively, pass the key via command line with --api-key YOUR_KEY. Without an API key, you can still record outcomes manually by providing --exit-price and --exit-date.
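The key lookup described above can be sketched as follows. This is a minimal illustration, not the recorder's actual code: the helper name resolve_api_key is hypothetical, and the precedence (command line first, then environment) is an assumption.

```python
import argparse
import os
from typing import List, Optional


def resolve_api_key(argv: Optional[List[str]] = None) -> Optional[str]:
    """Return the key from --api-key, else from FMP_API_KEY, else None.

    Hypothetical helper: the precedence shown here is an assumption.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--api-key", default=None)
    # parse_known_args lets this sketch ignore the recorder's other flags.
    args, _unknown = parser.parse_known_args(argv)
    return args.api_key or os.environ.get("FMP_API_KEY") or None
```

A result of None would correspond to the manual mode, where --exit-price and --exit-date are supplied instead.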

Workflow

Step 1: Prepare Signal Records

Gather closed or matured signal records. Each record should include:

signal_id: Unique identifier
ticker: Stock symbol
signal_date: Date signal was generated
predicted_direction: LONG or SHORT
source_skill: Which skill generated the signal
entry_price: Price at signal generation (optional, for manual override)

# Example: List signals ready for postmortem (5+ days old)
python3 skills/signal-postmortem/scripts/postmortem_recorder.py \
  --list-ready \
  --signals-dir state/signals/ \
  --min-days 5
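As a rough sketch of what a readiness filter like --list-ready with --min-days might do, the function below keeps signals that are old enough. The function name signals_ready is hypothetical, and counting calendar days rather than trading days is an assumption; the recorder may count differently.

```python
from datetime import date
from typing import Dict, List


def signals_ready(signals: List[Dict], min_days: int, today: date) -> List[Dict]:
    """Keep signals whose signal_date is at least min_days calendar days old.

    Assumption: age is measured in calendar days, not trading days.
    """
    ready = []
    for sig in signals:
        age_days = (today - date.fromisoformat(sig["signal_date"])).days
        if age_days >= min_days:
            ready.append(sig)
    return ready
```

Passing today explicitly, instead of calling date.today() inside the function, keeps the filter easy to test against a fixed date.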

Step 2: Record Outcomes

Run the postmortem recorder to fetch realized returns (this requires the FMP API key) and classify outcomes.

python3 skills/signal-postmortem/scripts/postmortem_recorder.py \
  --signals-file state/signals/aggregated_signals_2026-03-10.json \
  --holding-periods 5,20 \
  --output-dir reports/

For manual outcome recording (when price data is already available):

python3 skills/signal-postmortem/scripts/postmortem_recorder.py \
  --signal-id sig_aapl_20260310_abc \
  --exit-price 178.50 \
  --exit-date 2026-03-15 \
  --outcome-notes "Closed at target, +3.2% in 5 days" \
  --output-dir reports/
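The return arithmetic behind both modes can be sketched as a simple percentage change from entry to exit. Both function names are hypothetical, and whether the recorder stores raw returns or direction-adjusted returns is not stated here, so the sketch shows both.

```python
def realized_return(entry_price: float, exit_price: float) -> float:
    """Raw fractional return from entry to exit (0.03 means +3%)."""
    if entry_price <= 0:
        raise ValueError("entry_price must be positive")
    return (exit_price - entry_price) / entry_price


def directional_return(predicted_direction: str, entry_price: float,
                       exit_price: float) -> float:
    """Return seen from the signal's side: positive when the call was right.

    Assumption: a SHORT signal profits by the negative of the raw return.
    """
    if predicted_direction not in ("LONG", "SHORT"):
        raise ValueError("predicted_direction must be LONG or SHORT")
    raw = realized_return(entry_price, exit_price)
    return raw if predicted_direction == "LONG" else -raw
```

With hypothetical prices, an entry at 100.0 and an exit at 103.0 give a raw return of about 0.03, which a SHORT signal would see as about -0.03.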

Step 3: Classify Outcomes

The recorder automatically classifies each signal into one of four categories:

Category	Definition
TRUE_POSITIVE	Predicted direction matched realized return sign
FALSE_POSITIVE	Predicted direction opposite to realized return
MISSED_OPPORTUNITY	Signal not taken but would have been profitable
REGIME_MISMATCH	Signal failed due to market regime change

Classification rules are documented in references/outcome-classification.md.
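One way to read the four categories as code is sketched below. This is an illustration only: the function classify_outcome, its signal_taken parameter, the precedence between categories, and the treatment of a zero return are all assumptions, and references/outcome-classification.md remains the authority.

```python
from typing import Optional


def classify_outcome(predicted_direction: str,
                     realized_return: float,
                     signal_taken: bool = True,
                     regime_at_signal: Optional[str] = None,
                     regime_at_exit: Optional[str] = None) -> str:
    """Map one signal to a category from the table (assumed precedence)."""
    if predicted_direction not in ("LONG", "SHORT"):
        raise ValueError("predicted_direction must be LONG or SHORT")
    direction_sign = 1 if predicted_direction == "LONG" else -1
    # Assumption: a zero return counts as a miss, not a hit.
    correct = direction_sign * realized_return > 0
    if correct:
        return "TRUE_POSITIVE" if signal_taken else "MISSED_OPPORTUNITY"
    # Assumption: a failed signal is blamed on the regime only when both
    # regimes are known and they differ.
    regime_changed = (regime_at_signal is not None
                      and regime_at_exit is not None
                      and regime_at_signal != regime_at_exit)
    return "REGIME_MISMATCH" if regime_changed else "FALSE_POSITIVE"
```

Checking for a regime change only after the direction test means a correct call is never relabeled as a regime mismatch, which is one possible design choice among several.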

Step 4: Generate Feedback Files

Generate feedback for downstream consumers:

# Generate weight adjustment suggestions for edge-signal-aggregator
python3 skills/signal-postmortem/scripts/postmortem_analyzer.py \
  --postmortems-dir reports/postmortems/ \
  --generate-weight-feedback \
  --output-dir reports/

# Generate skill improvement backlog entries
python3 skills/signal-postmortem/scripts/postmortem_analyzer.py \
  --postmortems-dir reports/postmortems/ \
  --generate-improvement-backlog \
  --output-dir reports/

Step 5: Review Summary Statistics

Generate aggregate statistics by skill, by ticker, and by time period:

python3 skills/signal-postmortem/scripts/postmortem_analyzer.py \
  --postmortems-dir reports/postmortems/ \
  --summary \
  --group-by skill,month \
  --output-dir reports/
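To make the skill,month grouping concrete, the sketch below computes a false positive rate per (skill, month) pair from postmortem records. The function name false_positive_rates is hypothetical, and the analyzer may report other statistics; the field names follow the postmortem record format in this document.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def false_positive_rates(records: Iterable[Dict]) -> Dict[Tuple[str, str], float]:
    """False positive rate keyed by (source_skill, 'YYYY-MM' of signal_date)."""
    counts = defaultdict(lambda: {"total": 0, "false_positive": 0})
    for rec in records:
        # The first seven characters of an ISO date are the year and month.
        key = (rec["source_skill"], rec["signal_date"][:7])
        counts[key]["total"] += 1
        if rec["outcome_category"] == "FALSE_POSITIVE":
            counts[key]["false_positive"] += 1
    return {key: c["false_positive"] / c["total"] for key, c in counts.items()}
```

Each group in the result has at least one record by construction, so the division never sees a zero total.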

Output Format

Postmortem Record (JSON)

{
  "schema_version": "1.0",
  "postmortem_id": "pm_sig_aapl_20260310_abc",
  "signal_id": "sig_aapl_20260310_abc",
  "ticker": "AAPL",
  "signal_date": "2026-03-10",
  "source_skill": "edge-signal-aggregator",
  "predicted_direction": "LONG",
  "entry_price": 172.50,
  "realized_returns": {
    "5d": 0.032,
    "20d": 0.058
  },
  "exit_price": 178.50,
  "exit_date": "2026-03-15",
  "holding_days": 5,
  "outcome_category": "TRUE_POSITIVE",
  "regime_at_signal": "RISK_ON",
  "regime_at_exit": "RISK_ON",
  "outcome_notes": "Clean breakout, held through minor pullback",
  "recorded_at": "2026-03-17T10:30:00Z"
}

Weight Feedback (JSON)

{
  "schema_version": "1.0",
  "generated_at": "2026-03-17T10:35:00Z",
  "analysis_period": {
    "from": "2026-02-01",
    "to": "2026-03-15"
  },
  "skill_adjustments": [
    {
      "skill": "vcp-screener",
      "current_weight": 1.0,
      "suggested_weight": 0.85,
      "reason": "15% false positive rate in RISK_OFF regime",
      "sample_size": 42
    }
  ],
  "confidence": "MEDIUM",
  "min_sample_threshold": 20
}
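The relationship between the fields above can be illustrated with a simple rule that scales the current weight by one minus the false positive rate and declines to suggest anything below the sample threshold. This rule is an assumption chosen because it reproduces the example's numbers; the analyzer's actual formula is not documented here, and the function name suggest_weight is hypothetical.

```python
from typing import Optional


def suggest_weight(current_weight: float,
                   false_positive_rate: float,
                   sample_size: int,
                   min_sample_threshold: int = 20) -> Optional[float]:
    """Return a suggested weight, or None when the sample is too small.

    Assumed rule: shrink the weight in proportion to the false positive rate.
    """
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    if sample_size < min_sample_threshold:
        return None
    return round(current_weight * (1.0 - false_positive_rate), 4)
```

Under this rule, a weight of 1.0 with a 15% false positive rate over 42 signals yields 0.85, while the same rate over 10 signals yields no suggestion.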

Skill Improvement Backlog Entry (YAML)

- skill: vcp-screener
  issue_type: false_positive_cluster
  severity: medium
  evidence:
    false_positive_rate: 0.15
    sample_size: 42
    regime_correlation: RISK_OFF
  suggested_action: "Add regime filter or reduce signal confidence in RISK_OFF"
  generated_by: signal-postmortem
  generated_at: "2026-03-17T10:35:00Z"

Summary Report (Markdown)

Summary reports are saved to reports/ as postmortem_summary_YYYY-MM-DD.md.

Resources

scripts/postmortem_recorder.py -- Records individual signal outcomes
scripts/postmortem_analyzer.py -- Generates feedback and summary statistics
references/outcome-classification.md -- Classification rules and edge cases
references/feedback-integration.md -- How to integrate feedback with downstream skills

Key Principles

Honest Attribution -- Every outcome is attributed to its source skill for accountability
Regime Awareness -- Regime context is recorded to distinguish skill failure from market regime shifts
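Minimum Sample Size -- Weight adjustments are suggested only for skills with at least 20 recorded signals (min_sample_threshold), because rates from smaller samples are too noisy to act on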
Feedback Loop Closure -- Results flow back to improve both signal aggregation and skill quality