name: forecast-monitor
description: >
Use this skill when the user wants to turn a business metric forecast or financial model into
a living monitoring system — one where actuals are automatically compared against forecast
drivers, variances surface at the driver level (not just output), and alerts fire based on
sensitivity-derived thresholds. Trigger whenever the user has a forecast model (spreadsheet,
doc, structured data) and wants to: (1) build a variance monitoring system from it, (2) define
driver-level alerts, (3) reduce the manual work of reconciling actuals vs forecast, or (4)
avoid rebuilding dashboards every time the model changes. Also trigger for phrases like
"model-driven monitoring", "forecast to monitoring pipeline", "driver-based alerts",
"automate variance analysis", or "connect my model to a BI tool".
Forecast-to-Monitor Skill
Purpose
Given a business metric forecast (revenue, GMV, DAU, etc.) built on an explicit driver structure,
this skill guides an agent to construct a model-driven monitoring system that:
Tracks actuals against forecast at the driver level, not just the output metric
Derives alert thresholds from the model's own sensitivity analysis
Validates assumption consistency (without replacing the human's job of setting them)
Produces monitoring artefacts appropriate to the available environment
This is a methodology skill — it is tool- and environment-agnostic. The agent selects
output form (BI dashboard, spreadsheet, alerting config, SQL views, etc.) based on what is
available and what the user needs.
Core Concepts
1. The Driver-Output Hierarchy
Every forecast metric is a function of drivers:
Output Metric = f(Driver₁, Driver₂, ... Driverₙ)
Example:
Revenue = Users × ARPU
Users = Users_prev + New_users − Churned_users
ARPU = Avg_price × Avg_usage_per_user
Monitoring only the output metric means variance attribution requires manual re-derivation
every cycle. The goal is to make driver-level variance visible automatically, so the output
variance is already explained when it lands.
2. Driver Classification
Before building any monitoring artefact, classify every driver in the model:
| Class | Definition | Monitoring approach |
|---|---|---|
| Core driver | Structural; present across all business phases (e.g. users, ARPU, churn) | Always monitored; stable definition |
| Phase driver | Stage-specific; valid for current business model only (e.g. a single paid channel's CAC) | Monitored while active; replaceable without touching core |
| Mix-shift driver | Aggregate metric movements caused by segment composition change, not within-segment change | Monitored as structural drift; trend alert, not threshold alert (see below) |
| Assumption | Human-set input; intentional, documented, not derived from data | Not monitored as a live metric; validated for consistency only |
| Output | Derived result of drivers × assumptions | Monitored as a lagging signal; explained by drivers |
Assumptions are human territory. The monitoring system does not track assumption values
as live metrics. It may validate that actuals imply an assumption violation (e.g. "observed
ARPU is inconsistent with the pricing assumption"), but it does not attempt to auto-update
or override assumptions.
3. The Semantic Layer Dependency
This skill assumes a metric definition layer exists or will be created — a single source
of truth for how each driver is computed from raw data. This might be:
A dbt metrics layer
A shared SQL view library
A named range or formula definition sheet (in a spreadsheet context)
A config file consumed by a BI tool
**Every driver in the monitoring system must resolve to exactly one definition in the semantic
layer.** If two definitions exist for the same driver, resolve the conflict before proceeding.
Ambiguous definitions produce unreliable variance, which defeats the purpose.
4. Mix-Shift Drivers
Mix shift is a distinct variance source: an aggregate metric (e.g. ARPU, margin rate, CAC)
moves not because any segment changed, but because the composition of segments changed.
It must be decomposed separately from within-segment movement, otherwise the two effects
mask each other.
Decomposition structure:
Aggregate metric M = Σᵢ (share_i × M_i)
Variance of M = within-segment effect + mix-shift effect
within-segment effect = Σᵢ share_i_forecast × (M_i_actual − M_i_forecast)
mix-shift effect = Σᵢ (share_i_actual − share_i_forecast) × M_i_forecast
When to classify a driver as mix-shift:
The model contains segment-level breakdowns of an aggregate metric, OR
The user explicitly needs to monitor structural composition drift, AND
The semantic layer has segment-level definitions available
If neither condition holds, do not force mix-shift decomposition — monitor the aggregate
driver as a core driver and note that unexplained residuals may have mix-shift origins.
Monitoring logic for mix-shift drivers differs from standard drivers:
| Aspect | Standard driver | Mix-shift driver |
|---|---|---|
| Alert type | Drift alert (single-period threshold) | Trend alert (N consecutive periods moving same direction) |
| Threshold basis | Elasticity-derived % threshold | Directional: share moving consistently toward low-value segments |
| Frequency | Per driver's data cadence | Same cadence, but evaluated as a rolling window |
| Message content | Variance + implied output impact | Which segment is growing/shrinking + implied ARPU / margin drag |
Driver inventory entry for mix-shift:
- driver: arpu_mix_shift
class: mix-shift
segments: [enterprise, smb, consumer]
segment_definitions: [semantic_layer_ref]
data_source: known
frequency: monthly
elasticity_to_output: medium
alert_type: trend
trend_window: 3 # flag if mix drifting same direction for 3+ periods
Agent Workflow
Phase 0: Parse the Forecast Model
Before building anything, extract the model's structure:
Identify the target output metric(s) — what the model is ultimately forecasting
Extract the driver tree — the full decomposition from output to leaf drivers
Separate drivers from assumptions — drivers have data sources; assumptions are
human-set parameters (growth rates, market share targets, pricing decisions)
Note the time segmentation — which drivers have per-period granularity vs. terminal
assumptions; this determines monitoring frequency per driver
Extract any existing sensitivity analysis — which drivers have elasticity estimates
relative to the output metric
If the model does not have explicit sensitivity analysis, prompt the user or perform a
simple one-factor sensitivity pass before proceeding to Phase 2.
Output of Phase 0: A structured driver inventory (see template below).
Driver Inventory Template
─────────────────────────
metric_name: [e.g. Monthly Revenue]
output_metric: [e.g. revenue_usd]
driver_tree:
- driver: new_users
class: core
data_source: [known / unknown / TBD]
frequency: weekly
elasticity_to_output: high # from sensitivity analysis
- driver: churn_rate
class: core
data_source: [known / unknown / TBD]
frequency: biweekly
elasticity_to_output: high
- driver: avg_price
class: core
data_source: [known / unknown / TBD]
frequency: monthly
elasticity_to_output: medium
- driver: paid_channel_cac
class: phase
data_source: [known / unknown / TBD]
frequency: weekly
elasticity_to_output: low
assumptions:
- name: market_growth_rate
value: 12%
rationale: [as documented in model]
- name: pricing_uplift_y2
value: 5%
rationale: [as documented in model]
Phase 1: Define Alert Thresholds from Sensitivity
This is the critical step that connects the forecast model to the monitoring system —
thresholds are derived from elasticity, not set arbitrarily.
Principle: A driver with high elasticity to the output warrants a tighter alert threshold
and higher monitoring frequency. A low-elasticity driver can tolerate larger variance before
action is warranted.
Threshold derivation logic:
For each driver d with elasticity e_d (defined as % change in output / % change in d):
acceptable_output_variance_budget = T (e.g. 5% tolerable output miss)
driver_threshold(d) = T / e_d
Example:
T = 5%, e_churn = 8x → churn_threshold = 5% / 8 = 0.6pp
T = 5%, e_price = 2x → price_threshold = 5% / 2 = 2.5%
If elasticity values are ordinal (high/medium/low) rather than numeric, use the following
defaults as a starting point:
| Elasticity | Alert threshold | Monitoring frequency |
|---|---|---|
| High | ±5% of forecast (or ±0.5pp for rates) | Weekly or daily |
| Medium | ±10–15% of forecast | Biweekly |
| Low | ±20–25% of forecast | Monthly |
These defaults should be overridden with numeric elasticity wherever it exists.
Alert types:
Drift alert: driver is outside threshold for the current period
Trend alert: driver has moved in the same direction for N consecutive periods
(catches slow deterioration that single-period thresholds miss)
Assumption violation signal: actual implied value of an assumption is inconsistent
with the model's stated assumption (e.g. implied churn rate from actuals is 4.2% vs
model assumption of 2.5%); this is surfaced as a flag, not an alert — it informs the
human that an assumption may need revisiting
Phase 2: Map Drivers to Data Sources
For each core and phase driver, resolve its data source in the semantic layer:
For each driver in the inventory:
1. Confirm the definition exists in the semantic layer
2. If it does not exist: flag it and surface the definition needed
3. If multiple definitions exist: flag the conflict and pause until resolved
4. Note update cadence of the underlying data source
(this caps the monitoring frequency — can't monitor weekly if data is monthly)
Do not proceed to Phase 3 until every monitored driver has exactly one resolved definition.
Unresolved drivers should be listed explicitly so the user can action them.
Phase 3: Design the Monitoring Artefact
The form of the monitoring system depends on the available environment. Select or combine:
Option A — BI Dashboard Layer
When a BI tool (Looker, Metabase, Tableau, etc.) is available:
Create one view per driver that joins forecast values to actuals
Add a variance column:
(actual - forecast) / forecastAdd a threshold flag column:
abs(variance) > threshold → TRUECompose a summary view that shows: output metric variance, then driver variances ranked
by elasticity (highest impact drivers first)
Alerts can be configured as BI tool notifications or scheduled report thresholds
Structure the dashboard so output metric is the last thing shown, not the first.
The driver panel should explain the output variance before the user even reads the output row.
Option B — Spreadsheet / Google Sheets
When operating in a spreadsheet environment:
One tab: Forecast (human-maintained, locked for editing except assumption cells)
One tab: Actuals input (data entry or auto-populated via connector)
One tab: Variance (computed, never manually edited)
Column structure:
driver | forecast | actual | variance_abs | variance_pct | threshold | flagRows sorted by elasticity descending
Conditional formatting: flag cells red when
abs(variance_pct) > threshold
One tab: Assumption consistency (derived checks, e.g. implied churn vs assumed churn)
The Forecast tab is the single source of both the driver tree and the threshold values.
When the model is updated, only the Forecast tab changes; Variance and Assumption tabs
recompute automatically.
Option C — Alerting / Notification System
When the goal is proactive notification rather than a dashboard:
- Define alerts as structured config (YAML, JSON, or equivalent):
alerts:
- driver: churn_rate
elasticity: high
threshold_pct: 0.6 # in percentage points for rates
frequency: weekly
channel: slack # or email, pagerduty, etc.
message_template: >
Churn rate actual ({actual}%) is {variance}pp vs forecast ({forecast}%).
At current elasticity, this implies ~{output_impact}% revenue miss.
- driver: new_users
elasticity: high
threshold_pct: 5
frequency: weekly
channel: slack
message_template: >
New user adds ({actual}) are {variance_pct}% vs forecast ({forecast}).
Alert messages should always include the implied output impact, not just the driver
variance. This makes the alert actionable without the recipient having to re-derive it.
Option D — SQL Views + Scheduled Query
When operating in a data warehouse environment:
-- Pattern for each driver
CREATE OR REPLACE VIEW monitor_driver_churn AS
SELECT
period,
forecast_churn_rate,
actual_churn_rate,
(actual_churn_rate - forecast_churn_rate) AS variance_pp,
ABS(actual_churn_rate - forecast_churn_rate) > 0.006 AS alert_flag, -- threshold from elasticity
(actual_churn_rate - forecast_churn_rate) * 8 AS implied_rev_impact_x -- elasticity = 8
FROM forecast_actuals_churn;
-- Summary view: all drivers, ranked by elasticity
CREATE OR REPLACE VIEW monitor_summary AS
SELECT 'churn_rate' AS driver, 'high' AS elasticity, variance_pp AS variance, alert_flag, implied_rev_impact_x FROM monitor_driver_churn
UNION ALL
SELECT 'new_users' AS driver, 'high' AS elasticity, ...
UNION ALL
SELECT 'avg_price' AS driver, 'medium' AS elasticity, ...
ORDER BY
CASE elasticity WHEN 'high' THEN 1 WHEN 'medium' THEN 2 ELSE 3 END,
ABS(variance) DESC;
Scheduled queries run on the driver's monitoring frequency; a downstream alert job reads
the summary view and notifies on flagged rows.
Phase 4: Rolling Forecast Integration
The monitoring system must not be static. When the forecast model is updated:
Driver tree changes: If a new driver is added or a phase driver is retired, update
the driver inventory and re-derive thresholds. The semantic layer definition for any
new driver must be resolved before it enters the monitoring system.
Assumption changes: Assumption updates do not require monitoring system changes.
They do require re-running the sensitivity analysis if the elasticity structure may
have changed materially (e.g. the business model shifted).
Threshold recalibration: Thresholds derived from elasticity should be reviewed
whenever the model's sensitivity analysis is updated — not on a fixed calendar.
Model retirement: When the business model changes significantly (new revenue stream,
new pricing structure), the old model should be formally retired rather than patched.
The monitoring system for the old model stays live but is labelled historical.
A new driver inventory is created for the new model.
Phase 5: Assumption Consistency Validation
This is distinct from driver monitoring. The goal is to surface signals that a model
assumption may no longer hold — not to replace it.
For each assumption in the model, define a corresponding observable:
Assumption: churn_rate = 2.5% (model input)
Observable: implied_churn = churned_users / users_prev_period (from actuals)
Validation: IF implied_churn deviates from assumption by > X for Y consecutive periods
→ surface flag: "Observed churn (3.8%) has diverged from model assumption
(2.5%) for 3 consecutive months. Consider revisiting this assumption."
This validation output is a flag for human review, not an automated correction.
The flag should always include: observed value, assumed value, duration of divergence,
and a plain-language description of what this implies for the forecast.
Output Checklist
Before delivering any monitoring artefact, verify:
Every monitored driver resolves to exactly one definition in the semantic layer
Thresholds are derived from sensitivity / elasticity, not set arbitrarily
Output metric variance is explained by driver variances (no unexplained residual
larger than rounding/data-lag)Mix-shift drivers (if present) use trend alerts, not single-period thresholds
Mix-shift decomposition separates within-segment from composition effects
Assumption cells / parameters are clearly separated and not auto-updated by the system
Assumption consistency checks are present for every major model assumption
Alert messages include implied output impact, not just driver variance
The system has a defined update procedure when the forecast model changes
Phase drivers are clearly labelled as replaceable; core drivers are clearly stable
Extension: Analysis Lazy Reuse (Alert-Triggered)
This extension is non-standard and optional. It applies when a driver has previously
been the subject of a one-off deep-dive analysis that decomposed it further than the
regular monitoring layer — and that decomposition logic should be replayable on demand
when the same driver fires an alert again.
It does not run on a schedule. It only activates when a driver alert fires.
When to use this extension
A driver alert has fired in the past, and a human analyst performed a manual
deep-dive to explain it (e.g. churn spike traced to cohort X + region Y)
The decomposition logic from that analysis is available in a reusable form
(SQL, notebook, named calculation, annotated breakdown)
The same driver has now fired again, and the question is: does the prior
decomposition explain the current variance, or is this a new pattern?
What it does
On alert trigger for a registered driver:
Check the analysis registry — does a stored decomposition exist for this driver?
If yes: replay the decomposition logic against current-period actuals
Evaluate explanatory power: does the replayed decomposition account for
≥ X% of the observed variance? (X is user-defined; 80% is a reasonable default)
Output one of two results:
EXPLAINED: prior decomposition accounts for the variance →surface the breakdown with current values, flag it as a known pattern
UNEXPLAINED: prior decomposition does not account for the variance →flag for new manual investigation; do not attempt to auto-diagnose
The extension never auto-diagnoses unexplained variance. Its job is to quickly
rule in or rule out a known pattern, not to replace human analysis.
Analysis registry structure
Store each registered decomposition as a reusable entry:
analysis_registry:
- driver: churn_rate
alert_id: churn-2024-q2 # human-readable reference to original analysis
decomposition_dimensions:
- cohort_month
- acquisition_channel
- region
logic_ref: sql/churn_deep_dive.sql # or notebook path, named view, etc.
explanatory_threshold: 0.80 # flag as EXPLAINED if decomp accounts for ≥80% variance
created: 2024-07-15
created_by: [analyst name / team]
notes: >
Original spike was concentrated in month-3 cohorts from paid social,
APAC region. Logic isolates this cut; if pattern recurs it should show
up in the same dimensions.
- driver: arpu_mix_shift
alert_id: arpu-mix-2024-q3
decomposition_dimensions:
- segment
- product_tier
logic_ref: sql/arpu_mix_breakdown.sql
explanatory_threshold: 0.75
created: 2024-10-02
created_by: [analyst name / team]
Trigger and output format
ON alert_fired(driver: churn_rate):
IF driver IN analysis_registry:
replay(logic_ref)
compute explanatory_power = variance_explained / total_variance
IF explanatory_power >= explanatory_threshold:
OUTPUT:
status: EXPLAINED
pattern_ref: churn-2024-q2
current_breakdown: [replayed decomposition with current values]
variance_explained: 87%
note: "Prior pattern (month-3 cohorts, paid social, APAC) accounts
for 87% of current variance. Review breakdown before escalating."
ELSE:
OUTPUT:
status: UNEXPLAINED
pattern_ref: churn-2024-q2
variance_explained: 31%
note: "Prior decomposition explains only 31% of current variance.
This appears to be a new or different pattern. Manual
investigation recommended."
ELSE:
OUTPUT:
status: NO_PRIOR_ANALYSIS
note: "No registered decomposition for this driver. Manual
investigation required. Consider registering findings
for future lazy reuse."
Registry maintenance
Entries are added manually after a human analyst completes a deep-dive
Entries are not auto-generated by the monitoring system
An entry should be retired when the business context has changed enough
that the original decomposition dimensions are no longer meaningful
(e.g. a region was merged, a product tier was discontinued)
Retired entries are archived, not deleted — historical context has value
What this extension does not do
Run decompositions on a schedule
Auto-generate new decomposition logic
Propose explanations for UNEXPLAINED variance
Update or replace the registered decomposition logic automatically
What This Skill Does Not Do
Set assumptions — growth rates, market share targets, pricing decisions remain
human inputs. The skill may surface that an assumption appears inconsistent with
observed data, but it does not propose new assumption values.
Build the forecast model — this skill takes a model as input. Model construction
(driver selection, assumption rationale, scenario design) is outside scope.
Choose a BI tool — the skill is tool-agnostic. It produces the monitoring logic;
implementation in a specific tool follows that tool's own patterns.
Guarantee forecast accuracy — the monitoring system surfaces variance quickly and
at the right level. It does not improve the model's predictive quality.