name: superset-and-metrics-serving
description: Guides agents through Superset and metrics-serving workflows. Use when publishing governed metrics to Superset, defining semantic consistency for dashboards, or managing chart-ready analytical datasets.
Superset And Metrics Serving
Overview
Use this skill when Apache Superset or a similar BI serving surface is the final consumer layer. It helps agents keep chart-ready data aligned with governed metrics, prevent dashboard drift, and manage the boundary between analytical modeling and visualization safely.
When to Use
- publishing governed datasets into Superset for dashboard consumption
- aligning dashboard metrics with centralized semantic definitions
- preventing BI-layer drift from governed metric definitions
- managing access control and row-level security for dashboard datasets
- designing chart-ready schemas that optimize Superset query performance
- operating Superset as part of a broader data platform
Do not use this when the BI tool is self-service only with no governance expectations, or when metrics are exploratory without shared definitions.
Workflow
Define the serving dataset and metric contract. Include:
- which governed datasets or models should be exposed in Superset
- metric definitions: calculation logic, grain, filters, and time dimensions
- who owns each dataset and metric in Superset (matching upstream ownership)
- freshness expectation: how stale can the data be before dashboards mislead?
- access requirements: who can see which datasets and rows?
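A minimal sketch of how such a contract could be recorded and checked for completeness. The `MetricContract` class, its field names, and the example values (`analytics.fct_orders_daily`, `finance-data`, and so on) are hypothetical illustrations, not part of Superset or any semantic-layer tool.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class MetricContract:
    """Hypothetical contract for one governed metric exposed in Superset."""

    name: str
    dataset: str               # governed dataset or model the metric reads from
    expression: str            # calculation logic, written as SQL
    grain: str                 # e.g. "day" or "order"
    time_column: str           # time dimension used for filtering and charts
    owner: str                 # should match upstream ownership
    max_staleness: timedelta   # freshness expectation
    filters: tuple[str, ...] = ()
    allowed_roles: tuple[str, ...] = ()

    def problems(self) -> list[str]:
        """Return human-readable gaps; an empty list means the contract is complete."""
        issues = []
        for attr in ("name", "dataset", "expression", "grain", "time_column", "owner"):
            if not getattr(self, attr).strip():
                issues.append(f"{attr} is empty")
        if self.max_staleness <= timedelta(0):
            issues.append("max_staleness must be positive")
        if not self.allowed_roles:
            issues.append("no roles are allowed to see this metric")
        return issues


# Example values are invented for illustration.
revenue = MetricContract(
    name="net_revenue",
    dataset="analytics.fct_orders_daily",
    expression="SUM(net_amount)",
    grain="day",
    time_column="order_date",
    owner="finance-data",
    max_staleness=timedelta(hours=6),
    filters=("is_test_order = FALSE",),
    allowed_roles=("finance_analyst",),
)
print(revenue.problems())
```

Keeping the contract as data rather than prose makes it possible to validate it in CI before a dataset is registered in Superset.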
Design chart-ready schemas for query performance.
- pre-aggregate where possible — Superset queries should not scan raw tables
- define time columns explicitly with consistent timezone handling
- use materialized views or dedicated serving tables for complex metrics
- minimize joins in Superset SQL Lab — push join logic into the modeling layer
- index or partition underlying tables to support common filter patterns
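The pre-aggregation idea above can be sketched with an in-memory SQLite database. The table and column names (`raw_orders`, `serving_orders_daily`, `ordered_at_utc`) are assumptions made for the example; a real warehouse would more likely use a materialized view or a scheduled model, and its own partitioning features instead of a plain index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_orders (
    order_id INTEGER PRIMARY KEY,
    ordered_at_utc TEXT NOT NULL,   -- ISO-8601 timestamp, stored in UTC
    region TEXT NOT NULL,
    net_amount REAL NOT NULL
);
INSERT INTO raw_orders VALUES
    (1, '2024-03-01T09:15:00', 'emea', 120.0),
    (2, '2024-03-01T17:40:00', 'emea', 80.0),
    (3, '2024-03-01T23:05:00', 'amer', 50.0),
    (4, '2024-03-02T02:30:00', 'amer', 70.0);

-- Serving table at day x region grain, with an explicit UTC date column.
CREATE TABLE serving_orders_daily AS
SELECT
    date(ordered_at_utc) AS order_date_utc,
    region,
    COUNT(*) AS order_count,
    SUM(net_amount) AS net_revenue
FROM raw_orders
GROUP BY date(ordered_at_utc), region;

-- Index matching the filters dashboards are expected to apply most often.
CREATE INDEX idx_serving_orders_daily
    ON serving_orders_daily (order_date_utc, region);
""")

# Dashboards read the small serving table and never touch raw_orders.
rows = conn.execute(
    "SELECT order_date_utc, region, order_count, net_revenue "
    "FROM serving_orders_daily ORDER BY order_date_utc, region"
).fetchall()
for row in rows:
    print(row)
```

The serving table fixes the grain and the timezone once, so every chart built on it inherits the same choices.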
Keep dashboard metrics aligned with centralized definitions.
- metrics in Superset must match the source-of-truth definition (dbt metrics, semantic layer)
- avoid defining calculation logic directly in Superset that diverges from governed models
- use Superset's metric definition layer to reference pre-built aggregations
- when definitions change upstream, propagate changes to Superset metadata
- audit for drift: scheduled comparison between Superset metrics and source definitions
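A scheduled drift audit could look roughly like the following sketch. It assumes that both sides have already been exported to plain dictionaries of metric name to SQL expression; how those exports are produced (for example from a semantic-layer manifest or from Superset metadata) is outside the sketch, and `find_metric_drift` is a hypothetical helper. The comparison is textual, so it catches edited expressions but not two differently written expressions that mean the same thing.

```python
import re


def normalize_sql(expression: str) -> str:
    """Collapse whitespace and case so cosmetic differences are not reported.

    Note: lowercasing also affects string literals, which is acceptable for a
    rough audit but not for an exact comparison.
    """
    return re.sub(r"\s+", " ", expression).strip().lower()


def find_metric_drift(
    governed: dict[str, str], superset: dict[str, str]
) -> dict[str, list[str]]:
    """Compare governed metric expressions with those defined in Superset."""
    governed_norm = {name: normalize_sql(sql) for name, sql in governed.items()}
    superset_norm = {name: normalize_sql(sql) for name, sql in superset.items()}
    shared = governed_norm.keys() & superset_norm.keys()
    return {
        "changed": sorted(n for n in shared if governed_norm[n] != superset_norm[n]),
        "missing_in_superset": sorted(governed_norm.keys() - superset_norm.keys()),
        "ungoverned_in_superset": sorted(superset_norm.keys() - governed_norm.keys()),
    }


# Invented example data.
governed = {
    "net_revenue": "SUM(net_amount)",
    "order_count": "COUNT(*)",
    "aov": "SUM(net_amount) / COUNT(*)",
}
superset = {
    "net_revenue": "sum(net_amount)",
    "order_count": "COUNT(DISTINCT order_id)",
    "gross_margin": "SUM(margin)",
}
report = find_metric_drift(governed, superset)
print(report)
```

A non-empty `changed` or `ungoverned_in_superset` list is a reasonable trigger for review before the next dashboard refresh.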
Manage access, roles, and row-level security.
- define Superset roles that map to data classification levels
- implement row-level security (RLS) for multi-tenant or sensitive datasets
- do not rely solely on Superset access control — apply defense in depth at the data layer
- audit access patterns: who views what, and is it appropriate?
- document the access model so security reviews have a clear reference
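The access model can be documented as data. The sketch below assumes a simple mapping from role name to a SQL predicate, similar in spirit to the filter clauses that row-level security rules attach to roles, and denies access when no role matches. The role names, the `tenant_id` column, and `rls_predicate` are hypothetical; this is not Superset's API, and the same predicates should also be enforced at the data layer.

```python
# Hypothetical role-to-predicate mapping. Clauses are fixed strings reviewed in
# version control; they are never built from user input.
ROLE_CLAUSES = {
    "tenant_acme": "tenant_id = 'acme'",
    "tenant_globex": "tenant_id = 'globex'",
    "platform_admin": "1 = 1",
}


def rls_predicate(user_roles: list[str]) -> str:
    """Return the row filter for a user, denying all rows by default."""
    clauses = [
        ROLE_CLAUSES[role] for role in sorted(set(user_roles)) if role in ROLE_CLAUSES
    ]
    if not clauses:
        return "1 = 0"  # no recognized role: the user sees no rows
    return "(" + " OR ".join(clauses) + ")"


print(rls_predicate(["tenant_acme"]))
print(rls_predicate(["unknown_role"]))
```

Deny-by-default means a newly created role exposes nothing until someone adds an explicit, reviewable clause for it.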
Separate governed dashboards from exploratory assets.
- mark governed dashboards with certification or trust badges
- define lifecycle for exploratory dashboards: auto-archive after inactivity
- prevent exploratory queries from being mistaken for official metrics
- establish a promotion path: exploratory → reviewed → certified
- clean up orphaned charts and datasets regularly
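As an illustration of the auto-archive rule, the sketch below selects uncertified dashboards that have not been viewed within an inactivity window. The `DashboardRecord` shape and the 90-day default are assumptions; the records would need to come from whatever usage data the deployment actually collects.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DashboardRecord:
    """Hypothetical summary of one dashboard's governance state and usage."""

    title: str
    certified: bool
    last_viewed: datetime


def archive_candidates(
    dashboards: list[DashboardRecord],
    now: datetime,
    max_idle: timedelta = timedelta(days=90),  # assumed inactivity window
) -> list[str]:
    """Return titles of exploratory dashboards idle for longer than max_idle."""
    return [
        d.title
        for d in dashboards
        if not d.certified and now - d.last_viewed > max_idle
    ]


utc = timezone.utc
now = datetime(2024, 6, 1, tzinfo=utc)
dashboards = [
    DashboardRecord("Revenue overview", True, datetime(2023, 1, 1, tzinfo=utc)),
    DashboardRecord("Ad hoc churn test", False, datetime(2024, 1, 15, tzinfo=utc)),
    DashboardRecord("Pricing experiment", False, datetime(2024, 5, 20, tzinfo=utc)),
]
print(archive_candidates(dashboards, now))
```

Certified dashboards are excluded regardless of usage, so retiring them stays a deliberate decision by the owner.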
Plan operations, monitoring, and incident response.
- monitor Superset query performance and slow dashboard load times
- alert when upstream data freshness falls behind dashboard expectations
- define the incident response when dashboards show wrong numbers
- plan Superset upgrades and database connection changes
- back up dashboard definitions and metadata for disaster recovery
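The freshness alert can be sketched as a comparison between each dataset's last successful load and its agreed staleness limit. The function name `stale_datasets` and the dataset names are hypothetical, and the load timestamps are assumed to come from the pipeline's own run metadata.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stale_datasets(
    last_loaded: dict[str, datetime],
    max_staleness: dict[str, timedelta],
    now: datetime,
) -> dict[str, Optional[timedelta]]:
    """Return datasets that breach their freshness limit.

    The value is the current lag, or None when the dataset has never loaded.
    """
    breaches: dict[str, Optional[timedelta]] = {}
    for dataset, limit in max_staleness.items():
        loaded_at = last_loaded.get(dataset)
        if loaded_at is None:
            breaches[dataset] = None
            continue
        lag = now - loaded_at
        if lag > limit:
            breaches[dataset] = lag
    return breaches


utc = timezone.utc
now = datetime(2024, 6, 1, 12, 0, tzinfo=utc)
last_loaded = {
    "serving_orders_daily": datetime(2024, 6, 1, 9, 0, tzinfo=utc),
    "serving_churn_weekly": datetime(2024, 5, 30, 12, 0, tzinfo=utc),
}
max_staleness = {
    "serving_orders_daily": timedelta(hours=6),
    "serving_churn_weekly": timedelta(hours=24),
    "serving_refunds_daily": timedelta(hours=6),
}
breaches = stale_datasets(last_loaded, max_staleness, now)
print(breaches)
```

Treating a dataset with no recorded load as a breach avoids silently skipping tables that were never wired into monitoring.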
Common Rationalizations
| Rationalization | Reality |
|---|---|
| "Superset is just a visualization tool, so governance doesn't apply." | Superset is where stakeholders consume metrics, so a wrong number on an ungoverned dashboard reaches decision-makers directly and spreads quickly. |
| "Users can define their own metrics in SQL Lab." | Self-defined metrics without governance create conflicting numbers across teams. Governed defaults should exist before self-service. |
| "We don't need row-level security — everyone should see everything." | Access requirements change as data sensitivity increases. Building RLS later is much harder than designing it from the start. |
| "Dashboard performance is the BI team's problem." | Slow dashboards are usually caused by missing pre-aggregation or bad schema design upstream. Performance is a pipeline concern, not just a BI concern. |
Red Flags
- metrics defined directly in Superset that contradict the governed semantic layer
- dashboards query raw tables with expensive joins instead of pre-aggregated serving tables
- no certification or trust badge to distinguish governed from exploratory dashboards
- row-level security is not implemented despite multi-tenant or sensitive data
- upstream freshness is not monitored — dashboards show stale data without warning
- orphaned dashboards and datasets accumulate without cleanup
- Superset metadata and definitions have no backup or version control
- metric definitions in Superset are never audited against source-of-truth models
Verification
- Serving datasets and metric definitions are documented with ownership and freshness SLAs
- Chart-ready schemas are designed for query performance (pre-aggregation, indexing)
- Superset metrics align with centralized governed definitions and are audited for drift
- Access control and row-level security are implemented and documented
- Governed dashboards are certified and separated from exploratory assets
- Operations monitoring covers query performance and upstream data freshness
- Dashboard metadata is backed up and recoverable