u0217-oversight-regression-sentinel - SKILL.md Agent Skill

name: u0217-oversight-regression-sentinel description: Build and operate the "Oversight Regression Sentinel" capability for Human Oversight and Operator UX. Trigger when this exact capability is needed in mission execution.

Oversight Regression Sentinel

Why This Skill Exists

We need this skill because human teams need fast, legible control when stakes are high. This specific skill prevents unnoticed quality drift after updates.

When To Use

Use this skill when the request explicitly needs "Oversight Regression Sentinel" outcomes in the Human Oversight and Operator UX domain.

Step-by-Step Implementation Guide

Define the scope and success metrics for Oversight Regression Sentinel, including at least three measurable KPIs tied to slow interventions and approval bottlenecks.
Design and version the input/output contract for approval queues, operator workload, and intervention history, then add schema validation and failure-mode handling.
Implement the core capability using baseline-delta detection, and produce regression watchlists with deterministic scoring.
Integrate the skill into swarm orchestration: task routing, approval gates, retry strategy, and rollback controls.
Add unit, integration, and simulation tests that explicitly cover slow interventions and approval bottlenecks, then run regression baselines.
Deploy behind a feature flag, monitor telemetry/alerts for two release cycles, and iterate thresholds based on observed outcomes.

Required Deliverables

Capability contract: input schema, deterministic scoring, output schema, and failure modes.
Runtime profile: detection-guard using baseline-delta detection to produce regression watchlists.
Orchestration integration: human-oversight-and-operator-ux:detection-guard routing, approval gates, retries, and rollback controls.
Validation evidence: unit, integration, simulation, regression-baseline suites and rollout telemetry.

Operational Runbook

Preflight

Confirm the Oversight Regression Sentinel request scope, source evidence, and measurable success criteria before execution.
Verify feature flag skill_0217_oversight-regression-sentinel, approval gates, and rollback owner before autonomous use.

Execution

Execute baseline-delta detection with deterministic scoring and reproducible trace capture.
Produce regression watchlists plus scorecard, assumptions, and unresolved-risk notes.

Recovery

Fail closed when required signals, evidence, or approval gates are missing.
Rollback to the last stable baseline when posture is critical or validation fails.

Handoff

Publish regression watchlists, validation evidence, and telemetry links to downstream owners.
Queue follow-up tasks for unresolved risks, threshold tuning, or approval review.

Guardrails

[quality] Require deterministic scoring and validation evidence before promotion.
[reliability] Preserve retries, rollback controls, and failure-mode evidence for every run.
[safety] Route critical posture or missing approval gates to human review before autonomous action.