name: compliance-drift-evals description: Set up compliance exports, drift detection, evaluations, scoring, and learning analytics license: MIT metadata: author: ucsandman version: "1.0.0" category: analytics
Compliance, Drift, Evaluations & Learning
DashClaw's analytical capabilities for governance evidence, behavioral monitoring, and agent quality tracking.
Compliance Exports
Generate audit-ready evidence bundles for regulatory frameworks.
Supported Frameworks
| Framework | ID | Description |
|---|---|---|
| SOC 2 | soc2 |
Service Organization Control |
| NIST AI RMF | nist-ai-rmf |
AI Risk Management Framework |
| EU AI Act | eu-ai-act |
European AI regulation |
| ISO 42001 | iso42001 |
AI Management System |
Create an Export
// V1 SDK
const exp = await claw.createComplianceExport({
name: 'Q1 2026 SOC 2 Audit',
frameworks: ['soc2'],
format: 'json', // or 'md'
window_days: 90,
include_evidence: true,
include_remediation: true,
include_trends: true
});
# API
curl -X POST "$DASHCLAW_BASE_URL/api/compliance/exports" \
-H "x-api-key: $DASHCLAW_API_KEY" \
-H "Content-Type: application/json" \
-d '{"name":"Q1 Audit","frameworks":["soc2"],"window_days":90}'
Scheduled Exports
await claw.createComplianceSchedule({
name: 'Weekly SOC 2',
frameworks: ['soc2'],
cron_expression: '0 9 * * 1', // Every Monday at 9am
window_days: 7,
include_evidence: true
});
Gap Analysis
const gaps = await claw.analyzeGaps('soc2');
// Returns: missing controls, partial coverage, recommendations
Coverage Trends
const trends = await claw.getComplianceTrends({ framework: 'soc2', limit: 12 });
// Monthly coverage scores over time
Drift Detection
Statistical behavioral drift detection using z-scores. Pure math — no LLM required.
6 Tracked Metrics
| Metric | What It Measures |
|---|---|
risk_score |
Are actions getting riskier? |
confidence |
Is agent confidence dropping? |
duration_ms |
Are actions taking longer? |
cost_estimate |
Are costs increasing? |
tokens_total |
Is token usage growing? |
learning_score |
Is the agent learning? |
Severity Thresholds
| z-score | Severity | Meaning |
|---|---|---|
| ≥ 1.5 | info | Notable deviation |
| ≥ 2.0 | warning | Significant drift |
| ≥ 3.0 | critical | Severe anomaly |
Compute Baselines
// Establish baseline from last 30 days
await claw.computeDriftBaselines({
agent_id: 'my-agent',
lookback_days: 30
});
Detect Drift
const drift = await claw.detectDrift({
agent_id: 'my-agent',
window_days: 7
});
// drift.alerts: [{ metric, z_score, severity, current_value, baseline_mean }]
Acknowledge Alerts
await claw.acknowledgeDriftAlert(alertId);
Monitor Drift Stats
const stats = await claw.getDriftStats({ agent_id: 'my-agent' });
// { total_alerts, unacknowledged, by_severity, by_metric }
Evaluations
Score agent outputs using 5 built-in scorer types.
Scorer Types
| Type | LLM Required | Description |
|---|---|---|
regex |
No | Pattern matching against output |
contains |
No | Keyword/phrase detection |
numeric_range |
No | Value within expected range |
custom_function |
No | Arbitrary JavaScript logic |
llm_judge |
Yes (optional) | LLM-based quality assessment |
Create a Scorer
// Regex scorer — check for PII
await claw.createScorer({
name: 'no-pii-in-output',
scorerType: 'regex',
config: {
pattern: '\\b\\d{3}-\\d{2}-\\d{4}\\b', // SSN pattern
invert: true // Score 1 if NOT found (good)
},
description: 'Ensures no SSN patterns in output'
});
// Numeric range scorer
await claw.createScorer({
name: 'response-time-check',
scorerType: 'numeric_range',
config: {
field: 'duration_ms',
min: 0,
max: 5000
}
});
Score an Action
await claw.createScore({
actionId: 'ar_abc123',
scorerName: 'no-pii-in-output',
score: 1.0, // 0-1 scale
label: 'pass',
reasoning: 'No PII patterns detected'
});
Batch Evaluation Run
const run = await claw.createEvalRun({
name: 'Weekly quality check',
scorerId: 'sc_abc123',
actionFilters: { days: 7 }
});
// Scores all matching actions from the last 7 days
Scoring Profiles
Multi-dimensional risk and quality scoring with auto-calibration.
Create a Profile
await claw.createScoringProfile({
name: 'deploy-quality',
description: 'Quality scoring for deployment actions',
composite_method: 'weighted_average', // or: minimum, geometric_mean
dimensions: [
{
name: 'risk',
weight: 0.4,
source: 'risk_score',
scale: [
{ min: 0, max: 40, label: 'low', score: 1.0 },
{ min: 40, max: 70, label: 'medium', score: 0.6 },
{ min: 70, max: 100, label: 'high', score: 0.2 }
]
},
{
name: 'speed',
weight: 0.3,
source: 'duration_ms',
scale: [
{ min: 0, max: 5000, label: 'fast', score: 1.0 },
{ min: 5000, max: 30000, label: 'normal', score: 0.7 },
{ min: 30000, max: null, label: 'slow', score: 0.3 }
]
},
{
name: 'cost',
weight: 0.3,
source: 'cost_estimate',
scale: [
{ min: 0, max: 1, label: 'cheap', score: 1.0 },
{ min: 1, max: 10, label: 'moderate', score: 0.6 },
{ min: 10, max: null, label: 'expensive', score: 0.2 }
]
}
]
});
Auto-Calibration
const suggestions = await claw.autoCalibrate({
lookback_days: 30
});
// Returns percentile-based scale suggestions from historical data
Risk Templates
Replace hardcoded risk scores with rule-based computation:
await claw.createRiskTemplate({
name: 'deploy-risk',
base_risk: 50,
rules: [
{ field: 'systems_touched', operator: 'contains', value: 'production', add: 30 },
{ field: 'reversible', operator: '==', value: false, add: 20 },
{ field: 'metadata.has_rollback', operator: '==', value: true, add: -15 }
]
});
Learning Analytics
Track agent improvement over time. DashClaw's unique moat.
Maturity Model
| Level | Episodes | Success Rate | Avg Score |
|---|---|---|---|
| Novice | 0+ | any | any |
| Developing | 10+ | 40%+ | 40+ |
| Competent | 50+ | 60%+ | 55+ |
| Proficient | 150+ | 75%+ | 65+ |
| Expert | 500+ | 85%+ | 75+ |
| Master | 1000+ | 92%+ | 85+ |
Compute Learning Velocity
const velocity = await claw.computeLearningVelocity({
agent_id: 'my-agent',
lookback_days: 90,
period: 'weekly'
});
// Linear regression slope of performance over time
Learning Curves
const curves = await claw.computeLearningCurves({
agent_id: 'my-agent',
lookback_days: 180
});
// Per-action-type learning curves showing improvement trajectory
Analytics Summary
const summary = await claw.getLearningAnalyticsSummary({
agent_id: 'my-agent'
});
// { maturity_level, velocity, total_episodes, success_rate, avg_score }