log-metrics

name: log-metrics description: Query and manage agent metrics in Supabase pm_token_metrics table. CSV is append-only backup.

Log Metrics Skill

Agent metrics are auto-captured by the SubagentStop hook to Supabase pm_token_metrics (primary) and tokens.csv (backup). Manual logging is rarely needed.

How Metrics Flow

SubagentStop hook fires
    │
    ├─ Reads .claude/.current-task (task_id, agent_type, sprint_id)
    ├─ Writes to Supabase via pm_log_agent_metrics RPC (PRIMARY)
    ├─ Writes to tokens.csv (BACKUP — append-only, never queried)
    └─ On Supabase failure: saves to ~/.claude/metrics/failed-payloads.jsonl

PM writes .claude/.current-task before invoking each agent (Step 5 of agent-handoff):

{"task_id": "TASK-2226", "agent_type": "engineer", "sprint_id": "<sprint-uuid>"}

CRITICAL: sprint_id must be the UUID from pm_sprints.id, NOT the sprint name (e.g., "SPRINT-139"). Using the name causes NULL sprint_id in pm_token_metrics. Incident ref: SPRINT-T.

This eliminates the manual --label step — the hook writes task context at insert time.

Querying Metrics (via MCP)

All metric queries use SQL via mcp__supabase__execute_sql:

Task totals (replaces `sum_effort.py`)

SELECT
  task_id,
  SUM(input_tokens) AS input_tokens,
  SUM(output_tokens) AS output_tokens,
  SUM(cache_read_tokens) AS cache_read_tokens,
  SUM(cache_creation_tokens) AS cache_creation_tokens,
  SUM(total_tokens) AS total_tokens,
  SUM(billable_tokens) AS billable_tokens,
  SUM(api_calls) AS api_calls,
  SUM(duration_ms) / 1000 AS duration_secs,
  COUNT(DISTINCT agent_id) AS agent_sessions,
  COUNT(*) AS entries
FROM pm_token_metrics
WHERE task_id = 'TASK-XXXX'
GROUP BY task_id;

Filter by agent type, date, session (replaces `query_metrics.py`)

-- By agent type
SELECT * FROM pm_token_metrics WHERE agent_type = 'engineer' ORDER BY recorded_at DESC;

-- By date range
SELECT * FROM pm_token_metrics WHERE recorded_at >= '2026-03-01' ORDER BY recorded_at;

-- By session
SELECT * FROM pm_token_metrics WHERE session_id = '<uuid>' ORDER BY recorded_at;

-- Combined
SELECT * FROM pm_token_metrics
WHERE task_id = 'TASK-XXXX' AND recorded_at >= '2026-03-01'
ORDER BY recorded_at;

Sprint-level aggregation

SELECT
  sprint_id,
  SUM(total_tokens) AS total_tokens,
  SUM(billable_tokens) AS billable_tokens,
  COUNT(DISTINCT task_id) AS tasks,
  COUNT(DISTINCT agent_id) AS agents
FROM pm_token_metrics
WHERE sprint_id = '<sprint-uuid>'
GROUP BY sprint_id;

Labeling Unlabeled Rows

If the hook fired without .current-task context, use the labeling RPC:

SELECT pm_label_agent_metrics('<agent_id>', 'TASK-XXXX', 'engineer', 'Implementation');

This finds the most recent unlabeled row for that agent_id and sets task_id, agent_type, description, plus resolves task_uuid, backlog_item_id, sprint_id FKs.

Recording Task Totals (Step 14)

PM calls this at Step 14 — it auto-sums from metric rows:

-- Auto-compute from pm_token_metrics (preferred — no manual total needed)
SELECT pm_record_task_tokens('<task_uuid>');

-- Or with explicit total (backward compatible)
SELECT pm_record_task_tokens('<task_uuid>', 150000);

Reconciliation (Step 14 pre-flight)

Before recording task totals, verify all agents logged:

SELECT agent_id, agent_type, total_tokens, task_id
FROM pm_token_metrics
WHERE task_id = 'TASK-XXXX'
ORDER BY recorded_at;

Compare agent_id count against handoff chain. If any are missing, check ~/.claude/metrics/failed-payloads.jsonl for failed pushes.

Gap Detection (Sprint Close)

SELECT t.legacy_id AS task_id, COUNT(m.id) AS metric_rows
FROM pm_tasks t
LEFT JOIN pm_token_metrics m ON m.task_id = t.legacy_id
WHERE t.sprint_id = '<sprint-uuid>' AND t.deleted_at IS NULL
GROUP BY t.legacy_id
HAVING COUNT(m.id) = 0;

Any results indicate tasks that completed with zero metric records.

WARNING: This query joins on legacy_id. If legacy_id is NULL on pm_tasks entries, those tasks silently show zero metric rows (false positive). Ensure all tasks have legacy_id set: UPDATE pm_backlog_items SET legacy_id = 'TASK-' || item_number WHERE sprint_id = '<sprint-uuid>' AND legacy_id IS NULL; Incident ref: SPRINT-T — all tasks had NULL legacy_id, metrics invisible in admin portal.

Deprecated: Python Scripts

The following scripts remain on disk for historical CSV queries but are not used in the current workflow:

Script	Replaced By
`log_metrics.py`	Auto-capture via SubagentStop hook
`log_metrics.py --label`	`pm_label_agent_metrics` RPC
`query_metrics.py`	SQL via MCP
`sum_effort.py`	SQL via MCP / `pm_record_task_tokens` auto-sum

Billable Tokens Formula

Standardized: billable = input + output + cache_create

This is a generated column in Supabase (billable_tokens), computed automatically. Cache reads are excluded (much cheaper, tracked separately as cache_read_tokens).

Storage

Store	Role	Location
Supabase	Primary (source of truth)	`pm_token_metrics` table
CSV	Backup (append-only, never queried)	`.claude/metrics/tokens.csv`
Failed payloads	Replay queue	`~/.claude/metrics/failed-payloads.jsonl`