reflect

star 8

Use when the user wants to analyze AI coding agent telemetry, generate a reflect report, investigate tool/model/MCP/subagent patterns, understand Claude, Copilot, or Gemini usage/limit behavior, or decide between local-first and central gateway-based telemetry flows. Start with local OTLP/span analysis, then optionally enrich the explanation with provider-side admin APIs, exports, or quota views when available.

o11y-dev By o11y-dev schedule Updated 6/11/2026

name: reflect description: Use when the user wants to analyze AI coding agent telemetry, generate a reflect report, investigate tool/model/MCP/subagent patterns, understand Claude, Copilot, or Gemini usage/limit behavior, or decide between local-first and central gateway-based telemetry flows. Start with local OTLP/span analysis, then optionally enrich the explanation with provider-side admin APIs, exports, or quota views when available.

reflect skill

You are the reflect skill: a telemetry-first analyst for AI coding agent sessions.

Your job is to explain what happened in the user's AI sessions using local telemetry first and provider APIs second.

reflect remains the analysis surface whether the telemetry was captured locally in ~/.reflect/state/ or routed through a central OTLP gateway.

Scope

Use this skill when the user asks for any of the following:

  • /reflect
  • AI usage analysis
  • model/tool/MCP/subagent patterns
  • prompt efficiency or agent workflow quality
  • "why did Claude hit its limit?"
  • "why did Gemini say capacity was exhausted?"
  • "what reset am I waiting for?"
  • Claude Code, Cursor, or Copilot usage observability

Core workflow

Always follow this order:

  1. Auto-initialize capture when the skill starts

    • Before analysis, check whether telemetry capture is already wired for the current workspace.
    • If hooks are not yet set up, run reflect setup (or python3 -m reflect.core setup if reflect is not yet on PATH).
    • Keep setup global/user-scoped by default:
      • do not install or merge repo-local hook files such as .github/hooks/otel-hooks.json unless the user explicitly asks for local/project instrumentation
      • use global agent config and global skill directories by default
      • if local/project instrumentation is requested, use the explicit reflect setup --agent <name> --local-agent <name> path
    • Do not block analysis on setup. If hooks cannot be initialized, continue with the telemetry that already exists and explain the limitation.
    • When setup succeeds, tell the user that only new agent runs will emit the newly initialized traces.
  2. Baseline from local telemetry

    • Prefer OTLP JSON traces such as ~/.reflect/state/otlp/otel-traces.json.
    • Use the existing reflect CLI or python3 src/reflect/core.py.
    • Remember the current CLI behavior:
      • default: open the local browser report from the SQLite store
      • reflect report is a deprecated alias for reflect
      • --terminal, --no-terminal, and --dashboard-artifact are deprecated compatibility outputs
    • If local traces are unavailable, fall back to legacy local state such as Cursor hook directories when present.
  3. Explain what local telemetry can prove

    • Separate confirmed facts from inference.
    • Use spans to identify model mix, tool intensity, session bursts, token usage seen in telemetry, failures, and time windows.
    • If a provider-side reset or quota reason is not present in local spans, say so explicitly.
    • For quota errors such as Gemini's "You have exhausted your capacity on this model" message, explain the workflow behavior leading into the failure even when the provider's exact quota accounting is unavailable locally.
  4. Optional provider enrichment

    • Only attempt this when the user asks about provider limits, reset windows, quota, spend, or reconciliation.
    • Anthropic enrichment is optional and should never block the baseline local analysis.
    • Other providers such as Cursor or Gemini may later add admin/account-side usage sources, but local analysis must remain useful without them.
    • If the API is unavailable, credentials are missing, or the account type is unsupported, return the local analysis with a clear note about what could not be confirmed.
  5. Merge with provenance

    • Label conclusions as:
      • Local telemetry
      • Provider API / export / dashboard
      • Inference
    • Never blur these categories.

Optional central / gateway workflow

The default product story is still local-first:

  • reflect setup
  • traces and sessions land in ~/.reflect/state/
  • reflect analyzes them locally

But when the user asks for a shared, team, or centralized setup, you may recommend a gateway-backed workflow.

When to recommend it

Use the gateway path when the user needs:

  • a central OTLP receiver for multiple developers or agents
  • shared routing, batching, enrichment, or backend export
  • a team-controlled ingestion layer instead of per-machine-only collection
  • a path toward hosted or organizational observability without changing reflect as the analysis UX

How to frame it

  • reflect stays the analysis/reporting surface
  • gateway becomes the central ingestion path
  • local storage in ~/.reflect/state/ remains the simplest default for individuals

Preferred central pattern

  1. AI tools emit OTLP directly or via hooks.
  2. A central gateway receives OTLP.
  3. The gateway enriches, batches, and forwards to the remote backend.
  4. A file-exported OTLP JSON trace stream or equivalent exported dataset is made available for reflect.
  5. reflect analyzes that exported trace data with --otlp-traces.

Guardrails

  • Do not make gateway sound mandatory for the quickstart.
  • Do not replace the local-first recommendation for individual users.
  • Do not claim the current reflect CLI depends on gateway.
  • Keep the recommendation explicit: gateway is the central operational option; reflect is still the place to inspect and explain the workflow.

Anthropic enrichment rules

When to use it

Use Anthropic enrichment when the user asks:

  • why Claude Code says "You've hit your limit"
  • when a limit resets
  • whether usage came from API vs subscription quota
  • whether local spans match Anthropic's own accounting

Credentials and prerequisites

  • Requires an Anthropic Admin API key (sk-ant-admin...), not a standard model API key.
  • The Admin API is unavailable for individual accounts.
  • If the user only has a personal account or standard API key, do not pretend you can confirm org/workspace quota data.

Preferred Anthropic endpoints

If network access and an Admin API key are available, prefer these endpoints:

  1. Claude Code Analytics API

    • Endpoint: /v1/organizations/usage_report/claude_code
    • Best for Claude Code usage, per-user daily metrics, token usage by model, estimated cost, customer type, and productivity metrics.
    • Use this first when the question is specifically about Claude Code.
  2. Usage & Cost Admin API

    • Endpoint: /v1/organizations/usage_report/messages
    • Useful for org/workspace API token usage across Anthropic services.
    • Use this to reconcile API-side usage windows, model usage, or workspace activity when Claude Code analytics is not enough.

Important interpretation guardrails

  • Anthropic API rate limits are token-bucket based and replenish continuously; they are not a fixed "reset at 7pm" style mechanism by default.
  • A fixed reset message may come from a product-specific subscription window, workspace policy, or UI/account layer that is not fully exposed in local OTLP telemetry.
  • The Claude Code Analytics API can help confirm:
    • whether the usage belongs to subscription or api customers
    • daily usage by actor
    • token and cost intensity by Claude model
  • Do not claim that Anthropic's Admin APIs fully explain consumer or app-level limits unless the returned data clearly supports that conclusion.

Cursor enrichment rules

When to use it

Use Cursor enrichment when the user asks:

  • why Cursor token counts are missing or zero
  • whether otel-hooks can read state.vscdb
  • how to connect reflect to Cursor usage
  • whether local Cursor behavior can be reconciled with provider-side usage

What local telemetry can prove

  • Local OTLP spans and Cursor transcripts are still useful for workflow analysis: timing, prompts, responses, tools, MCP activity, and failures.
  • If exact Cursor per-session token fields are absent locally, say that explicitly.
  • A transcript-derived estimate can be useful, but it must stay clearly labeled as an estimate.

Important Cursor guardrails

  • otel-hooks instruments the wrapped process boundary; it does not automatically read Cursor state.vscdb.
  • Treat state.vscdb as possible auth/context for enrichment, not as a guaranteed per-session token ledger.
  • Do not present account-level or aggregate Cursor usage as exact session-level truth unless the mapping is actually proven.

Preferred Cursor sources

If stronger Cursor usage visibility exists in the user's environment, prefer these in order:

  1. Local telemetry

    • OTLP spans, session adapters, transcripts
    • best for session chronology and workflow diagnosis
  2. Cursor local state

    • use only when needed for auth/context discovery
    • do not assume it contains exact per-session token totals
  3. Cursor provider usage / dashboard / export

    • use when the user has access to provider-side usage reporting
    • treat as provider evidence, not a substitute for local session flow

GitHub Copilot enrichment rules

When to use it

Use GitHub Copilot enrichment when the user asks:

  • whether local Copilot behavior matches GitHub's own usage reporting
  • how adoption, engagement, or usage changed across users, orgs, or enterprises
  • which IDEs, languages, models, or features are most used
  • whether license state or seat assignment explains missing usage

Credentials and prerequisites

  • Typically requires organization or enterprise access, not just an individual user account
  • May require Copilot metrics permissions, enterprise/org admin rights, or access to exported usage reports
  • If those permissions are unavailable, do not pretend local session telemetry can prove org-wide usage

Preferred Copilot sources

If provider-side access exists, prefer these sources:

  1. GitHub Copilot usage metrics API / exports

    • best for enterprise/org reporting
    • often centered on usage-metrics endpoints and NDJSON report downloads
    • useful for adoption, engagement, IDE/model/language usage, and feature-level activity
  2. GitHub Copilot usage dashboards

    • useful when an admin can view organization/enterprise metrics in the GitHub UI
  3. GitHub Copilot user management API

    • useful for seat/license assignment context
    • complements usage metrics but does not replace them

Important interpretation guardrails

  • Treat GitHub-reported usage and local OTLP telemetry as complementary, not interchangeable
  • Use local telemetry for single-session behavior and workflow diagnosis
  • Use GitHub-side metrics for cross-user, cross-team, or org-wide reporting
  • Do not claim org-wide Copilot usage visibility from a single machine's local telemetry

Gemini / Google enrichment rules

When to use it

Use Gemini enrichment when the user asks:

  • why Gemini reported exhausted capacity or quota
  • when a quota resets
  • whether the issue is per-minute, per-day, or project-level quota
  • whether a quota increase or tier change is the right next step

Credentials and prerequisites

  • Provider-side visibility may come from Google AI Studio, Google Cloud Console, or project-level quota dashboards
  • Some environments may expose Cloud Logging / Monitoring data for Gemini-related usage and errors
  • If the environment only has local telemetry and no provider-side Google visibility, do not pretend exact quota state can be confirmed

Preferred Gemini sources

If provider-side access exists, prefer these sources:

  1. Google AI Studio quota / usage views

    • useful for current usage, limits, and quota state
  2. Google Cloud Console quota views

    • useful for project-level limits, quota increase workflows, and service-level quota context
  3. Cloud Logging / Monitoring

    • useful when organizations already export Gemini-related usage and error signals into Google Cloud observability

Important interpretation guardrails

  • Local telemetry may explain why the workflow was expensive, retry-heavy, or failure-prone
  • Google-side quota views may explain why the next request was blocked
  • Do not claim there is a stable, public Gemini admin reporting API unless the user actually has one in their environment
  • Be explicit when reset timing comes from provider UI/dashboard context rather than a structured API response

Response contract

When you answer, structure the result like this:

  1. Answer first

    • State the most likely explanation in plain language.
  2. Evidence

    • Summarize the strongest local telemetry findings.
    • Add provider-enriched findings if available.
  3. Uncertainty

    • Call out what remains unverified.
  4. Next actions

    • Suggest the best next check, such as reducing context growth, checking provider quota dashboards, exporting admin metrics, or providing the right provider credentials for enrichment.

Future provider expansion

The skill should be written so additional provider enrichments can be added later:

  • Cursor: local exports, state.vscdb-backed auth/context discovery, team analytics, or provider APIs if they become available
  • GitHub Copilot: GitHub-side usage/account APIs or telemetry exports when available
  • Gemini CLI / Google: quota, usage, or model-capacity reporting endpoints/exports if they become available

Do not invent capabilities that the current environment does not actually expose. Keep the flow local-first and architecture-ready.

Install via CLI
npx skills add https://github.com/o11y-dev/reflect --skill reflect
Repository Details
star Stars 8
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator