name: "ops-eng-o2r-grafana"
description: "Use when working with the Tempo + Grafana backend for otel-a2a-relay (o2r). Names the specific Grafana surface for each question (overview vs LUCA-flow vs Tempo Explore vs service graph), with direct dashboard URLs, TraceQL snippets, and the dockerized stack lifecycle. Triggers - grafana, tempo, o2r grafana, o2r tempo, o2r-overview, luca-flow dashboard, traceql, service graph, tempo explore, span metrics, o2r-tempo-harness, tempo-up."
Tempo + Grafana backend for o2r
Local stack runs at http://localhost:3000 (Grafana). Tempo accepts OTLP/HTTP on :4318 and serves the query API on :3200. Prometheus is on :9090 for span-metrics and service-graph.
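A quick way to confirm the three query-side ports answer is to probe each component's readiness endpoint. The sketch below is a minimal check, not part of the o2r tooling: the function names are hypothetical, and the paths (/api/health for Grafana, /ready for Tempo, /-/ready for Prometheus) are the standard upstream ones, assumed not to be remapped in this stack. The OTLP port :4318 is left out because it only accepts POSTed spans.

```python
import urllib.error
import urllib.request


def stack_endpoints(host: str = "localhost") -> dict[str, str]:
    """Map each component to a readiness URL, using the ports listed above."""
    return {
        "grafana": f"http://{host}:3000/api/health",   # Grafana health check
        "tempo": f"http://{host}:3200/ready",          # Tempo query API port
        "prometheus": f"http://{host}:9090/-/ready",   # Prometheus readiness
    }


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False
```

Calling is_up(url) for each entry of stack_endpoints() gives a per-component up/down answer after make tempo-up.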
Full reference - dashboards, TraceQL cheatsheet, endpoint map, config knobs, known quirks - lives in tempo_grafana/README.md. This skill is the operator's quick index.
Which Grafana surface answers which question
Default to the provisioned dashboards. Tempo Explore is for ad-hoc TraceQL.
- "Overall health" - o2r-overview dashboard (Grafana home): http://localhost:3000/d/o2r-overview/o2r-overview. Topology, span rate, p95 latency, error rate, recent error traces.
- "One session's flow" - LUCA-flow dashboard: http://localhost:3000/d/luca-flow/luca-flow. session_id variable at the top; waterfall, per-step latency, acceptance tally.
- "Visual waterfall for one trace" - click any trace row, or paste a trace ID in Tempo Explore; Grafana renders the timeline on the trace detail panel.
- "Agent topology" - service-graph node view in Tempo Explore (queryType=serviceMap).
- "Write TraceQL" - Tempo Explore: http://localhost:3000/explore. Cheatsheet panel lives in o2r-overview.
- "Write PromQL on span metrics" - Prometheus UI at http://localhost:9090, or Grafana Explore with the Prometheus datasource.
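For ad-hoc TraceQL outside the Grafana UI, the same queries can go straight to Tempo's search endpoint on the query API port. The sketch below only builds the request URL. The helper name is hypothetical, and the session_id span attribute and its value are assumptions for illustration; substitute whatever attribute names o2r actually emits.

```python
from urllib.parse import urlencode

TEMPO_QUERY_API = "http://localhost:3200"  # query API port from this document


def traceql_search_url(query: str, limit: int = 20,
                       base: str = TEMPO_QUERY_API) -> str:
    """Build a Tempo /api/search URL carrying a TraceQL query."""
    return f"{base}/api/search?{urlencode({'q': query, 'limit': limit})}"


# Example TraceQL queries.
ERROR_SPANS = "{ status = error }"        # spans whose status is error
SLOW_SPANS = "{ duration > 500ms }"       # spans slower than 500 ms
# Hypothetical attribute name and value, shown only for the query shape.
ONE_SESSION = '{ span.session_id = "demo-session" }'
```

The same query strings paste unchanged into the TraceQL editor in Tempo Explore.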
Operate
From the otel-a2a-relay/ workspace root:
make tempo-up # docker compose stack
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 make luca-demo
open http://localhost:3000 # lands on o2r-overview
make tempo-down # stop, preserve volumes
make tempo-clean # stop + wipe volumes
o2r-tempo-harness posts the worked-example trace, waits for Tempo to index it, and prints a Grafana Explore deep-link - use it to confirm the stack came up clean.
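The "wait for Tempo to index" step can be pictured as a poll against Tempo's trace-by-ID endpoint. This is a sketch of the idea only, not the harness's actual implementation: the function names, attempt count, and delay are assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def tempo_has_trace(trace_id: str, base: str = "http://localhost:3200") -> bool:
    """True once Tempo's query API returns the trace with HTTP 200."""
    url = f"{base}/api/traces/{trace_id}"
    try:
        with urllib.request.urlopen(url, timeout=2.0) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # 404 while the trace is not yet queryable, or stack unreachable.
        return False


def wait_for_trace(
    trace_id: str,
    lookup: Callable[[str], bool] = tempo_has_trace,
    attempts: int = 30,
    delay_s: float = 1.0,
) -> bool:
    """Poll until the trace is queryable or the attempts run out."""
    for attempt in range(attempts):
        if lookup(trace_id):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)  # no sleep after the final attempt
    return False
```

The lookup parameter is injectable so the polling loop can be exercised without a running stack.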
Common traps
- Wrong port. Tempo is :4318, Phoenix is :6006 (same OTLP/HTTP protocol, different backend). Check OTEL_EXPORTER_OTLP_ENDPOINT if spans don't appear.
- Empty service graph. Needs CLIENT/SERVER span-kind pairs within the 60s wait window; too-short runs leave it blank.
- Dashboard edits gone after restart. Anonymous Grafana session edits don't persist - commit dashboard JSON back to grafana/dashboards/.
- No metrics in Prometheus. Tempo's metrics_generator must reach Prometheus via remote_write on :9090/api/v1/write.
- Volumes wiped. make tempo-clean removes the volume; use make tempo-down to preserve trace history.
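For the wrong-port trap, a small check of OTEL_EXPORTER_OTLP_ENDPOINT tells which backend the spans are headed for. A minimal sketch with hypothetical helper names; it judges by port alone, using the two ports named above, and expects a full URL including the scheme.

```python
import os
from urllib.parse import urlparse

# Ports named in this document: Tempo OTLP/HTTP on 4318, Phoenix on 6006.
BACKEND_BY_PORT = {4318: "tempo", 6006: "phoenix"}


def backend_for(endpoint: str) -> str:
    """Name the backend an OTLP endpoint URL points at, judged by port only."""
    port = urlparse(endpoint).port  # None when the URL carries no port
    return BACKEND_BY_PORT.get(port, "unknown")


def current_backend() -> str:
    """Classify the endpoint currently set in the environment."""
    return backend_for(os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", ""))
```

An "unknown" result with the variable set usually means a missing scheme or an unexpected port.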
See also
- ops-eng-o2r-phoenix - the Phoenix backend for the same spans.
- tempo_grafana/README.md - full dashboard, TraceQL, and config reference.