ops-eng-o2r-grafana

Use when working with the Tempo + Grafana backend for otel-a2a-relay (o2r). Names the specific Grafana surface for each question (overview vs LUCA-flow vs Tempo Explore vs service graph), with direct dashboard URLs, TraceQL snippets, and the dockerized stack lifecycle. Triggers - grafana, tempo, o2r grafana, o2r tempo, o2r-overview, luca-flow dashboard, traceql, service graph, tempo explore, span metrics, o2r-tempo-harness, tempo-up.

By coilyco-flight-deck. Updated 6/10/2026

Tempo + Grafana backend for o2r

The local stack serves Grafana at http://localhost:3000. Tempo accepts OTLP/HTTP on :4318 and serves its query API on :3200. Prometheus on :9090 holds the span metrics and service-graph metrics.

Full reference - dashboards, TraceQL cheatsheet, endpoint map, config knobs, known quirks - lives in tempo_grafana/README.md. This skill is the operator's quick index.

Which Grafana surface answers which question

Default to the provisioned dashboards. Tempo Explore is for ad-hoc TraceQL.

  • "Overall health" - o2r-overview dashboard (Grafana home): http://localhost:3000/d/o2r-overview/o2r-overview. Topology, span rate, p95 latency, error rate, recent error traces.
  • "One session's flow" - LUCA-flow dashboard: http://localhost:3000/d/luca-flow/luca-flow. session_id variable at the top; waterfall, per-step latency, acceptance tally.
  • "Visual waterfall for one trace" - click any trace row, or paste a trace ID in Tempo Explore; Grafana renders the timeline on the trace detail panel.
  • "Agent topology" - service-graph node view in Tempo Explore (queryType=serviceMap).
  • "Write TraceQL" - Tempo Explore: http://localhost:3000/explore. Cheatsheet panel lives in o2r-overview.
  • "Write PromQL on span metrics" - Prometheus UI at http://localhost:9090, or Grafana Explore with the Prometheus datasource.
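
The surfaces listed above can also be reached from a terminal. The sketch below is illustrative only: the function names are hypothetical, it assumes the LUCA-flow template variable is literally named session_id (Grafana sets dashboard variables through var-<name> query parameters), and it assumes the session ID is URL-safe. The service name passed to the TraceQL helper is a placeholder, not a name taken from the o2r stack.

```shell
#!/bin/sh
# Hypothetical helpers; defaults match the ports named in this skill.
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"
TEMPO_URL="${TEMPO_URL:-http://localhost:3200}"

# Deep-link to the LUCA-flow dashboard with one session preselected.
# Assumes the dashboard variable is named session_id and $1 is URL-safe.
luca_flow_url() {
  printf '%s/d/luca-flow/luca-flow?var-session_id=%s\n' "$GRAFANA_URL" "$1"
}

# Build a TraceQL query selecting error spans emitted by one service.
traceql_errors_for() {
  printf '{ resource.service.name = "%s" && status = error }\n' "$1"
}

# Run a TraceQL query against Tempo's search API (needs the stack running).
# $1 = TraceQL query, $2 = optional result limit (default 20).
tempo_search() {
  curl -sS -G "$TEMPO_URL/api/search" \
    --data-urlencode "q=$1" \
    --data-urlencode "limit=${2:-20}"
}
```

For example, `tempo_search "$(traceql_errors_for my-agent)"` returns matching traces as JSON, and `luca_flow_url <session id>` prints a link that opens the dashboard on that session.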

Operate

From the otel-a2a-relay/ workspace root:

make tempo-up                                              # docker compose stack
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 make luca-demo
open http://localhost:3000                                 # lands on o2r-overview
make tempo-down                                            # stop, preserve volumes
make tempo-clean                                           # stop + wipe volumes

o2r-tempo-harness posts the worked-example trace, waits for Tempo to index it, and prints a Grafana Explore deep-link - use it to confirm the stack came up clean.
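
Before running the harness, it can help to confirm that each service answers at all. The following is a minimal sketch, not part of the repository: wait_ready and stack_ready are hypothetical names, and the probe paths are the standard health endpoints for Grafana (/api/health), Tempo (/ready), and Prometheus (/-/ready), assuming the compose stack exposes them on the ports listed above.

```shell
#!/bin/sh
# wait_ready NAME URL [TRIES]
# Polls URL every 2 seconds until it returns a successful HTTP status,
# giving up after TRIES attempts (default 30).
wait_ready() {
  name=$1
  url=$2
  tries=${3:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      echo "$name ready"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "$name not ready after $tries attempts: $url" >&2
  return 1
}

# Probe all three services in order; stops at the first one that fails.
stack_ready() {
  wait_ready grafana    http://localhost:3000/api/health &&
  wait_ready tempo      http://localhost:3200/ready &&
  wait_ready prometheus http://localhost:9090/-/ready
}
```

Running `make tempo-up && stack_ready` gives a quick pass/fail before any spans are sent. A ready Tempo does not guarantee that traces are indexed, which is what the harness checks.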

Common traps

  • Wrong port. Tempo is :4318, Phoenix is :6006 (same OTLP/HTTP protocol, different backend). Check OTEL_EXPORTER_OTLP_ENDPOINT if spans don't appear.
  • Empty service graph. It needs matched CLIENT/SERVER span-kind pairs that arrive within the 60s wait window; runs that are too short leave it blank.
  • Dashboard edits gone after restart. Anonymous Grafana session edits don't persist - commit dashboard JSON back to grafana/dashboards/.
  • No metrics in Prometheus. Tempo's metrics_generator must reach Prometheus via remote_write on :9090/api/v1/write.
  • Volumes wiped. make tempo-clean removes the volumes along with the stack; use make tempo-down to preserve trace history.
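
Two of the traps above can be checked mechanically. This sketch uses hypothetical function names. It assumes the endpoint is a plain scheme://host:port URL, and that the metrics generator uses Tempo's default metric names (traces_spanmetrics_calls_total and traces_service_graph_request_total), which the stack's config may override.

```shell
#!/bin/sh
# Print the port from an OTLP endpoint such as http://localhost:4318.
# Prints an empty line when the URL carries no explicit port.
otlp_port() {
  hostport=${1#*://}        # strip the scheme
  hostport=${hostport%%/*}  # strip any path
  case $hostport in
    *:*) printf '%s\n' "${hostport##*:}" ;;
    *)   printf '\n' ;;
  esac
}

# Classify an endpoint; defaults to $OTEL_EXPORTER_OTLP_ENDPOINT.
check_otlp_endpoint() {
  port=$(otlp_port "${1:-${OTEL_EXPORTER_OTLP_ENDPOINT:-}}")
  case $port in
    4318) echo "ok: endpoint targets Tempo (:4318)" ;;
    6006) echo "warn: endpoint targets Phoenix (:6006), not Tempo"; return 1 ;;
    *)    echo "warn: unexpected or missing port '$port'; Tempo expects :4318"; return 1 ;;
  esac
}

# Ask Prometheus how many series exist for a metric (needs the stack running).
prom_series_count() {
  curl -sS -G http://localhost:9090/api/v1/query \
    --data-urlencode "query=count($1)"
}
```

Calling `check_otlp_endpoint` with no argument inspects the current environment. An empty result from `prom_series_count traces_spanmetrics_calls_total` points at the remote_write path rather than at the dashboards.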

Install via CLI
npx skills add https://github.com/coilyco-flight-deck/otel-a2a-relay --skill ops-eng-o2r-grafana