metrics


Query metrics from Prometheus using PromQL for HTTP request rates, latency percentiles, error rates, active connections, and GenAI token usage.

By opensearch-project. Updated 3/22/2026.

name: metrics
description: Query metrics from Prometheus using PromQL for HTTP request rates, latency percentiles, error rates, active connections, and GenAI token usage.
allowed-tools:
  - Bash
  - curl

Metrics Querying with PromQL

Overview

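This skill provides PromQL query templates for querying metrics from Prometheus. All queries use the instant-query endpoint of the Prometheus HTTP API, /api/v1/query, which for a local instance is http://localhost:9090/api/v1/query. No authentication is needed for local Prometheus. The curl commands below read the base URL from PROMETHEUS_ENDPOINT, so export it in your shell before running them.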

Prometheus runs on port 9090 using HTTP (not HTTPS).

Connection Defaults

| Variable | Default | Description |
|---|---|---|
| PROMETHEUS_ENDPOINT | http://localhost:9090 | Prometheus base URL |
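
When scripting these queries outside the shell, the default in the table has to be applied explicitly, because an unset variable expands to an empty string. The following is a minimal sketch of that fallback; the function names `prometheus_endpoint` and `instant_query_url` are hypothetical helpers, not part of this skill.

```python
import os
from urllib.parse import urlencode

DEFAULT_ENDPOINT = "http://localhost:9090"  # default taken from the table above


def prometheus_endpoint(env=None):
    """Return the Prometheus base URL, falling back to the documented default."""
    env = os.environ if env is None else env
    # Treat an empty value like an unset one, then drop any trailing slash.
    return (env.get("PROMETHEUS_ENDPOINT") or DEFAULT_ENDPOINT).rstrip("/")


def instant_query_url(promql, env=None):
    """Build a GET URL for the /api/v1/query instant-query endpoint."""
    return f"{prometheus_endpoint(env)}/api/v1/query?{urlencode({'query': promql})}"


# With no variable set, the local default is used.
print(instant_query_url("up", env={}))
```

Passing a dictionary as `env` keeps the helper easy to test without touching the real environment.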

Metric Discovery

Different OTel SDK versions and languages emit HTTP metrics under different names. Before querying, discover which metric names are active in your stack:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/label/__name__/values" | python3 -c "
import json, sys
for m in json.load(sys.stdin).get('data', []):
    if any(k in m for k in ['http_server', 'gen_ai', 'db_client']):
        print(m)"

Common HTTP metric name variants:

| Metric Name | Unit | Emitted By |
|---|---|---|
| http_server_duration_milliseconds | milliseconds | Python OTel SDK (older semconv) |
| http_server_duration_seconds | seconds | .NET, Java OTel SDKs |
| http_server_request_duration_seconds | seconds | Stable HTTP semconv (newer SDKs) |

Important: Replace the metric name in the queries below with whichever variant is active in your stack. The query patterns (rate, histogram_quantile, etc.) are identical — only the metric name changes. For histogram bucket queries, replace _seconds_bucket with _milliseconds_bucket as appropriate, and adjust latency thresholds accordingly (e.g., le="0.25" for seconds vs le="250" for milliseconds).
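
The substitution described above can be sketched as a small helper that picks whichever variant appears in the discovery output and converts a threshold given in seconds into that metric's unit. The preference order (newest name first) and the helper names are assumptions made for illustration. The `le` string must also match what your exporter actually emits (some emit `250.0` rather than `250`), so verify it against a real series.

```python
# Variants from the table above: metric base name -> units per second.
# The ordering (stable semconv first) is an assumption, not a rule from this skill.
HTTP_DURATION_VARIANTS = {
    "http_server_request_duration_seconds": 1,
    "http_server_duration_seconds": 1,
    "http_server_duration_milliseconds": 1000,
}


def pick_http_duration_metric(discovered_names):
    """Return (base_name, scale) for the first known variant that is present."""
    names = set(discovered_names)
    for base, scale in HTTP_DURATION_VARIANTS.items():
        # A histogram is exposed as <base>_bucket, <base>_sum and <base>_count.
        if f"{base}_bucket" in names:
            return base, scale
    raise LookupError("no known HTTP server duration histogram found")


def le_threshold(seconds, scale):
    """Format a latency threshold given in seconds in the metric's own unit."""
    return f"{seconds * scale:g}"


# Hypothetical discovery output from a stack using the older Python SDK naming.
base, scale = pick_http_duration_metric(
    ["http_server_duration_milliseconds_bucket", "http_server_active_requests"]
)
print(base, le_threshold(0.25, scale))
```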

HTTP Request Rate by Service

Calculate the per-second HTTP request rate over a 5-minute window, grouped by service:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(rate(http_server_duration_seconds_count[5m])) by (service_name)'
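
The endpoint returns JSON, so the result is easier to read after a little post-processing. The sketch below turns an instant-query vector result into a per-service mapping. The sample payload follows the documented response shape, but its service names and values are made up, and `vector_by_label` is a hypothetical helper.

```python
import json

# Illustrative payload in the shape returned by /api/v1/query for a vector result.
# The service names and values are invented for this example.
sample = json.loads("""
{"status": "success",
 "data": {"resultType": "vector",
          "result": [
            {"metric": {"service_name": "checkout"}, "value": [1700000000, "12.5"]},
            {"metric": {"service_name": "cart"}, "value": [1700000000, "3.25"]}
          ]}}
""")


def vector_by_label(response, label):
    """Map one label's value to the sample value for each series in a vector result."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error')}")
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each value is [unix_timestamp, "<number as string>"].
    return {s["metric"].get(label, ""): float(s["value"][1]) for s in data["result"]}


rates = vector_by_label(sample, "service_name")
for service, rps in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{service}: {rps:.2f} req/s")
```

In practice the curl output can be piped into a script like this in place of the embedded sample.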

HTTP Latency Percentiles

p95 Latency by Service

Calculate the 95th percentile HTTP request latency by service:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(http_server_duration_seconds_bucket[5m])) by (le, service_name))'

p99 Latency by Service

Calculate the 99th percentile HTTP request latency by service:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.99, sum(rate(http_server_duration_seconds_bucket[5m])) by (le, service_name))'
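
To interpret these percentiles, it helps to know that `histogram_quantile` estimates a value by interpolating linearly inside the bucket that contains the requested rank, so accuracy depends on the bucket boundaries. The following is a simplified sketch of that calculation for classic histograms, not Prometheus's actual implementation; the bucket counts are made up.

```python
import math


def quantile_from_buckets(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Simplified: assumes 0 < q <= 1, non-negative bounds, and a final +Inf bucket.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative bucket counts for a duration histogram in seconds.
example = [(0.1, 50), (0.25, 90), (0.5, 98), (math.inf, 100)]
print(f"p95 ~ {quantile_from_buckets(0.95, example):.3f}s")
print(f"p99 ~ {quantile_from_buckets(0.99, example):.3f}s")
```

Here the p95 estimate lands between 0.25 and 0.5 seconds. The p99 rank falls in the +Inf bucket, so the estimate is capped at the highest finite bound, 0.5 seconds. When a reported percentile sits exactly on a bucket boundary, the buckets may be too coarse for that latency range.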

Error Rate (5xx Responses)

Calculate the ratio of 5xx error responses to total requests by service:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(rate(http_server_duration_seconds_count{http_response_status_code=~"5.."}[5m])) by (service_name) / sum(rate(http_server_duration_seconds_count[5m])) by (service_name)'

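Note on status code labels: The label name varies by OTel SDK version. Older semconv uses http_status_code; newer stable semconv uses http_response_status_code. Metric Discovery lists metric names only, so inspect the labels on a returned series (or query the /api/v1/labels endpoint) to see which label is present. If your stack emits the older label, substitute it in the filtered (numerator) part of the query above: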

sum(rate(http_server_duration_seconds_count{http_status_code=~"5.."}[5m])) by (service_name)
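
Because only the label name changes between the two forms, the full ratio can be generated from a template. This is a minimal sketch; `error_ratio_query` is a hypothetical helper whose defaults reproduce the error-rate query shown earlier in this section.

```python
def error_ratio_query(metric="http_server_duration_seconds",
                      status_label="http_response_status_code",
                      window="5m"):
    """Build the 5xx error-ratio PromQL for a duration metric and status label."""
    count = f"{metric}_count"
    # Doubled braces produce the literal { } of the PromQL label matcher.
    errors = f'sum(rate({count}{{{status_label}=~"5.."}}[{window}])) by (service_name)'
    total = f"sum(rate({count}[{window}])) by (service_name)"
    return f"{errors} / {total}"


# Newer stable semconv label (the default) and the older label.
print(error_ratio_query())
print(error_ratio_query(status_label="http_status_code"))
```

The resulting string can be passed to curl through `--data-urlencode "query=..."` exactly like the hand-written queries.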

Active Connections

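Query the current number of in-flight HTTP requests by service (http_server_active_requests counts active requests, which serves here as a proxy for active connections):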

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(http_server_active_requests) by (service_name)'

Database Operation Latency

DB Operation p95 Latency by Service

Calculate the 95th percentile database operation latency by service:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(db_client_operation_duration_seconds_bucket[5m])) by (le, service_name))'

GenAI-Specific Metrics

Token Usage by Operation and Model

Query GenAI token usage histograms grouped by operation name and request model:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(rate(gen_ai_client_token_usage_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model)'

Token usage p95 by operation and model:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(gen_ai_client_token_usage_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model))'

Operation Duration by Operation and Model

Query GenAI operation duration histograms grouped by operation and model:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(rate(gen_ai_client_operation_duration_seconds_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model)'

Operation duration p95 by operation and model:

curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(gen_ai_client_operation_duration_seconds_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model))'
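
All percentile queries in this skill share one shape: `histogram_quantile` over a summed `rate` of the `_bucket` series, grouped by `le` plus the dimensions of interest. The sketch below generates them from a template; `quantile_query` is a hypothetical helper, not part of this skill.

```python
def quantile_query(q, metric, group_by, window="5m"):
    """Build a histogram_quantile query for a histogram's base metric name."""
    # le must always be kept in the grouping, or the quantile cannot be computed.
    labels = ", ".join(["le", *group_by])
    return f"histogram_quantile({q}, sum(rate({metric}_bucket[{window}])) by ({labels}))"


genai_labels = ["gen_ai_operation_name", "gen_ai_request_model"]
print(quantile_query(0.95, "gen_ai_client_token_usage", genai_labels))
print(quantile_query(0.95, "gen_ai_client_operation_duration_seconds", genai_labels))
print(quantile_query(0.99, "http_server_duration_seconds", ["service_name"]))
```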

Available Metric Names and Label Dimensions

| Metric | Type | Labels |
|---|---|---|
| http_server_duration_milliseconds | histogram | service_name, http_response_status_code |
| http_server_duration_seconds | histogram | service_name, http_response_status_code |
| http_server_request_duration_seconds | histogram | service_name, http_response_status_code |
| http_server_active_requests | gauge | service_name |
| db_client_operation_duration_seconds | histogram | service_name |
| gen_ai_client_token_usage | histogram | gen_ai.operation.name, gen_ai.request.model |
| gen_ai_client_operation_duration_seconds | histogram | gen_ai.operation.name, gen_ai.request.model |

Note on Prometheus label names: Prometheus replaces dots in label names with underscores. The OTel attribute gen_ai.operation.name becomes the Prometheus label gen_ai_operation_name in PromQL queries. The table above shows the original OTel attribute names for reference.
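
The renaming rule above can be approximated in a few lines when translating attribute names into PromQL by hand. This sketch covers only the character replacement; it assumes the classic underscore-escaped naming and ignores edge cases such as names that start with a digit. `prometheus_label_name` is a hypothetical helper.

```python
import re


def prometheus_label_name(otel_attribute):
    """Approximate the Prometheus label name for an OTel attribute key."""
    # Replace every character outside [a-zA-Z0-9_] (dots included) with "_".
    return re.sub(r"[^a-zA-Z0-9_]", "_", otel_attribute)


for attribute in ("gen_ai.operation.name", "gen_ai.request.model"):
    print(f"{attribute} -> {prometheus_label_name(attribute)}")
```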

PPL Alternative for OpenSearch-Ingested Metrics

PPL can also query metrics stored in OpenSearch when metrics are ingested via Data Prepper, as an alternative to PromQL. This is useful for OpenSearch-native workflows where you want to query metrics alongside traces and logs using a single query language. When Data Prepper is configured to ingest metrics into OpenSearch, you can use PPL source= queries against the metrics index just as you would for traces and logs.

References

Amazon Managed Service for Prometheus

To query metrics on Amazon Managed Service for Prometheus (AMP), replace the local endpoint and add AWS SigV4 authentication:

curl -s --aws-sigv4 "aws:amz:REGION:aps" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  'https://aps-workspaces.REGION.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/query' \
  --data-urlencode 'query=sum(rate(http_server_duration_seconds_count[5m])) by (service_name)'
  • Endpoint format: https://aps-workspaces.REGION.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/query
  • Auth: --aws-sigv4 "aws:amz:REGION:aps" with --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY"
  • PromQL query syntax is identical between local Prometheus and Amazon Managed Prometheus; only the endpoint and authentication differ
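
Since only the endpoint and the SigV4 provider string depend on the region and workspace, both can be derived from those two values. This is a minimal sketch that follows the formats listed above; the helper names are hypothetical, and the region and workspace ID in the example are placeholders rather than real resources.

```python
def amp_query_url(region, workspace_id):
    """Build the AMP instant-query URL in the endpoint format listed above."""
    return (
        f"https://aps-workspaces.{region}.amazonaws.com"
        f"/workspaces/{workspace_id}/api/v1/query"
    )


def amp_sigv4_provider(region):
    """Build the value passed to curl's --aws-sigv4 option for the aps service."""
    return f"aws:amz:{region}:aps"


# Placeholder values for illustration only.
print(amp_query_url("us-east-1", "ws-EXAMPLE"))
print(amp_sigv4_provider("us-east-1"))
```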

Install via CLI

npx skills add https://github.com/opensearch-project/observability-stack --skill metrics