name: datadog-cli
description: |
  Datadog CLI for debugging and triaging. Use this skill when you need to: search Datadog logs, query metrics, tail logs in real-time, trace distributed requests, investigate errors, compare time periods, find log patterns, check service health, or export observability data.
  Trigger phrases include "search logs", "tail logs", "query metrics", "check Datadog", "find errors", "trace request", "compare errors", "what services exist", "log patterns", "CPU usage", "service health".
Datadog CLI Reference
A CLI tool for AI agents to debug and triage using Datadog logs, metrics, and APM traces.
Setup
Running the CLI
# Via npx (no install needed)
npx @ctdio/datadog-cli <command>
# Via bunx
bunx @ctdio/datadog-cli <command>
# Or create an alias for convenience
alias datadog="npx @ctdio/datadog-cli"
Environment Variables (Required)
export DD_API_KEY="your-api-key"
export DD_APP_KEY="your-app-key"
Get keys from: https://app.datadoghq.com/organization-settings/api-keys
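Because both keys are required, it can help to fail fast before running any command. The following is a minimal sketch of such a pre-flight check; the function name `dd_check_env` is hypothetical and not part of the CLI.

```shell
# Hypothetical helper (not part of the CLI): verify that both required
# environment variables are set before invoking any datadog command.
dd_check_env() {
  dd_rc=0
  if [ -z "${DD_API_KEY:-}" ]; then
    echo "DD_API_KEY is not set" >&2
    dd_rc=1
  fi
  if [ -z "${DD_APP_KEY:-}" ]; then
    echo "DD_APP_KEY is not set" >&2
    dd_rc=1
  fi
  # Non-zero exit status means at least one key is missing.
  return "$dd_rc"
}
```

A typical use would be `dd_check_env && datadog errors --from 1h`, so the command runs only when both keys are present.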
For Non-US Datadog Sites
Use the --site flag:
npx @ctdio/datadog-cli logs search --query "*" --site datadoghq.eu
Commands
Log Search
datadog logs search --query "<query>" [--from <time>] [--to <time>] [--limit <n>] [--sort <order>]
Examples:
datadog logs search --query "status:error" --from 1h
datadog logs search --query "service:api status:error @http.status_code:500" --from 1h
Live Tail (Real-time Streaming)
Stream logs as they arrive. Press Ctrl+C to stop.
datadog logs tail --query "<query>" [--interval <seconds>]
Examples:
datadog logs tail --query "status:error"
datadog logs tail --query "service:api" --interval 5
Trace Correlation
Find all logs for a distributed trace across services.
datadog logs trace --id "<trace-id>" [--from <time>] [--to <time>]
Example:
datadog logs trace --id "abc123def456" --from 24h
Log Context
Get logs before and after a specific timestamp to understand what happened.
datadog logs context --timestamp "<iso-timestamp>" [--before <time>] [--after <time>] [--service <svc>]
Examples:
datadog logs context --timestamp "2024-01-15T10:30:00Z" --before 5m --after 2m
datadog logs context --timestamp "2024-01-15T10:30:00Z" --service api --before 10m
Error Summary
Quick breakdown of errors by service, type, and message.
datadog errors [--from <time>] [--to <time>] [--service <svc>]
Examples:
datadog errors --from 1h
datadog errors --service payment-api --from 24h
Period Comparison
Compare log counts between the current period and the previous one.
datadog logs compare --query "<query>" --period <time>
Examples:
datadog logs compare --query "status:error" --period 1h
datadog logs compare --query "service:api status:error" --period 6h
Log Patterns
Group similar log messages to find patterns by normalizing their variable parts (UUIDs, numbers, etc.).
datadog logs patterns --query "<query>" [--from <time>] [--limit <n>]
Examples:
datadog logs patterns --query "status:error" --from 1h
datadog logs patterns --query "service:api" --from 6h --limit 1000
Service Discovery
List all services with recent log activity.
datadog services [--from <time>] [--to <time>]
Example:
datadog services --from 24h
Log Aggregation
datadog logs agg --query "<query>" --facet <facet> [--from <time>]
Common facets: status, service, host, @http.status_code, @error.kind
Examples:
datadog logs agg --query "*" --facet status --from 1h
datadog logs agg --query "status:error" --facet service --from 24h
Multiple Queries
Run multiple queries in parallel.
datadog logs multi --queries "name1:query1,name2:query2" [--from <time>]
Example:
datadog logs multi --queries "errors:status:error,warnings:status:warn" --from 1h
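The `--queries` value is a comma-separated list of `name:query` pairs, which is easy to mistype by hand. As an illustration, a small helper can assemble it from alternating name and query arguments. The function name `dd_build_queries` is hypothetical, and the sketch assumes that neither names nor queries contain commas, since how the CLI handles those is not documented here.

```shell
# Hypothetical helper: build the value for --queries from
# alternating <name> <query> arguments.
# Assumes names and queries contain no commas.
dd_build_queries() (
  out=""
  while [ "$#" -ge 2 ]; do
    # Append "name:query", comma-separated after the first pair.
    out="${out:+$out,}$1:$2"
    shift 2
  done
  printf '%s\n' "$out"
)
```

For example, `datadog logs multi --queries "$(dd_build_queries errors status:error warnings status:warn)" --from 1h` is equivalent to the command above.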
Metrics Query
datadog metrics query --query "<metrics-query>" [--from <time>] [--to <time>]
Query format: <aggregation>:<metric>{<tags>}
Examples:
datadog metrics query --query "avg:system.cpu.user{*}" --from 1h
datadog metrics query --query "avg:system.cpu.user{service:api}" --from 1h
datadog metrics query --query "sum:trace.http.request.errors{service:api}.as_count()" --from 1h
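To make the `<aggregation>:<metric>{<tags>}` format concrete, the sketch below composes a query string from its three parts. The helper name `dd_metric_query` is hypothetical; it only formats the string and does not add suffixes such as `.as_count()`.

```shell
# Hypothetical helper: compose "<aggregation>:<metric>{<tags>}".
# The tag filter defaults to "*" (all sources) when omitted.
dd_metric_query() (
  agg=$1
  metric=$2
  tags=${3:-*}
  printf '%s:%s{%s}\n' "$agg" "$metric" "$tags"
)
```

With this, `datadog metrics query --query "$(dd_metric_query avg system.cpu.user service:api)" --from 1h` matches the second example above.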
APM Traces & Spans
Span Search
Search APM spans/traces with query filters.
datadog spans search --query "<query>" [--from <time>] [--to <time>] [--limit <n>] [--min-duration <duration>]
Examples:
datadog spans search --query "env:prod service:api" --from 1h
datadog spans search --query "service:api" --min-duration 1s --from 1h # Slow requests
datadog spans search --query "resource_name:POST" --from 1h
Span Aggregation
Aggregate spans by facet.
datadog spans agg --query "<query>" --facet <facet> [--from <time>]
Common facets: service, resource_name, status, env, operation_name
Examples:
datadog spans agg --query "env:prod" --facet service --from 24h
datadog spans agg --query "service:api" --facet resource_name --from 1h
Trace Hierarchy View
View all spans in a trace with parent-child relationships.
datadog spans trace --id "<trace-id>" [--from <time>]
Example:
datadog spans trace --id "abc123def456" --from 24h
Span Error Summary
Get a breakdown of span errors by service and resource.
datadog spans errors [--from <time>] [--service <svc>]
Examples:
datadog spans errors --from 1h
datadog spans errors --service api --from 24h
APM Service Discovery
List all services with APM trace activity.
datadog spans services [--from <time>]
Example:
datadog spans services --from 24h
Global Flags
| Flag | Description |
|---|---|
| `--pretty` | Human-readable output with colors |
| `--output <file>` | Export results to JSON file |
| `--site <site>` | Datadog site (e.g., datadoghq.eu) |
Time Formats
- Relative: `30m`, `1h`, `6h`, `24h`, `7d`
- ISO 8601: `2024-01-15T10:30:00Z`
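When time values come from scripts or user input, a quick check against the two formats can catch typos early. The sketch below is illustrative only: the helper names are hypothetical, and the patterns cover just the forms shown in this section (units `m`, `h`, `d`, and UTC timestamps ending in `Z`). The CLI itself may accept more.

```shell
# Hypothetical helper: print the current UTC time in the ISO 8601
# form shown above, e.g. for use with --to or --timestamp.
dd_now_iso() {
  date -u +%Y-%m-%dT%H:%M:%SZ
}

# Hypothetical helper: classify a time value as "relative", "iso",
# or "unknown". Only the forms listed in this section are recognized.
dd_time_kind() {
  if printf '%s\n' "$1" | grep -Eq '^[0-9]+[mhd]$'; then
    echo relative
  elif printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$'; then
    echo iso
  else
    echo unknown
  fi
}
```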
Common Workflows
Incident Triage
# 1. Quick error overview (logs and spans)
datadog errors --from 1h
datadog spans errors --from 1h
# 2. Is this new? Compare to previous period
datadog logs compare --query "status:error" --period 1h
# 3. What patterns are we seeing?
datadog logs patterns --query "status:error" --from 1h
# 4. Narrow down by service
datadog logs search --query "status:error service:payment-api" --from 1h
# 5. Find slow requests
datadog spans search --query "service:api" --min-duration 1s --from 1h
# 6. Get context around a specific timestamp
datadog logs context --timestamp "2024-01-15T10:30:00Z" --service api --before 5m --after 2m
# 7. Follow the distributed trace (logs and spans)
datadog logs trace --id "TRACE_ID"
datadog spans trace --id "TRACE_ID"
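The first steps of this workflow can be bundled into one function for repeated use. The following is a sketch under stated assumptions: `dd_run`, `triage`, and the `DRY_RUN` variable are hypothetical names, while every subcommand and flag is taken from the reference above. With `DRY_RUN` set, the commands are printed instead of executed.

```shell
# Hypothetical wrapper: run the CLI, or just print the command
# when DRY_RUN is set. The printed form drops quoting, so it is
# meant for reading rather than copy-paste.
dd_run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '+ datadog %s\n' "$*"
  else
    npx @ctdio/datadog-cli "$@"
  fi
}

# Hypothetical triage helper: error overview, period comparison,
# and pattern grouping for one service.
# Usage: triage <service> [window]   (window defaults to 1h)
triage() (
  service=$1
  window=${2:-1h}
  dd_run errors --service "$service" --from "$window"
  dd_run spans errors --service "$service" --from "$window"
  dd_run logs compare --query "service:$service status:error" --period "$window"
  dd_run logs patterns --query "service:$service status:error" --from "$window"
)
```

Running `DRY_RUN=1 triage payment-api 6h` previews the four commands without contacting Datadog.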
Real-time Debugging
# Stream errors as they happen
datadog logs tail --query "status:error"
# Watch specific service
datadog logs tail --query "service:api status:error"
Service Health Check
# List services
datadog services --from 24h
# Check error distribution
datadog logs agg --query "service:api" --facet status --from 1h
# Check CPU usage
datadog metrics query --query "avg:system.cpu.user{service:api}" --from 1h
Export for Sharing
# Save search results
datadog logs search --query "status:error" --from 1h --output errors.json
# Save error summary
datadog errors --from 24h --output error-report.json
Datadog Query Syntax
| Operator | Example | Description |
|---|---|---|
| `AND` | `service:api status:error` | Both conditions |
| `OR` | `status:error OR status:warn` | Either condition |
| `-` | `-status:info` | Exclude |
| `*` | `service:api-*` | Wildcard |
| `>=` `<=` | `@http.status_code:>=400` | Numeric comparison |
| `[TO]` | `@duration:[1000 TO 5000]` | Range |
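When a query mixes `OR` with other terms, building the `OR` part separately keeps it readable. The sketch below joins terms with `OR` and wraps multi-term results in parentheses. The helper name `dd_or` is hypothetical, and the parentheses rely on Datadog's search syntax supporting grouping, which the table above does not list.

```shell
# Hypothetical helper: join search terms with OR.
# Multi-term results are parenthesized so they can be combined
# safely with other terms (assumes Datadog's grouping syntax).
dd_or() (
  joined=""
  for term in "$@"; do
    joined="${joined:+$joined OR }$term"
  done
  if [ "$#" -gt 1 ]; then
    printf '(%s)\n' "$joined"
  else
    printf '%s\n' "$joined"
  fi
)
```

For instance, `datadog logs search --query "service:api $(dd_or status:error status:warn)" --from 1h` searches the api service for logs at either level.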
Common Attributes
- `service` - Service name
- `status` - Log level (error, warn, info, debug)
- `host` - Hostname
- `@http.status_code` - HTTP status code
- `@error.kind` - Error type
- `@trace_id` / `@dd.trace_id` - Trace ID