name: ppl-reference description: Comprehensive PPL (Piped Processing Language) reference for OpenSearch with command syntax, functions, and examples for observability queries. allowed-tools: - Bash - curl
PPL Language Reference
Overview
This is a comprehensive reference for the Piped Processing Language (PPL) used by OpenSearch. PPL queries follow a pipe-delimited syntax starting with source=<index> and chaining commands with |. This reference covers all commands, functions, API endpoints, and usage patterns needed to construct observability queries against trace and log indices.
Grammar sourced from the opensearch-project/sql repository's docs/user/ppl/ directory:
https://github.com/opensearch-project/sql
Connection Defaults
| Variable | Default | Description |
|---|---|---|
OPENSEARCH_ENDPOINT |
https://localhost:9200 |
OpenSearch base URL |
OPENSEARCH_USER |
admin |
OpenSearch username |
OPENSEARCH_PASSWORD |
My_password_123!@# |
OpenSearch password |
Field Name Escaping
Field names containing dots must be enclosed in backticks to avoid parsing errors:
`attributes.gen_ai.operation.name`
`attributes.gen_ai.usage.input_tokens`
`status.code`
`events.attributes.exception.type`
`@timestamp`
This is critical for OTel attribute fields which use dotted naming conventions.
API Endpoints
Query Endpoint
Execute a PPL query:
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() by serviceName"}'
Request body: {"query": "<ppl_query>"}
Explain Endpoint
Retrieve the query execution plan (useful for debugging and profiling):
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl/_explain" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | stats count() by serviceName"}'
Commands
Core Query Commands
search / source
Start a query by specifying the data source index pattern.
Syntax: search source=<index-pattern> or source=<index-pattern>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | head 10"}'
where
Filter results based on a condition.
Syntax: where <condition>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | head 10"}'
Supports: =, !=, <, >, <=, >=, AND, OR, NOT, LIKE, IN, BETWEEN, IS NULL, IS NOT NULL.
fields
Select specific fields to return.
Syntax: fields [+|-] <field-list>
Use + to include or - to exclude fields.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | fields traceId, spanId, serviceName, durationInNanos | head 10"}'
stats
Aggregate data using statistical functions.
Syntax: stats <aggregation>... [by <field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as span_count, avg(durationInNanos) as avg_duration by serviceName"}'
Supports: count(), sum(), avg(), max(), min(), var_samp(), var_pop(), stddev_samp(), stddev_pop(), distinct_count(), percentile(), earliest(), latest(), list(), values(), first(), last().
sort
Order results by one or more fields.
Syntax: sort [+|-] <field> [, ...]
Use + for ascending (default), - for descending.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | sort - durationInNanos | head 10"}'
head
Limit the number of results returned.
Syntax: head [N] (default N=10)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | head 5"}'
eval
Compute new fields from expressions.
Syntax: eval <new-field> = <expression> [, ...]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = durationInNanos / 1000000 | fields traceId, serviceName, duration_ms | sort - duration_ms | head 10"}'
dedup
Remove duplicate results based on field values.
Syntax: dedup [N] <field-list> [keepempty=<bool>] [consecutive=<bool>]
Caveat:
dedupmay throw a ClassCastException on fields with mixed types (e.g., a field that contains both strings and numbers across documents). Ensure the dedup field has a consistent type.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | dedup serviceName | fields serviceName"}'
rename
Rename one or more fields.
Syntax: rename <old-field> AS <new-field> [, ...]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | rename serviceName as service, durationInNanos as duration | fields traceId, service, duration | head 10"}'
top
Find the most frequent values for a field.
Syntax: top [N] <field> [by <group-field>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | top 5 serviceName"}'
rare
Find the least frequent values for a field.
Syntax: rare <field> [by <group-field>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | rare `attributes.gen_ai.operation.name`"}'
table
Display results in tabular format (alias for fields in some contexts).
Syntax: table <field-list>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | table traceId, spanId, serviceName, name | head 10"}'
Time-Series Commands
timechart
Aggregate data into time buckets for time-series visualization.
Syntax: timechart span=<interval> <aggregation>... [by <field>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | timechart span=5m count() as span_count by serviceName"}'
Rate functions for timechart: per_second(), per_minute(), per_hour(), per_day() — compute rate of an aggregation per time unit.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | timechart span=1h per_minute(count()) as spans_per_min by serviceName"}'
chart
General charting command for aggregation over arbitrary fields.
Syntax: chart <aggregation>... by <field> [span(<field>, <interval>)]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | chart avg(durationInNanos) by serviceName"}'
bin
Bucket numeric or date values into intervals.
Syntax: eval <new-field> = bin(<field>, <interval>) or used within stats ... by span(<field>, <interval>)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() by span(durationInNanos, 1000000000)"}'
trendline
Calculate moving averages over sorted data.
Syntax: trendline [sort <field>] sma(<period>, <field>) [as <alias>]
SMA = Simple Moving Average.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | trendline sort startTime sma(10, durationInNanos) as avg_duration | fields startTime, durationInNanos, avg_duration | head 50"}'
streamstats
Compute running (cumulative) statistics over ordered results.
Syntax: streamstats <aggregation>... [by <field>]
Caveat:
streamstatsprocesses all matching rows in memory. On large indices, this will fail with "insufficient resources" errors. Always add| head Nbeforestreamstatsto limit data volume.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | sort startTime | streamstats count() as running_count, sum(`attributes.gen_ai.usage.input_tokens`) as cumulative_tokens | fields startTime, running_count, cumulative_tokens | head 50"}'
eventstats
Add aggregation results as new fields to each row without collapsing rows (unlike stats).
Syntax: eventstats <aggregation>... [by <field>]
Caveat:
eventstatsprocesses all matching rows in memory. On large indices, this will fail with "insufficient resources" errors. Always add| head Nbeforeeventstatsto limit data volume.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eventstats avg(durationInNanos) as avg_svc_duration by serviceName | eval deviation = durationInNanos - avg_svc_duration | fields traceId, serviceName, durationInNanos, avg_svc_duration, deviation | sort - deviation | head 20"}'
Parse/Extract Commands
parse
Extract fields from text using a regular expression with named groups.
Syntax: parse <field> '<regex-with-named-groups>'
Caveat:
parsemay silently drop extracted fields on some OpenSearch versions. If extracted fields are missing from results, usegrokorrexas more reliable alternatives.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | parse body '\''(?P<level>\\w+): (?P<msg>.+)'\'' | fields level, msg | head 10"}'
grok
Extract fields using Grok patterns (predefined regex patterns).
Syntax: grok <field> '<grok-pattern>'
Caveat:
grokprocesses all matching rows in memory. On large indices, this will fail with "insufficient resources" errors. Always add| head Nbeforegrokto limit data volume.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | grok body '\''%{LOGLEVEL:level} %{GREEDYDATA:message}'\'' | fields level, message | head 10"}'
rex
Extract fields using named capture groups (similar to parse but with Splunk-compatible syntax).
Syntax: rex field=<field> '<regex-with-named-groups>'
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | rex field=body '\''(?<statuscode>\\d{3})'\'' | fields statuscode, body | head 10"}'
regex
Filter results using a regular expression match on a field.
Syntax: <field> = regex '<pattern>' (used within where)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | where body like '\''%error%'\'' | fields traceId, body, severityText | head 10"}'
patterns
Auto-discover log patterns by clustering similar log messages.
Syntax: patterns <field> [pattern='<regex>'] [new_field=<name>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | patterns body | fields body, patterns_field | head 20"}'
spath
Extract values from structured data (JSON, XML) using path expressions.
Syntax: spath input=<field> [path=<path>] [output=<field>]
Note: Verify the target field exists in your index before using
spath. Rundescribe <index-pattern>first to confirm the field name. The example below uses a representative field; adjust to match your actual schema.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where isnotnull(`attributes.gen_ai.tool.name`) | spath input=`attributes.gen_ai.tool.name` | head 10"}'
Join/Lookup Commands
join
Join results from two indices.
Syntax: join left=<alias> right=<alias> ON <condition> <right-source> or join <right-source> on <field>
Types: inner, left, right, cross.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | join left=s right=l ON s.traceId = l.traceId logs-otel-v1-* | fields s.spanId, s.name, l.severityText, l.body | head 10"}'
lookup
Enrich results by looking up values from another index.
Syntax: lookup <lookup-index> <match-field> [AS <alias>] [OUTPUT <field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | lookup otel-v2-apm-service-map serviceName AS `sourceNode` | fields serviceName, `targetNode`, durationInNanos | head 10"}'
Note: The service map index (
otel-v2-apm-service-map) uses nested fieldssourceNodeandtargetNode, notserviceName. Match accordingly when joining or looking up against this index.
graphlookup
Perform graph traversal lookups for hierarchical or connected data.
Syntax: graphlookup <index> connectFromField=<field> connectToField=<field> [maxDepth=<N>] [as <alias>]
Caveat:
graphlookuphas limited support in OpenSearch 3.x PPL and may not work as expected. Test carefully against your OpenSearch version before relying on this command in production queries.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v2-apm-service-map | graphlookup otel-v2-apm-service-map connectFromField=`destination.domain` connectToField=serviceName maxDepth=3 as dependencies | head 10"}'
subquery
Use a nested query as a data source or filter.
Syntax: where <field> IN [ source=<index> | ... | fields <field> ]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where traceId IN [ source=otel-v1-apm-span-* | where `status.code` = 2 | fields traceId ] | fields traceId, spanId, serviceName, name | head 20"}'
append
Append results from another query to the current result set.
Syntax: append [ source=<index> | ... ]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | append [ source=logs-otel-v1-* | stats count() as cnt by `resource.attributes.service.name` ] | head 20"}'
appendcol
Append columns from another query to the current result set.
Syntax: appendcol [ <commands> ]
Caveat:
source=is not valid insideappendcol []subqueries. The subquery insideappendcoloperates on the current result set, not a new index. Useappendfollowed by reshaping if you need to bring in data from another index.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as span_count | appendcol [ source=logs-otel-v1-* | stats count() as log_count ]"}'
appendpipe
Append the results of a sub-pipeline to the current results.
Syntax: appendpipe [ <commands> ]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | appendpipe [ stats sum(cnt) as total ]"}'
Transform Commands
fillnull
Replace null values with a specified value.
Syntax: fillnull [with <value>] [<field-list>]
Caveat: Backtick-quoted field names are not supported in the
fillnullfield list. Useevalto rename dotted fields to simple names before applyingfillnull, or applyfillnullwithout a field list to fill all null fields.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | fillnull with 0 `attributes.gen_ai.usage.input_tokens`, `attributes.gen_ai.usage.output_tokens` | fields traceId, `attributes.gen_ai.usage.input_tokens`, `attributes.gen_ai.usage.output_tokens` | head 10"}'
flatten
Flatten nested fields into top-level fields.
Syntax: flatten <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | flatten events | head 10"}'
expand
Expand multi-value or array fields into separate rows.
Syntax: expand <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | expand events | fields traceId, spanId, events | head 20"}'
transpose
Pivot rows into columns.
Syntax: transpose [<N>] [include_null=<bool>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | transpose"}'
convert
Convert field types (e.g., string to number).
Syntax: convert <function>(<field>) [as <alias>]
Functions: auto(), num(), ip(), ctime(), dur2sec(), mktime(), mstime(), rmcomma(), rmunit(), memk(), none().
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_str = CAST(durationInNanos AS STRING) | convert num(duration_str) as duration_num | fields traceId, duration_num | head 10"}'
replace
Replace values in a field using the replace() string function inside eval.
Syntax: eval <field> = replace(<field>, '<old>', '<new>')
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | eval severityText = replace(severityText, '\''ERROR'\'', '\''ERR'\'') | fields severityText, body | head 10"}'
reverse
Reverse the order of results.
Syntax: reverse
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | sort startTime | head 20 | reverse"}'
Multi-Value Commands
mvexpand
Expand a multi-value field into separate rows (one row per value).
Syntax: mvexpand <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | mvexpand events | fields traceId, spanId, events | head 20"}'
mvcombine
Combine multiple rows with the same key into a single row with a multi-value field.
Syntax: mvcombine <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | fields traceId, serviceName | mvcombine serviceName | head 10"}'
nomv
Convert a multi-value field to a single-value field (takes the first value or joins with a delimiter).
Syntax: nomv <field>
Caveat:
nomvonly works on string arrays, not nested object arrays. If the field contains nested objects (e.g.,eventswith sub-fields),nomvwill fail or produce unexpected results. Useflattenorexpandfor nested object arrays instead.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | nomv events | fields traceId, events | head 10"}'
Aggregation/Totals Commands
addcoltotals
Add a summary row at the bottom with column totals.
Syntax: addcoltotals [<field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | addcoltotals"}'
addtotals
Add a new field to each row containing the sum of specified numeric fields.
Syntax: addtotals [row=<bool>] [col=<bool>] [<field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats sum(`attributes.gen_ai.usage.input_tokens`) as input_tok, sum(`attributes.gen_ai.usage.output_tokens`) as output_tok by serviceName | addtotals"}'
ML Commands
ad
Anomaly detection — identify anomalous values in numeric fields using built-in ML algorithms.
Syntax: ad [time_field=<field>] [number_of_trees=<N>] [shingle_size=<N>] [time_zone=<tz>]
Note: The
adcommand does not take a positional field argument. It auto-detects the input field(s) from the precedingstatsorevaloutput in the pipeline.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where durationInNanos > 0 | ad time_field=startTime number_of_trees=100 time_zone=\"UTC\" | head 50"}'
kmeans
Cluster data points using the k-means algorithm.
Syntax: kmeans [centroids=<N>] [iterations=<N>] [distance_type=<type>]
Note: The
kmeanscommand does not take positional field arguments. It operates on all numeric fields from the preceding pipeline output. Usefieldsorevalbeforekmeansto control which numeric fields are used for clustering.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where durationInNanos > 0 | fields traceId, serviceName, durationInNanos | kmeans centroids=3 | fields traceId, serviceName, durationInNanos, ClusterID | head 30"}'
ml
General ML command for running machine learning algorithms on query results.
Syntax: ml action=<algorithm> [parameters...]
Supported algorithms include: kmeans, ad (anomaly detection).
Note:
ml action=rcfis not a valid action in OpenSearch 3.x PPL. Random Cut Forest anomaly detection is accessed via theadcommand directly (see theadsection above), not throughml action=rcf.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where durationInNanos > 0 | ml action=kmeans centroids=3 | head 50"}'
System Commands
describe
Inspect the index mapping and field types for an index.
Syntax: describe <index-pattern>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "describe otel-v1-apm-span-*"}'
explain
Show the query execution plan (used via the _explain API endpoint rather than as an inline command).
Syntax: Use the /_plugins/_ppl/_explain endpoint with the query body.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl/_explain" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | stats count() by serviceName"}'
showdatasources
List all configured data sources available for PPL queries.
Syntax: show datasources
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "show datasources"}'
multisearch
Execute multiple PPL queries in a single request. Each query is independent.
Syntax: Use the /_plugins/_ppl endpoint with multiple queries separated by newlines (NDJSON format), or execute sequentially.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as total_spans"}'
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | stats count() as total_logs"}'
Display Commands
fieldformat
Format the display of a field's values without changing the underlying data.
Syntax: fieldformat <field> = <format-expression>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = durationInNanos / 1000000 | fieldformat duration_ms = CONCAT(CAST(duration_ms AS STRING), '\'' ms'\'') | fields traceId, serviceName, duration_ms | head 10"}'
Span Expression Syntax
The span() function buckets numeric or datetime values into intervals. Used with stats, timechart, and chart.
Syntax: span(<field>, <interval>)
Supported Time Units
| Unit | Description | Example |
|---|---|---|
ms |
Milliseconds | span(startTime, 500ms) |
s |
Seconds | span(startTime, 30s) |
m |
Minutes | span(startTime, 5m) |
h |
Hours | span(startTime, 1h) |
d |
Days | span(startTime, 1d) |
w |
Weeks | span(startTime, 1w) |
M |
Months | span(startTime, 1M) |
q |
Quarters | span(startTime, 1q) |
y |
Years | span(startTime, 1y) |
Numeric Spans
For numeric fields, the interval is a plain number:
stats count() by span(durationInNanos, 1000000000)
Example
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as span_count, avg(durationInNanos) as avg_duration by span(startTime, 1h)"}'
Timechart Rate Functions
Rate functions normalize aggregation values to a per-time-unit rate within timechart:
| Function | Description |
|---|---|
per_second(<agg>) |
Aggregation value per second |
per_minute(<agg>) |
Aggregation value per minute |
per_hour(<agg>) |
Aggregation value per hour |
per_day(<agg>) |
Aggregation value per day |
Example
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | timechart span=5m per_second(count()) as requests_per_sec"}'
Functions
Aggregation Functions
Used with stats, eventstats, streamstats, timechart, and chart commands.
| Function | Syntax | Description |
|---|---|---|
COUNT |
count() |
Count of events |
SUM |
sum(field) |
Sum of numeric values |
AVG |
avg(field) |
Arithmetic mean |
MAX |
max(field) |
Maximum value |
MIN |
min(field) |
Minimum value |
VAR_SAMP |
var_samp(field) |
Sample variance |
VAR_POP |
var_pop(field) |
Population variance |
STDDEV_SAMP |
stddev_samp(field) |
Sample standard deviation |
STDDEV_POP |
stddev_pop(field) |
Population standard deviation |
DISTINCT_COUNT |
distinct_count(field) |
Count of distinct values |
PERCENTILE |
percentile(field, pct) |
Value at the given percentile |
EARLIEST |
earliest(field) |
Earliest (first chronological) value |
LATEST |
latest(field) |
Latest (most recent) value |
LIST |
list(field) |
All values as a list |
VALUES |
values(field) |
Distinct values as a list |
FIRST |
first(field) |
First value encountered |
LAST |
last(field) |
Last value encountered |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as total, avg(durationInNanos) as avg_ns, percentile(durationInNanos, 95) as p95_ns, distinct_count(serviceName) as services"}'
Collection Functions
Functions for working with multi-value fields and arrays.
| Function | Syntax | Description |
|---|---|---|
ARRAY |
array(val1, val2, ...) |
Create an array from values |
SPLIT |
split(field, delimiter) |
Split a string into an array |
MVJOIN |
mvjoin(field, delimiter) |
Join multi-value field into a string |
MVCOUNT |
mvcount(field) |
Count of values in a multi-value field |
MVINDEX |
mvindex(field, index) |
Get value at index from multi-value field |
MVFIRST |
mvfirst(field) |
First value of a multi-value field |
MVLAST |
mvlast(field) |
Last value of a multi-value field |
MVAPPEND |
mvappend(field1, field2) |
Append two multi-value fields |
MVDEDUP |
mvdedup(field) |
Remove duplicates from multi-value field |
MVSORT |
mvsort(field) |
Sort values in a multi-value field |
MVZIP |
mvzip(field1, field2, delim) |
Zip two multi-value fields together |
MVRANGE |
mvrange(start, end, step) |
Generate a range of numeric values |
MVFILTER |
mvfilter(expression) |
Filter values in a multi-value field |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval tokens = array(`attributes.gen_ai.usage.input_tokens`, `attributes.gen_ai.usage.output_tokens`) | fields traceId, tokens | head 10"}'
Condition Functions
Functions for conditional logic and null handling.
| Function | Syntax | Description |
|---|---|---|
ISNULL |
isnull(field) |
Returns true if field is null |
ISNOTNULL |
isnotnull(field) |
Returns true if field is not null |
IF |
if(cond, true_val, false_val) |
Conditional expression |
IFNULL |
ifnull(field, default) |
Return default if field is null |
NULLIF |
nullif(val1, val2) |
Return null if val1 equals val2 |
CASE |
case(cond1, val1, cond2, val2, ..., else_val) |
Multi-branch conditional |
COALESCE |
coalesce(val1, val2, ...) |
First non-null value |
LIKE |
field LIKE 'pattern' |
Wildcard pattern match (% and _) |
IN |
field IN (val1, val2, ...) |
Check membership in a set |
BETWEEN |
field BETWEEN val1 AND val2 |
Range check (inclusive) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval status_label = case(`status.code` = 0, '\''UNSET'\'', `status.code` = 1, '\''OK'\'', `status.code` = 2, '\''ERROR'\'') | stats count() by status_label"}'
Conversion Functions
Functions for type casting and conversion.
| Function | Syntax | Description |
|---|---|---|
CAST |
cast(field AS type) |
Cast to a specified type (STRING, INT, LONG, FLOAT, DOUBLE, BOOLEAN, DATE, TIMESTAMP) |
TOSTRING |
tostring(field) |
Convert to string |
TONUMBER |
tonumber(field) |
Convert to number |
TOINT |
toint(field) |
Convert to integer |
TOLONG |
tolong(field) |
Convert to long |
TOFLOAT |
tofloat(field) |
Convert to float |
TODOUBLE |
todouble(field) |
Convert to double |
TOBOOLEAN |
toboolean(field) |
Convert to boolean |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = CAST(durationInNanos AS DOUBLE) / 1000000.0 | fields traceId, serviceName, duration_ms | sort - duration_ms | head 10"}'
Cryptographic Functions
Functions for computing hash digests.
| Function | Syntax | Description |
|---|---|---|
MD5 |
md5(field) |
MD5 hash of the value |
SHA1 |
sha1(field) |
SHA-1 hash of the value |
SHA2 |
sha2(field, numBits) |
SHA-2 hash (numBits: 224, 256, 384, 512) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval trace_hash = md5(traceId) | fields traceId, trace_hash | head 5"}'
Datetime Functions
Functions for date and time manipulation.
| Function | Syntax | Description |
|---|---|---|
NOW |
now() |
Current timestamp |
CURDATE |
curdate() |
Current date |
CURTIME |
curtime() |
Current time |
DATE_FORMAT |
date_format(date, fmt) |
Format a date (%Y-%m-%d %H:%i:%s) |
DATE_ADD |
date_add(date, INTERVAL n unit) |
Add interval to date |
DATE_SUB |
date_sub(date, INTERVAL n unit) |
Subtract interval from date |
DATEDIFF |
datediff(date1, date2) |
Difference in days between two dates |
DAY |
day(date) |
Day of month (1–31) |
MONTH |
month(date) |
Month (1–12) |
YEAR |
year(date) |
Year |
HOUR |
hour(time) |
Hour (0–23) |
MINUTE |
minute(time) |
Minute (0–59) |
SECOND |
second(time) |
Second (0–59) |
DAYOFWEEK |
dayofweek(date) |
Day of week (1=Sun, 7=Sat) |
DAYOFYEAR |
dayofyear(date) |
Day of year (1–366) |
WEEK |
week(date) |
Week number of the year |
UNIX_TIMESTAMP |
unix_timestamp(date) |
Convert to Unix epoch seconds |
FROM_UNIXTIME |
from_unixtime(epoch) |
Convert Unix epoch to timestamp |
TIMESTAMPADD |
timestampadd(unit, n, ts) |
Add interval to timestamp |
TIMESTAMPDIFF |
timestampdiff(unit, ts1, ts2) |
Difference between timestamps in given unit |
PERIOD_ADD |
period_add(period, n) |
Add months to a period (YYMM/YYYYMM) |
PERIOD_DIFF |
period_diff(p1, p2) |
Difference in months between periods |
MAKETIME |
maketime(h, m, s) |
Create a time value |
MAKEDATE |
makedate(year, dayofyear) |
Create a date from year and day-of-year |
ADDDATE |
adddate(date, INTERVAL n unit) |
Alias for DATE_ADD |
SUBDATE |
subdate(date, INTERVAL n unit) |
Alias for DATE_SUB |
SYSDATE |
sysdate() |
Current date and time (evaluated at execution) |
UTC_DATE |
utc_date() |
Current UTC date |
UTC_TIME |
utc_time() |
Current UTC time |
UTC_TIMESTAMP |
utc_timestamp() |
Current UTC timestamp |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where startTime > DATE_SUB(NOW(), INTERVAL 1 HOUR) | stats count() as recent_spans by serviceName"}'
Expressions
Operators for arithmetic, comparison, and logical expressions used in eval, where, and other commands.
Arithmetic Operators
| Operator | Description | Example |
|---|---|---|
+ |
Addition | eval total = input_tokens + output_tokens |
- |
Subtraction | eval gap = endTime - startTime |
* |
Multiplication | eval cost = tokens * price_per_token |
/ |
Division | eval duration_ms = durationInNanos / 1000000 |
Comparison Operators
| Operator | Description |
|---|---|
= |
Equal to |
!= or <> |
Not equal to |
< |
Less than |
> |
Greater than |
<= |
Less than or equal to |
>= |
Greater than or equal to |
Logical Operators
| Operator | Description |
|---|---|
AND |
Logical AND |
OR |
Logical OR |
NOT |
Logical NOT |
XOR |
Logical exclusive OR |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = durationInNanos / 1000000, total_tokens = `attributes.gen_ai.usage.input_tokens` + `attributes.gen_ai.usage.output_tokens` | where duration_ms > 1000 AND total_tokens > 0 | fields traceId, serviceName, duration_ms, total_tokens | head 10"}'
IP Functions
Functions for IP address operations.
| Function | Syntax | Description |
|---|---|---|
CIDRMATCH |
cidrmatch(ip_field, 'cidr') |
Check if IP is within a CIDR range |
GEOIP |
geoip(ip_field) |
Geo-locate an IP address (returns country, region, city, lat/lon) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where isnotnull(`attributes.net.peer.ip`) | where cidrmatch(`attributes.net.peer.ip`, '\''10.0.0.0/8'\'') | fields traceId, `attributes.net.peer.ip`, serviceName | head 10"}'
JSON Functions
Functions for working with JSON data.
| Function | Syntax | Description |
|---|---|---|
JSON_EXTRACT |
json_extract(field, path) |
Extract value at JSON path |
JSON_KEYS |
json_keys(field) |
Get all keys from a JSON object |
JSON_VALID |
json_valid(field) |
Check if value is valid JSON |
JSON_ARRAY |
json_array(val1, val2, ...) |
Create a JSON array |
JSON_OBJECT |
json_object(key1, val1, ...) |
Create a JSON object |
JSON_ARRAY_LENGTH |
json_array_length(field) |
Length of a JSON array |
JSON_EXTRACT_PATH_TEXT |
json_extract_path_text(field, path) |
Extract value as text from JSON path |
TO_JSON_STRING |
to_json_string(field) |
Convert value to JSON string |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where json_valid(`attributes.gen_ai.tool.call.arguments`) | eval tool_args = json_extract(`attributes.gen_ai.tool.call.arguments`, '\''$'\'') | fields traceId, `attributes.gen_ai.tool.name`, tool_args | head 10"}'
Math Functions
Functions for mathematical operations.
| Function | Syntax | Description |
|---|---|---|
ABS |
abs(val) |
Absolute value |
CEIL |
ceil(val) |
Round up to nearest integer |
FLOOR |
floor(val) |
Round down to nearest integer |
ROUND |
round(val [, decimals]) |
Round to N decimal places |
SQRT |
sqrt(val) |
Square root |
POW |
pow(base, exp) |
Exponentiation |
MOD |
mod(a, b) |
Modulo (remainder) |
LOG |
log(val) |
Natural logarithm |
LOG2 |
log2(val) |
Base-2 logarithm |
LOG10 |
log10(val) |
Base-10 logarithm |
LN |
ln(val) |
Natural logarithm (alias for LOG) |
EXP |
exp(val) |
e raised to the power of val |
SIGN |
sign(val) |
Sign of value (-1, 0, 1) |
TRUNCATE |
truncate(val, decimals) |
Truncate to N decimal places |
PI |
pi() |
Value of π |
E |
e() |
Value of Euler's number |
RAND |
rand([seed]) |
Random float between 0 and 1 |
ACOS |
acos(val) |
Arc cosine |
ASIN |
asin(val) |
Arc sine |
ATAN |
atan(val) |
Arc tangent |
ATAN2 |
atan2(y, x) |
Two-argument arc tangent |
COS |
cos(val) |
Cosine |
SIN |
sin(val) |
Sine |
TAN |
tan(val) |
Tangent |
COT |
cot(val) |
Cotangent |
DEGREES |
degrees(radians) |
Convert radians to degrees |
RADIANS |
radians(degrees) |
Convert degrees to radians |
CONV |
conv(val, from_base, to_base) |
Convert between number bases |
CRC32 |
crc32(val) |
CRC-32 checksum |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = round(durationInNanos / 1000000.0, 2) | where duration_ms > 0 | fields traceId, serviceName, duration_ms | sort - duration_ms | head 10"}'
Relevance Functions
Full-text search functions for relevance-based querying.
| Function | Syntax | Description |
|---|---|---|
MATCH |
match(field, query) |
Full-text match on a single field |
MATCH_PHRASE |
match_phrase(field, phrase) |
Exact phrase match |
MATCH_BOOL_PREFIX |
match_bool_prefix(field, query) |
Boolean prefix match |
MATCH_PHRASE_PREFIX |
match_phrase_prefix(field, prefix) |
Phrase prefix match |
MULTI_MATCH |
multi_match([field1, field2], query) |
Match across multiple fields |
QUERY_STRING |
query_string([field1, field2], query) |
Lucene query string syntax |
SIMPLE_QUERY_STRING |
simple_query_string([field1, field2], query) |
Simplified query string |
HIGHLIGHT |
highlight(field) |
Return highlighted matching fragments |
SCORE |
score(relevance_func) |
Return relevance score |
SCOREQUERY |
scorequery(relevance_func) |
Filter by relevance score |
MATCH_QUERY |
match_query(field, query) |
Alias for MATCH |
WILDCARD_QUERY |
wildcard_query(field, pattern) |
Wildcard pattern match (* and ?) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | where match(body, '\''timeout error'\'') | fields traceId, severityText, body | head 10"}'
Statistical Functions
Functions for computing statistical correlations and covariances.
| Function | Syntax | Description |
|---|---|---|
COVAR_POP |
covar_pop(field1, field2) |
Population covariance |
COVAR_SAMP |
covar_samp(field1, field2) |
Sample covariance |
Note:
corr()is not a recognized stats aggregation function in OpenSearch 3.x PPL. To approximate Pearson correlation, useevalwith manual calculation or compute covariance and standard deviations separately. Thecovar_sampandcovar_popfunctions are supported.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `attributes.gen_ai.usage.input_tokens` > 0 | stats covar_samp(`attributes.gen_ai.usage.input_tokens`, durationInNanos) as token_duration_covar"}'
String Functions
Functions for string manipulation.
| Function | Syntax | Description |
|---|---|---|
CONCAT |
concat(str1, str2, ...) |
Concatenate strings |
LENGTH |
length(str) |
String length in bytes |
LOWER |
lower(str) |
Convert to lowercase |
UPPER |
upper(str) |
Convert to uppercase |
TRIM |
trim(str) |
Remove leading/trailing whitespace |
LTRIM |
ltrim(str) |
Remove leading whitespace |
RTRIM |
rtrim(str) |
Remove trailing whitespace |
SUBSTRING |
substring(str, start [, len]) |
Extract substring |
LEFT |
left(str, len) |
Leftmost N characters |
RIGHT |
right(str, len) |
Rightmost N characters |
REPLACE |
replace(str, from, to) |
Replace occurrences |
REVERSE |
reverse(str) |
Reverse a string |
LOCATE |
locate(substr, str [, pos]) |
Position of substring |
POSITION |
position(substr IN str) |
Position of substring |
ASCII |
ascii(str) |
ASCII code of first character |
CHAR_LENGTH |
char_length(str) |
Character count |
CHARACTER_LENGTH |
character_length(str) |
Alias for CHAR_LENGTH |
OCTET_LENGTH |
octet_length(str) |
Byte count |
BIT_LENGTH |
bit_length(str) |
Bit count |
LPAD |
lpad(str, len, pad) |
Left-pad to length |
RPAD |
rpad(str, len, pad) |
Right-pad to length |
SPACE |
space(n) |
String of N spaces |
REPEAT |
repeat(str, n) |
Repeat string N times |
STRCMP |
strcmp(str1, str2) |
Compare strings (-1, 0, 1) |
SUBSTR |
substr(str, start [, len]) |
Alias for SUBSTRING |
MID |
mid(str, start, len) |
Alias for SUBSTRING |
FIELD |
field(str, val1, val2, ...) |
Index of str in value list |
FIND_IN_SET |
find_in_set(str, strlist) |
Position in comma-separated list |
FORMAT |
format(val, decimals) |
Format number with commas and decimals |
INSERT |
insert(str, pos, len, newstr) |
Insert string at position |
INSTR |
instr(str, substr) |
Position of first occurrence |
REGEXP |
regexp(str, pattern) |
Regex match (returns 1 or 0) |
REGEXP_EXTRACT |
regexp_extract(str, pattern [, group]) |
Extract regex capture group |
REGEXP_REPLACE |
regexp_replace(str, pattern, replacement) |
Replace regex matches |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | eval body_lower = lower(body) | where body_lower like '\''%exception%'\'' | eval short_body = left(body, 200) | fields traceId, severityText, short_body | head 10"}'
System Functions
| Function | Syntax | Description |
|---|---|---|
TYPEOF |
typeof(field) |
Returns the data type of a field value |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval type_of_duration = typeof(durationInNanos) | fields traceId, durationInNanos, type_of_duration | head 5"}'
Grammar Source
This PPL reference is sourced from the opensearch-project/sql repository's docs/user/ppl/ directory.
Repository: https://github.com/opensearch-project/sql
The PPL grammar is maintained as part of the OpenSearch SQL plugin. For the latest syntax additions and changes, consult the repository documentation directly.
References
- PPL Language Reference — Official PPL syntax documentation. Fetch this if queries fail due to OpenSearch version differences or new syntax.