datafusion-docs


Search Apache DataFusion documentation, user guide, and API reference. Returns relevant documentation for a question or keyword. Searches the official DataFusion repository and website.

By datafusion-contrib. Updated 3/21/2026

name: datafusion-docs
description: >
  Search Apache DataFusion documentation, user guide, and API reference. Returns relevant documentation for a question or keyword. Searches the official DataFusion repository and website.
argument-hint:
allowed-tools: Bash

You are helping the user find relevant Apache DataFusion documentation.

Query: $@

Follow these steps in order.

Step 1 — Extract search terms

If the input is a natural language question (e.g. "how do I create an external table"), extract the key technical terms: nouns, function names, SQL keywords. Drop stop words.

If the input is already a function name or technical term (e.g. APPROX_PERCENTILE_CONT, CREATE EXTERNAL TABLE), use it as-is.

Use the extracted terms as SEARCH_QUERY in the next steps.
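As a rough illustration of the extraction described above, the following sketch drops a small set of stop words from a question. The helper name `extract_terms` and the stop-word list are assumptions made for this example; they are not part of the skill.

```shell
# Hypothetical helper: reduce a natural-language question to search terms.
# The stop-word list below is illustrative only.
extract_terms() {
  printf '%s\n' "$1" \
    | tr -s '[:space:]' '\n' \
    | tr -d '?"' \
    | grep -v -i -x -E 'how|do|does|i|a|an|the|to|in|of|is|are|what|can|for|with' \
    | paste -s -d ' ' -
}

# A question loses its stop words; a bare function name passes through unchanged.
SEARCH_QUERY=$(extract_terms "how do I create an external table")
echo "$SEARCH_QUERY"   # prints: create external table
```

A fixed stop-word list is a blunt tool; when the question is ambiguous, picking the terms by judgment is likely to work better than filtering mechanically.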

Step 2 — Search the DataFusion source documentation

The DataFusion user guide is in the GitHub repo under docs/.

Important: Do NOT quote multi-word search terms as a single string. Pass each word as a separate token so gh search code matches broadly. For example, use EXTERNAL TABLE, not "EXTERNAL TABLE".

Search the guide using gh:

gh search code $SEARCH_QUERY --repo apache/datafusion --language markdown --limit 10
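The quoting rule matters because of how the shell passes arguments. The sketch below, which assumes a POSIX sh or bash with the default IFS (zsh does not split unquoted variables by default), uses a hypothetical helper `count_args` to show the difference without calling gh at all.

```shell
# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

SEARCH_QUERY="EXTERNAL TABLE"

# Unquoted: the shell splits on whitespace, so a command sees two tokens.
count_args $SEARCH_QUERY     # prints: 2

# Quoted: the command sees one argument containing a space.
count_args "$SEARCH_QUERY"   # prints: 1
```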

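If gh search code is not available (for example, on an older gh release), fall back to the GitHub search API through gh api. Passing the query with -f lets gh URL-encode multi-word terms:

gh api -X GET search/code -f q="$SEARCH_QUERY repo:apache/datafusion extension:md" -f per_page=10 --jq '.items[].path'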

Step 3 — Search for SQL function documentation

DataFusion's built-in functions are documented in docs/source/user-guide/sql/. Check specifically:

gh search code "$SEARCH_QUERY" --repo apache/datafusion --language markdown --limit 5 -- path:docs/source/user-guide/sql/

Also list the available SQL doc files so you can fetch the most relevant one directly:

gh api "repos/apache/datafusion/contents/docs/source/user-guide/sql" --jq '.[].name' 2>/dev/null

Step 4 — Search for code examples

If the query is about API usage or implementation patterns, search Rust source code:

gh search code "$SEARCH_QUERY" --repo apache/datafusion --language rust --limit 5

Step 5 — Fetch relevant content

For the most relevant results (top 2-3), fetch the actual content:

gh api "repos/apache/datafusion/contents/<path>" --jq '.content' | base64 -d

If the file is too large to present in full, keep only the relevant section: look for the search terms in the content and extract the surrounding context (the heading plus the content under that heading).
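One way to pull out a heading and the content under it is a short awk filter over the decoded file. The sketch below is a minimal version under stated assumptions: the helper name `extract_section` is hypothetical, headings are assumed to be ATX-style lines starting with `#`, and the sample file is made up for the demonstration rather than taken from the DataFusion docs.

```shell
# Hypothetical helper: print the first markdown section that mentions TERM
# (case-insensitive), from its heading line up to the next heading.
extract_section() {
  term=$1
  file=$2
  awk -v term="$term" '
    function emit() { if (hit && !done) { printf "%s", buf; done = 1 } }
    /^#+ / { emit(); buf = ""; hit = 0 }   # a new heading closes the previous section
    { buf = buf $0 "\n"; if (index(tolower($0), tolower(term)) > 0) hit = 1 }
    END { emit() }
  ' "$file"
}

# Made-up sample document for demonstration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# SQL Reference

## SELECT

Basic queries.

## CREATE EXTERNAL TABLE

Registers a file as a table.

## DROP TABLE

Removes a table.
EOF

extract_section "external table" "$sample"
```

In practice the input file would be the decoded output of the gh api call above, saved to a temporary file. Note that a line beginning with `# ` inside a code block would be mistaken for a heading, so treat the result as a starting point rather than an exact cut.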

Step 6 — Present findings

Organize the results by relevance:

  1. Most relevant: Direct documentation for the queried topic
  2. Examples: Code examples showing usage
  3. Related: Related documentation that might be helpful

For each result, provide:

  • The document title or section heading
  • A brief summary of what it covers
  • The source URL (GitHub link to the file)
  • Key code snippets if applicable
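For the source URL, a repo-relative path from the search results can be turned into a GitHub web link. This is a small sketch: the helper name `doc_url` is hypothetical, and it assumes the file should be linked on the `main` branch unless another ref is given.

```shell
# Hypothetical helper: turn a repo-relative path into a GitHub web link.
# Assumes the "main" branch; pass another ref as the second argument.
doc_url() {
  path=$1
  ref=${2:-main}
  printf 'https://github.com/apache/datafusion/blob/%s/%s\n' "$ref" "$path"
}

doc_url docs/source/user-guide/sql/ddl.md
# prints: https://github.com/apache/datafusion/blob/main/docs/source/user-guide/sql/ddl.md
```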

Step 7 — Suggest follow-ups

If the search didn't find exactly what the user needed:

You can also check the DataFusion user guide at https://datafusion.apache.org/user-guide/ or the API docs at https://docs.rs/datafusion/latest/datafusion/

If the query may relate to a configuration setting (for example, one that changes how a SQL function behaves):

Try running datafusion-cli -c "SELECT * FROM information_schema.df_settings WHERE name LIKE '%<keyword>%';" to see related configuration options.

Quick reference — Common DataFusion topics

For faster lookups, here are paths to key documentation sections:

| Topic | Path in repo |
| --- | --- |
| SQL Reference | docs/source/user-guide/sql/ |
| Scalar Functions | docs/source/user-guide/sql/scalar_functions.md |
| Aggregate Functions | docs/source/user-guide/sql/aggregate_functions.md |
| Window Functions | docs/source/user-guide/sql/window_functions.md |
| CREATE EXTERNAL TABLE | docs/source/user-guide/sql/ddl.md |
| Data Types | docs/source/user-guide/sql/data_types.md |
| Configuration | docs/source/user-guide/configs.md |
| Python Bindings | docs/source/user-guide/python/ |
| Library Usage | docs/source/library-user-guide/ |
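
The quick reference can also be used as a shortcut before searching. The sketch below maps a few topic keywords to the file paths listed in the table; the helper name `quick_path` and the keyword patterns are assumptions for illustration, and only the single-file entries are covered.

```shell
# Hypothetical helper: map a topic keyword to a path from the quick reference.
# Returns non-zero when no entry matches, so the caller can fall back to search.
quick_path() {
  case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
    *scalar*)           echo docs/source/user-guide/sql/scalar_functions.md ;;
    *aggregate*)        echo docs/source/user-guide/sql/aggregate_functions.md ;;
    *window*)           echo docs/source/user-guide/sql/window_functions.md ;;
    *"external table"*) echo docs/source/user-guide/sql/ddl.md ;;
    *"data type"*)      echo docs/source/user-guide/sql/data_types.md ;;
    *config*)           echo docs/source/user-guide/configs.md ;;
    *) return 1 ;;
  esac
}

quick_path "CREATE EXTERNAL TABLE"
# prints: docs/source/user-guide/sql/ddl.md
```
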
Install via CLI
npx skills add https://github.com/datafusion-contrib/datafusion-skills --skill datafusion-docs
Repository Details
Stars: 13
Forks: 0
Branch: main
Path: SKILL.md