data-profiler


Profile a dataset and output stats, distributions, types, and missingness. Use when data_analyst needs a structured profile for schema suggestion and EDA.

By neo-onyx. Updated 2/22/2026.

name: data-profiler
description: Profile a dataset and output stats, distributions, types, and missingness. Use when data_analyst needs a structured profile for schema suggestion and EDA.

Data Profiler

Produce a structured profile of a dataset.

Role

You analyze the dataset and output a machine- and human-readable profile.

Input

  • Path to dataset (CSV, Parquet, or similar; or path to a directory of files)
  • Optional: target column name, max rows to sample

Output

Write the profile to the path provided by the caller (e.g. output/{project-slug}/data/profile.json).

Contents

  • Overview: row count, column count, file size or sample size
  • Per column:
    • Name, inferred type (numeric, categorical, datetime, text)
    • Missing count and percentage
    • Unique count
    • For numerics: min, max, mean, std, quartiles
    • For categoricals: top values and counts
    • Sample values (e.g. first 5 distinct)
  • Quality flags: e.g. high missing, zero variance, likely ID column
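The per-column fields and quality flags above could be computed roughly as follows. This is a minimal sketch using only the Python standard library, not the skill's actual implementation. The names `profile_column` and `quality_flags`, the missing-value tokens, the output key names, and the flag thresholds are all assumptions made for illustration. It only separates numeric from categorical columns; datetime and text detection are left out.

```python
import statistics
from collections import Counter

# Assumed tokens that count as missing; adjust to the dataset at hand.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null", "none"}


def is_missing(value):
    return value is None or str(value).strip().lower() in MISSING_TOKENS


def to_float(value):
    """Return value as a float, or None if it does not parse as a number."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def profile_column(name, values, top_n=3, sample_n=5):
    """Profile one column given its raw values (e.g. strings read from a CSV)."""
    present = [v for v in values if not is_missing(v)]
    missing = len(values) - len(present)
    numbers = [to_float(v) for v in present]
    is_numeric = bool(present) and all(n is not None for n in numbers)

    profile = {
        "name": name,
        "type": "numeric" if is_numeric else "categorical",
        "missing_count": missing,
        "missing_pct": round(100 * missing / len(values), 2) if values else 0.0,
        "unique_count": len(set(present)),
        # dict.fromkeys keeps first-seen order, so these are the first distinct values.
        "sample_values": list(dict.fromkeys(present))[:sample_n],
    }
    if is_numeric:
        profile.update(
            min=min(numbers),
            max=max(numbers),
            mean=statistics.fmean(numbers),
            # Sample standard deviation needs at least two points.
            std=statistics.stdev(numbers) if len(numbers) > 1 else 0.0,
            quartiles=(statistics.quantiles(numbers, n=4, method="inclusive")
                       if len(numbers) > 1 else [numbers[0]] * 3),
        )
    else:
        profile["top_values"] = [
            {"value": v, "count": c} for v, c in Counter(present).most_common(top_n)
        ]
    return profile


def quality_flags(profile, row_count, high_missing_pct=50.0):
    """Derive quality flags from a column profile (thresholds are assumptions)."""
    flags = []
    if profile["missing_pct"] > high_missing_pct:
        flags.append("high_missing")
    if profile["unique_count"] <= 1:
        flags.append("zero_variance")
    # Every row distinct and none missing: the column may be an identifier.
    if row_count > 1 and profile["unique_count"] == row_count:
        flags.append("likely_id")
    return flags
```

For example, `profile_column("age", ["1", "2", "3", "4", "5", ""])` reports one missing value and a numeric type. The flag rules are deliberately simple heuristics, so a caller would likely want to tune them per dataset.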

Use JSON or structured markdown, as the caller specifies. If you cannot read the file directly, output a template and instruct the caller to run a profiler (e.g. ydata-profiling, formerly pandas-profiling, or great_expectations) and attach the result.
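As an illustration of the JSON option and the template fallback, the sketch below builds an empty profile skeleton and writes a profile to a caller-supplied path. The skill does not define a schema, so the key names here (`overview`, `columns`, `quality_flags`) and the helper names `build_template` and `write_profile` are hypothetical.

```python
import json
from pathlib import Path


def build_template(column_names):
    """Return an empty profile skeleton for the caller to fill in (assumed layout)."""
    return {
        "overview": {
            "row_count": None,
            "column_count": len(column_names),
            "sample_size": None,
        },
        "columns": [
            {
                "name": name,
                "type": None,
                "missing_count": None,
                "missing_pct": None,
                "unique_count": None,
                "sample_values": [],
            }
            for name in column_names
        ],
        "quality_flags": [],
    }


def write_profile(profile, output_path):
    """Write the profile as JSON, creating parent directories if needed."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(profile, indent=2), encoding="utf-8")
    return path
```

Only the output file is written; the dataset itself is never opened for writing.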

Rules

  • Do not modify the original data.
  • If the dataset is large, profile a sample and document the sampling strategy (e.g. first N rows, random sample).
  • The caller provides the output path and format preference.
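One way to follow the sampling rule is to draw a seeded random sample in a single pass and record how it was drawn alongside the profile. The sketch below uses reservoir sampling (Algorithm R), which is a choice made here rather than something the skill prescribes. The function name `reservoir_sample` and the keys of the returned `sampling` record are assumptions.

```python
import random


def reservoir_sample(rows, max_rows, seed=0):
    """Return (sample, sampling_info) for any iterable of rows.

    Reads the input once and never writes to it, so the original data
    stays untouched.
    """
    rng = random.Random(seed)  # fixed seed makes the sample reproducible
    sample = []
    seen = 0
    for row in rows:
        seen += 1
        if len(sample) < max_rows:
            sample.append(row)
        else:
            # Keep the new row with probability max_rows / seen.
            j = rng.randrange(seen)
            if j < max_rows:
                sample[j] = row
    sampled = seen > max_rows
    sampling_info = {
        "strategy": "reservoir_random_sample" if sampled else "full",
        "rows_seen": seen,
        "rows_sampled": len(sample),
        "seed": seed if sampled else None,
    }
    return sample, sampling_info
```

With a CSV file, `rows` could be a `csv.DictReader` over a file opened in read mode, and `sampling_info` can be stored in the profile's overview so readers know the statistics come from a sample.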

Install via CLI

npx skills add https://github.com/neo-onyx/openclaw-skills --skill data-profiler

Repository Details

  • Stars: 1
  • Forks: 2
  • Branch: main
  • Path: SKILL.md