Explore AI Agent Skills & Claude Prompts

Declare the pipeline from data source to predictor as a **skrub DataOps graph** (not as a bare `sklearn.Pipeline`). Every step is either a pure-Python function (stateless) attached via `.skb.apply_func`, or a sklearn-compatible estimator (stateful) attached via `.skb.apply`. Stops at the declared object: no fit, split, tuning, persistence, or evaluation.

TRIGGER, any of:

- Writing or editing code that declares any link in the chain *data source → predictor*: loaders, preprocessing, encoders / imputers / scalers, feature steps, composition objects (`Pipeline`, `ColumnTransformer`, skrub `tabular_pipeline`, `nn.Module`), or the final estimator.
- A pure-Python data-processing function destined for the pipeline path (cleans / derives / reshapes), whether wrapped via `FunctionTransformer`, `@skrub.deferred` / `skrub.var`, a custom `BaseEstimator` subclass, or just called in the training path before the estimator.
- A step is added, removed, swapped, or reordered inside an existing pipeline de

Updated 12 days ago

iterate-ml-experiment

Owns the iteration loop on top of an ML workspace: the `journal/JOURNAL.md` index and the per-experiment `journal/NN_short_name.md` design notes that must be drafted and approved by the user **before** `experiments/NN_short_name.py` is created. Drives the propose → iterate → approve → implement → record loop; dispatches to `iterate-from-skore` / `iterate-from-user` for sourcing.

TRIGGER, any of:

- A session opens in an ML workspace (whether or not `journal/` exists yet; missing/placeholder → bootstrap mode).
- User says "what's next", "resume", "where were we", "let's iterate", "propose next", "first baseline".
- About to create a new `experiments/NN_*.py` (the matching `journal/NN_*.md` must exist and be approved first).
- User wants to record an outcome from a finished run.
- User asks to compare past experiments or review what's been tried ("compare X and Y", "where are we?").

SKIP when: no `journal/` yet AND no workspace scaffold (route to `organize-ml-workspace`); the work is mechanical inside

Updated 12 days ago
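The ordering rule in the iterate-ml-experiment entry (a design note must exist before its experiment script is created) can be pictured as a small path check. This is only a sketch under assumptions: the function names are hypothetical, `NN` is read as a two-digit prefix, and because the description does not say how approval is recorded, the check covers only the existence of the note.

```python
import re
from pathlib import Path

# Assumed reading of "NN_short_name": two digits, underscore, lowercase slug.
EXPERIMENT_STEM = re.compile(r"^\d{2}_[a-z0-9_]+$")


def design_note_for(experiment: Path, workspace: Path) -> Path:
    """Map experiments/NN_short_name.py to journal/NN_short_name.md."""
    stem = experiment.stem
    if not EXPERIMENT_STEM.match(stem):
        raise ValueError(f"{experiment.name!r} does not look like NN_short_name.py")
    return workspace / "journal" / f"{stem}.md"


def may_create_experiment(experiment: Path, workspace: Path) -> bool:
    """True only when the matching design note is already on disk.

    Approval by the user is not modelled here (hypothetical gap in this sketch).
    """
    return design_note_for(experiment, workspace).is_file()
```

A caller would refuse to write `experiments/03_target_encoding.py` until `may_create_experiment(...)` returns `True` for it.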

iterate-from-skore

Source the next ML experiment proposal by **reading the audit digest** at `scratch/audit/<stem>/audit.md` (produced by `audit-ml-pipeline` at § 4 record-outcome). For every row in the digest's `## Checks summary` whose `severity` is `issue` or `tip`, follow the row's `documentation_url` to draft a Backlog row whose `Item` is the mitigation the docs recommend. The `## Metrics summary` provides context for the human summary paragraph but does not drive Backlog rows on its own. Returns the enriched Backlog rows + a one-paragraph summary back to `iterate-ml-experiment`, which writes the rows into `JOURNAL.md` and re-presents the sourcing menu so the user can promote a `B<N>` row. Stops at "Backlog enriched, summary returned"; never writes a per-experiment design note, never picks the "winning" finding — the user picks via `B<N>`. TRIGGER when: `iterate-ml-experiment` is picking a sourcing strategy and the user picks `skore` from the menu; the user says "mine the report", "what does skore see?", "fill the backlog

Updated 29 days ago
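The row filter described in the iterate-from-skore entry (keep `## Checks summary` rows whose `severity` is `issue` or `tip`) might look like the sketch below. It assumes the section is a Markdown pipe table; the description names the `severity` and `documentation_url` fields but not the layout, so the table format, the `check` column, the sample digest, and the function names are all hypothetical. Reading the linked documentation and drafting the mitigation text is not sketched.

```python
def parse_checks_summary(digest: str) -> list[dict]:
    """Collect the table rows found under the '## Checks summary' heading."""
    rows, header, in_section = [], None, False
    for line in digest.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Entering or leaving a second-level section resets the table state.
            in_section = stripped == "## Checks summary"
            header = None
            continue
        if not in_section or not stripped.startswith("|"):
            continue
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        if header is None:
            header = cells
        elif set("".join(cells)) <= set("-: "):
            continue  # the |---|---| separator row
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def rows_to_follow_up(rows: list[dict]) -> list[dict]:
    """Keep only the rows that should each produce a Backlog row."""
    return [row for row in rows if row.get("severity") in {"issue", "tip"}]


# Invented sample for illustration only; not real audit output.
SAMPLE_DIGEST = """\
## Metrics summary

| metric | value |
| --- | --- |
| placeholder_metric | n/a |

## Checks summary

| check | severity | documentation_url |
| --- | --- | --- |
| check_a | issue | https://example.invalid/docs/check_a |
| check_b | ok | https://example.invalid/docs/check_b |
| check_c | tip | https://example.invalid/docs/check_c |
"""

selected = rows_to_follow_up(parse_checks_summary(SAMPLE_DIGEST))
```

Rows from `## Metrics summary` are ignored by construction, which matches the statement that metrics give context but do not drive Backlog rows.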

iterate-from-user

Source the next ML experiment proposal from the user via one of three entry points selected by `AskUserQuestion`: (a) a scientific article URL the agent must read and synthesize, (b) a resource link or path (GitHub issue / spec file / reference repo), or (c) free-text the user types directly. In every branch, the agent reads the source, synthesizes its understanding of what to implement, and confirms with the user *before* returning the Proposal block. Hand the confirmed Proposal back to `iterate-ml-experiment`, which writes it into `journal/NN_short_name.md` and seeks the user's design-note approval. Stops at "Proposal returned, user-confirmed"; never writes a design note, never authors acceptance criteria. TRIGGER when: `iterate-ml-experiment` is picking a sourcing strategy and the user picks `user` from the menu; the user volunteers a concrete idea ("I want to try X"); the user pastes or links a scientific article, GitHub issue, spec file, or reference repo and asks us to read it. SKIP when: the user wants

Updated 1 month ago

explore-ml-data

Owns data understanding BEFORE any model is designed. Places and executes `data/eda.py` (a jupytext `# %%` script) via the shared in-process runner, reads the streamed digest, then writes a persisted `data/eda.md` report (plus linked `data/eda_<table>.html` skrub `TableReport` pages) and the `## Data understanding (EDA)` section of `journal/JOURNAL.md`. The point is to surface the dataset facts — shape, dtypes, missingness, cardinality, target balance / skew, datetime / group structure, feature associations — that JUSTIFY the later learner / splitter / metric decisions, so the user understands *why* the modelling choices are made. Uses `skrub.TableReport` for dataframe overviews and the shared runner `audit-ml-pipeline/scripts/run_cells.py`. Stops at "EDA executed, `data/eda.md` + HTML written, JOURNAL EDA section updated." Never designs the model, never edits `src/<pkg>/`, never modifies the user's raw data files. TRIGGER — any of: - `iterate-ml-experiment` § 0 bootstrap, BEFORE the baseline design note —

Updated 12 days ago
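For readers unfamiliar with the `# %%` convention mentioned in the explore-ml-data entry: such a script is an ordinary Python file whose cells are separated by comment lines starting with `# %%`. The sketch below shows one way a script like that could be split into cells and executed in a single shared namespace. It is not the actual `run_cells.py`, whose internals are not described here, and both function names are hypothetical.

```python
def split_cells(source: str) -> list[str]:
    """Split a percent-format script into cell bodies, dropping empty cells."""
    cells, current = [], []
    for line in source.splitlines():
        if line.startswith("# %%"):
            # A marker line closes the previous cell and opens a new one.
            if any(part.strip() for part in current):
                cells.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if any(part.strip() for part in current):
        cells.append("\n".join(current))
    return cells


def run_cells(source: str) -> dict:
    """Execute the cells in order, in-process, sharing one namespace."""
    namespace: dict = {}
    for cell in split_cells(source):
        # Markdown cells contain only comment lines, so executing them is a no-op.
        exec(compile(cell, "<cell>", "exec"), namespace)
    return namespace
```

Because every cell shares the same namespace, a later cell can use names defined by an earlier one, as in a notebook.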

evaluate-ml-pipeline

star 25

Methodology for evaluating a single sklearn-compatible learner (in particular, the `SkrubLearner` produced by `build-ml-pipeline`). Owns: which entry point to call (`skore.evaluate` first, the explicit report classes when needed), which cross-validator to pick from scikit-learn's catalogue, how to consume the structural metadata (`groups`, `times`, …) attached at build time via `.skb.mark_as_X(split_kwargs=...)`. Stops at "what does the report say". Defaults (metrics, plots) come from skore; only override on explicit user request. TRIGGER when: code calls `cross_val_score`, `cross_validate`, `classification_report`, or any handwritten metric print (`print(mean_squared_error(...))`); code calls `.skb.cross_validate(...)` (route through skore for richer output); user asks how to score, evaluate, or compare a single learner; user asks how to pick a cross-validator; user wants to see a report / metrics / diagnostic plots for a fitted learner. SKIP when: declaring the pipeline (use `build-ml-pipeline`); hyperparam

Updated 22 days ago

python-api

star 25

Look up the public API of a Python package against the *installed version* and cache what's worth keeping. Four shapes by question type: (0) cache hit under `scratch/api/<lib>/<version>/`; (1) `inspect.signature` + `pydoc.render_doc` for a symbol; (2) `dir` / `pkgutil.iter_modules` for a module surface; (3) WebSearch + WebFetch of versioned docs for narrative ("how", "which", "what does X return when Y"). Never write a symbol from training-data memory: recognition is not a lookup.

TRIGGER, any of:

- About to name a symbol (function / class / method / arg) in code.
- User asks "what's the signature of X?", "what's in module Y?", "how do I call X?", "which of A/B should I use?".
- User asks "what does X return when <condition>?" (Shape 3; see decision table).
- Another workflow skill (`build-ml-pipeline`, `evaluate-ml-pipeline`, `iterate-from-skore`, `smoke-test-ml-pipeline`) says "consult the API skill".
- About to reach for a library's "obvious" pattern from memory.

SKIP when: the signature is obvi

Updated 29 days ago
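Shapes (1) and (2) of the python-api entry, plus the versioned cache location from shape (0), can be sketched with the standard library alone. This is an illustration, not the skill's own code: the helper names are hypothetical, and falling back to the interpreter version for modules that have no installed distribution (such as stdlib modules) is an assumption made so the example runs anywhere.

```python
import importlib
import inspect
import pkgutil
import pydoc
import sys
from importlib import metadata
from pathlib import Path


def describe_symbol(dotted_name: str) -> dict:
    """Shape 1: signature and rendered docs, read from the installed code."""
    obj = pydoc.locate(dotted_name)
    if obj is None:
        raise LookupError(f"cannot resolve {dotted_name!r} in this environment")
    return {
        "signature": str(inspect.signature(obj)),
        "doc": pydoc.render_doc(obj, renderer=pydoc.plaintext),
    }


def public_names(module_name: str) -> list[str]:
    """Shape 2a: the non-underscore names a module exposes via dir()."""
    module = importlib.import_module(module_name)
    return [name for name in dir(module) if not name.startswith("_")]


def list_submodules(package_name: str) -> list[str]:
    """Shape 2b: the modules that sit directly under a package."""
    package = importlib.import_module(package_name)
    return sorted(info.name for info in pkgutil.iter_modules(package.__path__))


def installed_version(lib: str) -> str:
    try:
        return metadata.version(lib)
    except metadata.PackageNotFoundError:
        # Assumption: no distribution metadata, so use the interpreter version.
        return "{}.{}.{}".format(*sys.version_info[:3])


def cache_dir(lib: str, root: Path = Path("scratch/api")) -> Path:
    """Shape 0: where a cached lookup for this library version would live."""
    return root / lib / installed_version(lib)
```

For example, `describe_symbol("json.dumps")["signature"]` reports the parameters of the `json.dumps` that is actually importable in the current environment, which is the point of looking a symbol up instead of recalling it.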

sklearn-expert

star 7

This skill should be used whenever the user asks about machine learning with Python, scikit-learn, skrub, skore, tabular data pipelines, model selection, cross-validation, feature engineering, preprocessing, or any ML use case such as fraud detection, churn prediction, pricing models, sentiment analysis, anomaly detection, or survival analysis. Also trigger when the user is building a sklearn Pipeline, debugging an estimator, choosing between algorithms, evaluating a model, or working with dirty/heterogeneous tabular data. If there is any doubt, trigger this skill.