kaggle - SKILL.md Agent Skill

name: kaggle
description: "Generate a Kaggle competition notebook as a Jupytext `# %%` Python script following the user's established ML research style: PTL for DNN training, best-fit tool selection, EDA→Baseline→Train→Inference pipeline with per-stage lens cells. Writes output to .experiments/kaggle/.py."
argument-hint: " [] [--type classification|regression|segmentation|detection|tabular] [--eda-only] [--inference-only] [--offline-setup] [--resume <existing.py>]"
allowed-tools: Read, Write, Edit, Bash, Grep, Glob, Agent, WebFetch, WebSearch, AskUserQuestion, TaskCreate, TaskUpdate, TaskList
disable-model-invocation: true
effort: high

Generate a Kaggle competition notebook script in Jupytext # %% format.

Follows the user's ML research style distilled from past notebooks:

- PTL always for DNN training (PyTorch Lightning + torchmetrics), even for simple baselines
- Tool agnostic: pick the best-fit library for the problem; use PTL when a training loop is needed
- Stages with lenses: each major stage includes a quick sanity-check cell (show one batch, print shapes, verify submission format)
- ! bash over subprocess: package installs, nvidia-smi, ls -lh, ! head submission.csv
- EDA is visual: distribution plots, sample grids, dimension scatters before any model
- Inference included: model save pattern plus separate load-and-infer cells
- CSVLogger + seaborn: metrics plotted from metrics.csv after every training run

NOT for writing Python packages, modules, or production code — notebook scripts only. NOT a research literature survey — use /research:topic for SOTA literature search.

$ARGUMENTS: one of:
- <competition-name> — short slug used for output filename; generates blank template
- <competition-name> <url> — fetches competition overview from URL before generating
- <competition-name> "<description>" — uses inline description of problem and data
- --type <type> — hint: classification, regression, segmentation, detection, tabular (auto-detected when omitted)
- --eda-only — generate only EDA sections (no model/training/submission); always online (no offline setup)
- --inference-only — generate inference notebook from checkpoint (no EDA, no training); always offline (frozen packages pattern); loads checkpoint from PATH_CHECKPOINT constant; output suffix -inference.py
- --offline-setup — include offline package setup (frozen_packages pattern) in setup cell; auto-applied when --inference-only; ignored when --eda-only (EDA always online)
- --resume <path> — read existing .py script and extend/improve it

Output: .experiments/kaggle/<competition-name>.py

OUTPUT_DIR:    .experiments/kaggle/
STYLE_GUIDE:   .temp/kaggle-style-distill.md   # auto-generated by /kaggle first run; regenerated if missing
CELL_MARK:     "# %%"
MD_CELL_MARK:  "# %% [markdown]"

Task hygiene: call TaskList first; close orphaned tasks. Create tasks for each phase.

Step 1: Parse arguments and gather context

ARGS="$ARGUMENTS"
COMPETITION_NAME=$(echo "$ARGS" | awk '{print $1}')
RESUME_FLAG=""
EDA_ONLY=false
INFERENCE_ONLY=false
OFFLINE_SETUP=false
PROBLEM_TYPE=""

[[ "$ARGS" == *"--eda-only"* ]]      && EDA_ONLY=true
[[ "$ARGS" == *"--inference-only"* ]] && INFERENCE_ONLY=true
[[ "$ARGS" == *"--offline-setup"* ]]  && OFFLINE_SETUP=true
[[ "$ARGS" =~ --type[[:space:]]([a-z]+) ]] && PROBLEM_TYPE="${BASH_REMATCH[1]}"
[[ "$ARGS" =~ --resume[[:space:]]([^[:space:]]+) ]] && RESUME_FLAG="${BASH_REMATCH[1]}"

# inference always offline; EDA always online (overrides --offline-setup)
[ "$INFERENCE_ONLY" = "true" ] && OFFLINE_SETUP=true
[ "$EDA_ONLY" = "true" ]       && OFFLINE_SETUP=false

echo "Competition: $COMPETITION_NAME"
echo "Type: ${PROBLEM_TYPE:-auto-detect}"
echo "EDA only: $EDA_ONLY | Inference only: $INFERENCE_ONLY | Offline setup: $OFFLINE_SETUP"

# Persist for Step 4 (bash state lost across Bash() calls)
echo "$COMPETITION_NAME" > "${TMPDIR:-/tmp}/kaggle-competition-name"
echo "$INFERENCE_ONLY"   > "${TMPDIR:-/tmp}/kaggle-inference-only"
echo "$OFFLINE_SETUP"    > "${TMPDIR:-/tmp}/kaggle-offline-setup"

mkdir -p .experiments/kaggle/  # timeout: 3000

Unsupported flag check — scan $ARGUMENTS for remaining --<token> tokens after supported flags extracted (--eda-only, --inference-only, --offline-setup, --type, --resume). If found: print ! Unknown flag(s): `--<token>`. Supported: `--eda-only`, `--inference-only`, `--offline-setup`, `--type <type>`, `--resume <path>`. then invoke AskUserQuestion — (a) Abort · (b) Continue ignoring. On Abort: stop.
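
A minimal Python sketch of the unsupported flag check, assuming arguments are whitespace-separated with shell-style quoting. `find_unknown_flags` and `format_warning` are hypothetical helpers for illustration, not part of the skill; the supported flag set and the warning text are taken from the paragraph above.

```python
import shlex

# Flags listed as supported in this skill; --type and --resume take a value.
SUPPORTED_FLAGS = {"--eda-only", "--inference-only", "--offline-setup", "--type", "--resume"}


def find_unknown_flags(arguments: str) -> list[str]:
    """Return every --token in `arguments` that is not a supported flag."""
    unknown = []
    # shlex keeps a quoted description as one token, so a "--" inside
    # the description text is not mistaken for a flag.
    for token in shlex.split(arguments):
        if token.startswith("--") and token not in SUPPORTED_FLAGS:
            unknown.append(token)
    return unknown


def format_warning(unknown: list[str]) -> str:
    """Build the warning line printed before asking Abort / Continue."""
    listed = ", ".join(f"`{flag}`" for flag in unknown)
    return (
        f"! Unknown flag(s): {listed}. Supported: `--eda-only`, `--inference-only`, "
        "`--offline-setup`, `--type <type>`, `--resume <path>`."
    )
```

An empty result means no question needs to be asked; a non-empty one feeds the warning and the Abort / Continue gate.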

Context collection — run in parallel:

- Check whether the style guide exists at .temp/kaggle-style-distill.md; read it if present
- If a URL is provided in args: WebFetch the competition page; extract problem description, target metric, data format, evaluation. Read and quote actual text, never paraphrase from training knowledge
- If --resume: read the existing script (Read tool)
- Scan .experiments/kaggle/ (Glob pattern *.py) for prior scripts; read the first 30 lines of each to find similar past competitions and use them as structural reference

Grounding protocol — mandatory before Step 2:

Build a fact table. Each fact must have a source: [fetched], [user], [past-notebook:<file>], or [inferred-from:<fact>]. Never mark a fact [inferred] without citing the prior fact it derives from.

| Fact | Value | Source |
| --- | --- | --- |
| problem_type | ? | ? |
| input_modality | ? | ? |
| output_format | ? | ? |
| eval_metric | ? | ? |
| data schema (CSV columns / image format) | ? | ? |
| submission format | ? | ? |
Gaps — ask before generating:

After building the fact table, count the facts still marked ? or marked [inferred] without citing a prior grounded fact. If ANY of these required facts is unknown:

- input_modality: cannot generate Dataset class
- eval_metric: cannot choose torchmetric
- submission format: cannot generate Submission section

Invoke AskUserQuestion with up to 4 questions covering all unknown required facts. Never guess or hallucinate competition-specific details (column names, file paths, data schema). State "unknown — will use placeholder" if user skips.

Acknowledge past-notebook similarity explicitly: "Found similar past notebook: <file> — reusing <pattern> from it."
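
The grounding check can be sketched in Python as follows. This is an illustration under stated assumptions: the `Fact` dataclass, `is_grounded`, and `missing_required` are hypothetical names, and the table is assumed to be a dict keyed by fact name. The source tags and the three required facts come from the protocol above.

```python
from dataclasses import dataclass

# Required facts named in the "Gaps" rule above.
REQUIRED_FACTS = ("input_modality", "eval_metric", "submission format")
# Source tags that count as directly grounded.
GROUNDED_PREFIXES = ("[fetched]", "[user]", "[past-notebook:")
INFERRED_PREFIX = "[inferred-from:"


@dataclass
class Fact:
    name: str
    value: str = "?"
    source: str = "?"


def is_grounded(name: str, table: dict, _seen: frozenset = frozenset()) -> bool:
    """A fact is grounded if sourced directly, or inferred from a grounded fact."""
    fact = table.get(name)
    if fact is None or fact.value == "?" or name in _seen:
        return False  # missing, unknown, or a circular inference chain
    if fact.source.startswith(GROUNDED_PREFIXES):
        return True
    if fact.source.startswith(INFERRED_PREFIX) and fact.source.endswith("]"):
        parent = fact.source[len(INFERRED_PREFIX):-1]
        return is_grounded(parent, table, _seen | {name})
    return False  # a bare [inferred] with no cited fact does not count


def missing_required(table: dict) -> list[str]:
    """Required facts that must be asked about before generating."""
    return [name for name in REQUIRED_FACTS if not is_grounded(name, table)]
```

Any name returned by `missing_required` becomes one of the questions passed to AskUserQuestion.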

Step 2: Determine problem profile

From gathered context, determine:

| Property | Value |
| --- | --- |
| `problem_type` | classification / regression / segmentation / detection / tabular |
| `input_modality` | image-2d / image-3d / tabular / time-series / point-cloud / mixed |
| `output_format` | label / scalar / mask / bboxes / rle |
| `eval_metric` | AUC / F1 / RMSE / Dice / IoU / mAP / ... |
| `recommended_model` | see the model selection rules below |
| `use_ptl` | true if DNN training; false for pure XGBoost/sklearn pipelines |

Model selection rules (pick best-fit, not default):

- Image classification → timm.create_model (EfficientNetV2, ConvNeXt, ViT-B) + PTL
- Image regression → timm.create_model backbone (num_classes=0) + PTL regression head
- Image segmentation → segmentation_models_pytorch (UNet/UNet++) + PTL; MONAI for 3D
- Object detection → torchvision.models.detection or ultralytics YOLO + PTL wrapper if needed
- Tabular → xgboost.XGBClassifier/Regressor with sklearn Pipeline; PTL only if DNN features needed
- Point cloud → MONAI or pytorch3d; PTL always
- Time series → torch.nn.LSTM or tsfresh features + XGBoost; PTL when DNN

PTL rule: use PTL whenever a training loop is needed — even for simple single-layer models. Exception: pure sklearn/XGBoost pipelines with no neural network component.
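
The selection rules and the PTL rule can be read as a lookup plus one override. The sketch below is illustrative only: the task keys, the `Recommendation` tuple, and the `neural` override are hypothetical, and the per-task defaults are one possible reading of the rules above rather than a fixed contract.

```python
from typing import NamedTuple, Optional


class Recommendation(NamedTuple):
    model: str
    use_ptl: bool


# task -> (recommended model, whether a neural training loop is assumed by default)
RULES = {
    "image-classification": ("timm.create_model (EfficientNetV2, ConvNeXt, ViT-B)", True),
    "image-regression": ("timm.create_model backbone (num_classes=0) + regression head", True),
    "image-segmentation": ("segmentation_models_pytorch (UNet/UNet++); MONAI for 3D", True),
    "object-detection": ("torchvision.models.detection or ultralytics YOLO", True),
    "tabular": ("xgboost.XGBClassifier/Regressor with sklearn Pipeline", False),
    "point-cloud": ("MONAI or pytorch3d", True),
    "time-series": ("torch.nn.LSTM or tsfresh features + XGBoost", True),
}


def recommend(task: str, neural: Optional[bool] = None) -> Recommendation:
    """Look up the model for a task; `neural` overrides the default PTL decision."""
    model, default_neural = RULES[task]
    # PTL rule: use PTL whenever a neural training loop is needed.
    use_ptl = default_neural if neural is None else neural
    return Recommendation(model, use_ptl)
```

For example, `recommend("tabular")` leaves PTL off, while `recommend("tabular", neural=True)` turns it on for the case where DNN features are needed.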

Step 3: Generate notebook script

Foundry availability check — verify before spawning:

FOUNDRY_AVAILABLE=$(ls -td ~/.claude/plugins/cache/borda-ai-rig/foundry/*/agents/sw-engineer.md 2>/dev/null | head -1)  # timeout: 5000
[ -z "$FOUNDRY_AVAILABLE" ] && { printf "⚠ foundry plugin not available — kaggle notebook generation requires foundry:sw-engineer\nInstall: claude plugin install foundry@borda-ai-rig\n"; exit 1; }

The spawn prompt is assembled from the inline problem profile (below) plus the section template loaded from the appropriate mode file:

_KAGGLE_MODES="${CLAUDE_PLUGIN_ROOT:-plugins/research}/skills/kaggle/modes"
TEMPLATE_FILE="$_KAGGLE_MODES/full.md"
[ "$EDA_ONLY" = "true" ] && TEMPLATE_FILE="$_KAGGLE_MODES/eda-only.md"
[ "$INFERENCE_ONLY" = "true" ] && TEMPLATE_FILE="$_KAGGLE_MODES/inference-only.md"

# Derive output filename from mode — must match template contract before spawning
OUTPUT_SUFFIX=""
[ "$INFERENCE_ONLY" = "true" ] && OUTPUT_SUFFIX="-inference"
OUTFILE=".experiments/kaggle/${COMPETITION_NAME}${OUTPUT_SUFFIX}.py"
echo "Output: $OUTFILE"

loads: full.md loads: eda-only.md loads: inference-only.md

Read $TEMPLATE_FILE — contains the required sections template. Pass to foundry:sw-engineer as continuation of the spawn prompt after the problem profile block below.

Spawn foundry:sw-engineer with this prompt preamble (inline, then continue with content from $TEMPLATE_FILE):

Write a complete Kaggle competition notebook script to `<OUTFILE>` (substitute expanded path from bash block above).

Format: Jupytext `# %%` Python script — every cell separated by `# %%` (code) or `# %% [markdown]` (markdown).

## Problem profile
- Competition: <competition-name>
- Problem type: <problem_type>
- Input: <input_modality>
- Output: <output_format>
- Metric: <eval_metric>
- Model: <recommended_model>
- Use PTL: <use_ptl>
- Description: <competition description if available>

[Continue with section template from $TEMPLATE_FILE]

Health monitoring (CLAUDE.md §6 defaults: 5-min poll, 15-min cutoff, +5-min extension): After spawning foundry:sw-engineer:

KAGGLE_CHECKPOINT="${TMPDIR:-/tmp}/kaggle-check-$(date +%s)"
touch "$KAGGLE_CHECKPOINT"  # timeout: 3000

Poll every 5 min:

NEW_FILES=$(find ".experiments/kaggle" -newer "$KAGGLE_CHECKPOINT" -type f 2>/dev/null | wc -l)  # timeout: 5000
echo "Activity check: $NEW_FILES new files since checkpoint"

On hard cutoff: read partial output, surface with ⏱ marker — do not silently omit.
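
For reference, the activity check performed by the `find ... -newer` command above can be sketched in Python. `count_new_files` is a hypothetical helper; it assumes the checkpoint is represented by its modification time in seconds, and it does not model the cutoff or extension policy.

```python
import os


def count_new_files(root: str, checkpoint_mtime: float) -> int:
    """Count regular files under `root` modified strictly after the checkpoint time."""
    count = 0
    # os.walk yields nothing for a missing directory, so the result is 0
    # in that case, similar to silencing find's errors.
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.islink(path):
                continue  # `find -type f` does not match symlinks
            try:
                if os.path.getmtime(path) > checkpoint_mtime:
                    count += 1
            except OSError:
                continue  # file disappeared between listing and stat
    return count
```

A count of zero across consecutive polls is the signal that the spawned agent may have stalled.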

Step 4: Verify and report

After agent completes:

- Read the first 30 lines of the generated file to verify # %% structure
- Count cell markers: grep -c "^# %%" .experiments/kaggle/<name>.py
- Check all required sections are present: grep "^# %% \[markdown\]" <file>

# Re-derive OUTFILE from flags persisted in Step 1 (bash state lost between steps)
COMPETITION_NAME=$(cat "${TMPDIR:-/tmp}/kaggle-competition-name" 2>/dev/null || echo "$COMPETITION_NAME")
INFERENCE_ONLY=$(cat "${TMPDIR:-/tmp}/kaggle-inference-only" 2>/dev/null || echo "false")
OUTPUT_SUFFIX=""; [ "$INFERENCE_ONLY" = "true" ] && OUTPUT_SUFFIX="-inference"
OUTFILE=".experiments/kaggle/${COMPETITION_NAME}${OUTPUT_SUFFIX}.py"
echo "=== Cell count ==="; grep -c "^# %%" "$OUTFILE"  # timeout: 5000
echo "=== Sections ===";   grep "^# %% \[markdown\]" "$OUTFILE"  # timeout: 5000
echo "=== File size ===";  wc -l "$OUTFILE"  # timeout: 5000

Print to terminal:

- Output path ($OUTFILE)
- Problem type + recommended model
- Cell count and section list
- Any missing required sections, flagged with ⚠
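
The cell count, section list, and missing-section flags in the report above could be derived as in this Python sketch. `summarize_script` is a hypothetical helper. The required section names are placeholders, since the real list lives in the mode templates, and the sketch assumes each markdown cell opens with its heading line.

```python
def summarize_script(text: str, required_sections: list[str]) -> dict:
    """Count `# %%` cells, collect markdown section titles, and flag missing ones."""
    lines = text.splitlines()
    cell_count = sum(1 for line in lines if line.startswith("# %%"))

    sections = []
    for index, line in enumerate(lines):
        if not line.startswith("# %% [markdown]"):
            continue
        # Use the first non-empty line of the markdown cell as its title.
        for following in lines[index + 1:]:
            if following.startswith("# %%"):
                break  # next cell started before any title was found
            title = following.lstrip("# ").strip()
            if title:
                sections.append(title)
                break

    # Case-insensitive substring match against the collected titles.
    missing = [
        name for name in required_sections
        if not any(name.lower() in title.lower() for title in sections)
    ]
    return {"cell_count": cell_count, "sections": sections, "missing": missing}
```

Each entry in `missing` would be printed with the ⚠ marker.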

Invoke AskUserQuestion as follow-up gate:

- (a) Open in editor: ! code $OUTFILE
- (b) Extend with additional sections
- (c) Regenerate with different model/approach
- (d) Done

On (a): run ! code "$OUTFILE" via Bash. On (b): re-enter Step 3 with extension directive. On (c): re-enter Step 2 with user-specified changes.

- # %% format: Jupytext percent format (not the light format, which uses # + and # - markers). Compatible with the VS Code Jupyter extension, JupyterLab, and jupytext --to notebook <file>.py. Each # %% starts a new code cell; # %% [markdown] starts a markdown cell where lines prefixed with # are the markdown content.
- ! and % commands, NEVER convert: write ! cmd and %matplotlib inline verbatim in output scripts. Do NOT convert to get_ipython().system("cmd") or get_ipython().run_line_magic(...). Jupytext handles !/% magic natively; if a linter flags them, exclude .experiments/ from linting (pyproject.toml) and never rewrite the magic syntax.
- PTL version compat: newer Lightning uses accelerator="auto", devices="auto" instead of gpus=1; use the new API in generated code.
- Frozen packages pattern: the Kaggle offline competition pattern, where packages are pre-downloaded as an input dataset; ! pip install --no-index --find-links frozen_packages/ with fallback || pip install for online runs.
- Inference notebook pattern: each training notebook saves checkpoints to logs/; a companion notebook loads from checkpoint for inference. The script includes both inline and load-from-ckpt cells so the same file works both ways.
- Style guide regeneration: if .temp/kaggle-style-distill.md is missing at Step 1, the style rules embedded in Step 3's generator prompt are the authoritative source; no style guide file is required.
- Sharing context: competition notebooks are meant to be shared publicly as learning resources; clarity and educational value matter alongside score.
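
To illustrate the frozen packages note, here is a small Python sketch that builds the ! install line for a setup cell. `pip_install_cell` is a hypothetical helper and the package names in the usage note are placeholders; only the pip flags and the frozen_packages/ directory come from the note itself.

```python
import shlex


def pip_install_cell(packages: list[str], frozen_dir: str = "frozen_packages/") -> str:
    """Build a `!` shell line: offline install first, online fallback after `||`."""
    quoted = " ".join(shlex.quote(package) for package in packages)
    # --no-index stops pip from contacting PyPI; --find-links points it
    # at the pre-downloaded wheels attached as an input dataset.
    offline = f"pip install --no-index --find-links {shlex.quote(frozen_dir)} {quoted}"
    # The fallback after || only runs when the offline install fails.
    return f"! {offline} || pip install {quoted}"
```

For example, `pip_install_cell(["timm", "lightning"])` yields a single line that can be written verbatim into the setup cell of the generated script.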