add-model

star 242

Add a new ML model to the TabArena benchmark system. Use this skill whenever the user wants to integrate a new tabular ML model into TabArena — even if they just say "add X model", "integrate X", "support X", or "wrap X for the benchmark". Creates all required files: the AutoGluon model wrapper, the search-space generator, the per-model `info.py`, a test, and the `pyproject.toml` extra. Reads existing similar models for inspiration and optionally fetches documentation URLs to understand the new model's API.

By autogluon · Updated 6/9/2026

name: add-model
description: Add a new ML model to the TabArena benchmark system. Use this skill whenever the user wants to integrate a new tabular ML model into TabArena, even if they just say "add X model", "integrate X", "support X", or "wrap X for the benchmark". Creates all required files: the AutoGluon model wrapper, the search-space generator, the per-model info.py, a test, and the pyproject.toml extra. Reads existing similar models for inspiration and optionally fetches documentation URLs to understand the new model's API.
argument-hint: [] []
user-invocable: true

Add Model to TabArena

This skill integrates a new tabular ML model into the TabArena benchmark.

Every model lives in one folder at packages/tabarena/src/tabarena/models/<ModelKey>/. That folder contains the wrapper, the HPO generator, and the metadata — and is auto-discovered by tabarena.models._registry.discover_models(). There is no separate benchmark/models/ag/ layout anymore.

Per model, you create up to 5 source files plus one test file, then edit three existing files.

Step 0: Gather inputs

Parse $ARGUMENTS for the model name. Then collect (ask only for what's missing or unclear):

| Input | Example | Notes |
|---|---|---|
| ModelName | "TabPFN-2.6" | Human-readable display name |
| ModelKey | "tabpfnv26" | Snake_case folder/file key (derive from ModelName) |
| ClassName | "TabPFNv26" | CamelCase class prefix (derive from ModelName) |
| ag_key | "TA-TABPFN-2.6" | AutoGluon registry key; prefix with "TA-" |
| ag_name | "TA-TabPFN-2.6" | AutoGluon display name; same as ag_key with proper casing |
| pip_package | "tabpfn>=7.0.0" | Pip install spec for pyproject.toml |
| doc_url | "https://..." | Documentation / GitHub / paper URL |
| model_type | foundation | foundation, torch, or sklearn |
| supports_gpu | true | Whether the model uses GPU |
| problem_types | binary,multiclass,regression | Supported task types |

Deriving keys: "TabPFN-2.6" → key "tabpfnv26", class prefix "TabPFNv26". "TabSTAR" → key "tabstar", class prefix "TabStar". For the key, strip hyphens and dots and lowercase the result, writing a version number with a leading "v"; for the class prefix, use CamelCase.
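
The key derivation above can be sketched as a small helper. This is an illustrative heuristic, not part of TabArena: the function name `derive_keys` and the rule that a hyphen directly before a digit becomes "v" are assumptions inferred from the "TabPFN-2.6" example. The CamelCase class prefix (e.g. "TabStar" from "TabSTAR") depends on a judgment about word boundaries, so it is deliberately left out.

```python
import re


def derive_keys(model_name: str) -> dict[str, str]:
    """Derive ModelKey, ag_key, and ag_name from a display name (heuristic)."""
    # Assumption: a hyphen directly before a version digit becomes "v",
    # so "TabPFN-2.6" turns into "TabPFNv2.6".
    versioned = re.sub(r"-(?=\d)", "v", model_name)
    # Lowercase, then drop everything that is not a letter or a digit.
    model_key = re.sub(r"[^0-9a-z]", "", versioned.lower())
    return {
        "ModelKey": model_key,
        "ag_key": f"TA-{model_name.upper()}",
        "ag_name": f"TA-{model_name}",
    }


print(derive_keys("TabPFN-2.6"))
print(derive_keys("TabSTAR"))
```

Existing folders such as sap_rpt_oss keep underscores between words, so treat the derived key as a suggestion to confirm with the user rather than a fixed rule.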

Step 1: Understand the model API

If doc_url was provided, fetch it with WebFetch to understand:

  • Import path (e.g., from tabstar.tabstar_model import TabSTARClassifier)
  • Constructor parameters and their defaults
  • .fit(X, y, ...) signature
  • .predict() / .predict_proba() signature
  • Key hyperparameters to expose
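
If the library is already installed, the same facts can also be read off with the standard `inspect` module when the documentation is thin. The sketch below is a minimal illustration; `ToyClassifier` and `constructor_defaults` are hypothetical names, with `ToyClassifier` standing in for the class being wrapped.

```python
import inspect


class ToyClassifier:
    """Hypothetical stand-in for the third-party estimator being wrapped."""

    def __init__(self, n_estimators: int = 8, device: str = "auto", random_state: int = 0):
        self.n_estimators = n_estimators
        self.device = device
        self.random_state = random_state

    def fit(self, X, y, eval_set=None):
        return self

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def constructor_defaults(cls: type) -> dict[str, object]:
    """Map each constructor parameter that has a default to that default."""
    params = inspect.signature(cls.__init__).parameters
    return {
        name: param.default
        for name, param in params.items()
        if name != "self" and param.default is not inspect.Parameter.empty
    }


print(constructor_defaults(ToyClassifier))       # candidate hyperparameters
print(inspect.signature(ToyClassifier.fit))      # .fit(...) signature
print(hasattr(ToyClassifier, "predict_proba"))   # does it expose probabilities?
```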

Step 2: Pick the right base class and reference model

Choose the most similar existing model to read for detailed inspiration:

| Model type | Base class | Read this reference model |
|---|---|---|
| Foundation / pre-trained / GPU (e.g. TabPFN, SAP-RPT-OSS, TabSTAR) | AbstractTorchModel | packages/tabarena/src/tabarena/models/sap_rpt_oss/model.py |
| Torch NN trained from scratch (e.g. TabM, RealMLP) | AbstractTorchModel | packages/tabarena/src/tabarena/models/tabm/model.py |
| CPU / sklearn-like (e.g. KNN) | AbstractModel | packages/tabarena/src/tabarena/models/knn/model.py |

Read the reference model file now (use the Read tool). Use it as a structural guide — you will adapt rather than copy.

Also read the annotated patterns in references/model_patterns.md — it contains templates for model.py, hpo.py, info.py, and the test file.

Step 3: Create new files

Create these files (paths relative to the repo root):

3a. packages/tabarena/src/tabarena/models/{ModelKey}/__init__.py

Re-export the public symbols so from tabarena.models.{ModelKey} import ... works:

from __future__ import annotations

from tabarena.models.{ModelKey}.hpo import gen_{ModelKey}
from tabarena.models.{ModelKey}.info import {ModelKey}_info, {ModelKey}_method_metadata

__all__ = ["gen_{ModelKey}", "{ModelKey}_info", "{ModelKey}_method_metadata"]

3b. packages/tabarena/src/tabarena/models/{ModelKey}/model.py

The AutoGluon wrapper class. Use the template in references/model_patterns.md section "Model wrapper template". Key points:

  • Start with from __future__ import annotations
  • Inherit from AbstractTorchModel (GPU/torch models) or AbstractModel (CPU models)
  • Set ag_key, ag_name, ag_priority = 65, seed_name = "random_state"
  • Implement _fit(), _set_default_params(), supported_problem_types()
  • For GPU models: also implement get_device(), _set_device(), _get_default_resources(), get_minimum_resources(), _get_default_ag_args_ensemble() (with fold_fitting_strategy: sequential_local), _class_tags() (with can_estimate_memory_usage_static: False), _more_tags() (with can_refit_full: True)
  • Docstring must include: description, paper title, authors, codebase URL, license
  • Keep optional third-party imports (the wrapped library itself) inside _fit / per-method scope so importing this module never requires the optional dep at top-level
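
The last point, keeping the optional dependency out of module scope, might look like the sketch below. It uses a plain class rather than the real AutoGluon base classes so that it runs without any extra install; `LazyWrapperSketch`, the module name `optional_tabular_lib`, and its `Estimator` class are all hypothetical.

```python
import importlib


class LazyWrapperSketch:
    """Illustrates deferred imports only; this is not an AutoGluon model."""

    # Hypothetical name of the optional third-party package.
    backend_module = "optional_tabular_lib"

    def _fit(self, X, y):
        # Import inside the method: importing this file never needs the backend.
        try:
            backend = importlib.import_module(self.backend_module)
        except ImportError as err:
            raise ImportError(
                f"'{self.backend_module}' is not installed; "
                "install the matching per-model extra first."
            ) from err
        # Hypothetical estimator API, reached only when the backend exists.
        self.model = backend.Estimator().fit(X, y)
        return self


wrapper = LazyWrapperSketch()  # constructing works without the backend
try:
    wrapper._fit([[0.0], [1.0]], [0, 1])
except ImportError as err:
    print(f"deferred failure: {err}")
```

The failure surfaces only when `_fit` runs, which is what lets the package and its registry import cleanly on machines that lack the optional dependency.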

3c. packages/tabarena/src/tabarena/models/{ModelKey}/hpo.py

The search-space generator. By default use an empty search space (like TabPFN-2.6) — only add hyperparameters if the user explicitly asks or if the model has obvious tunable knobs. See template in references/model_patterns.md section "hpo.py template".

3d. packages/tabarena/src/tabarena/models/{ModelKey}/info.py

Defines {ModelKey}_method_metadata: MethodMetadata and {ModelKey}_info: ModelInfo. info.py is the single source the auto-discovery registry walks — populating it correctly is how the model becomes visible to discover_models(). See template in references/model_patterns.md section "info.py template".

3e. Multi-file support code (optional)

If the wrapper needs helper modules (preprocessors, vendored upstream code, large internal classes), put them in a private subfolder of packages/tabarena/src/tabarena/models/{ModelKey}/:

  • _internal/ — for hand-written helpers (preprocessors, internal classes, adapters)
  • _vendor/ — only for code copied verbatim from an upstream project; keep the original layout/license alongside

Both subfolders need their own empty __init__.py. Import them from model.py via absolute paths, e.g. from tabarena.models.{ModelKey}._internal.preprocessing import Preprocessor.

3f. tst/models/test_{ModelKey}.py

See template in references/model_patterns.md section "Test template". Include a minimal FitHelper.verify_model() call with model_hyperparameters={}; if the model has a speed-up parameter such as max_epochs=1, pass it there. Wrap the import in try/except ImportError and call pytest.skip(...) so the test is automatically skipped when the optional dependency isn't installed.
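
The skip-on-missing-dependency pattern can be sketched as follows. This is not the template from references/model_patterns.md: `import_or_skip` is a hypothetical helper, and a standard-library class stands in for the wrapper so the example runs without the optional package.

```python
import importlib

import pytest


def import_or_skip(module_name: str, attr: str):
    """Return `attr` from `module_name`, or skip the calling test if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as err:
        pytest.skip(f"optional dependency missing: {err}")
    return getattr(module, attr)


def test_model_sketch():
    # In a real test the target would be the new wrapper class, which is then
    # handed to FitHelper.verify_model(...) with model_hyperparameters={}.
    model_cls = import_or_skip("collections", "OrderedDict")  # stdlib stand-in
    assert model_cls is not None
```

pytest also ships `pytest.importorskip`, which covers the simple case of skipping when a single module cannot be imported.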

Step 4: Edit existing files

Edit all three files in a single pass (read each file first, then edit):

4a. packages/tabarena/src/tabarena/models/__init__.py

Add a lazy entry for the new class so from tabarena.models import {ClassName}Model works:

_LAZY_CLASSES = {
    ...
    "{ClassName}Model": "tabarena.models.{ModelKey}.model",
    ...
}

Also add "{ClassName}Model" to __all__ and (under TYPE_CHECKING) to the static from tabarena.models.{ModelKey}.model import {ClassName}Model block, both kept alphabetised.
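
A mapping like `_LAZY_CLASSES` is commonly resolved by a module-level `__getattr__` (PEP 562). How `tabarena.models` implements this is not shown here, so the following is only a sketch of the general pattern, with standard-library classes standing in for model classes and `_lazy_getattr` as a hypothetical name.

```python
import importlib
import types

# Stand-in mapping: exported name -> module that defines it.
_LAZY_CLASSES = {
    "Fraction": "fractions",
    "OrderedDict": "collections",
}


def _lazy_getattr(name: str):
    """Import the owning module only when `name` is first requested."""
    try:
        module_path = _LAZY_CLASSES[name]
    except KeyError:
        raise AttributeError(f"module has no attribute {name!r}") from None
    return getattr(importlib.import_module(module_path), name)


# Throwaway module to show the hook in action; inside a real package the
# function would simply be defined as __getattr__ in __init__.py.
demo = types.ModuleType("lazy_demo")
demo.__getattr__ = _lazy_getattr
demo.__all__ = sorted(_LAZY_CLASSES)

print(demo.Fraction(1, 2))        # resolved through the hook on first access
print(hasattr(demo, "Missing"))   # unknown names still raise AttributeError
```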

4b. packages/tabarena/src/tabarena/models/utils.py

Add to the name_to_import_map dict in get_configs_generator_from_name(). The key is the friendly model name (often the same as ModelName):

"{ModelName}": lambda: importlib.import_module("tabarena.models.{ModelKey}.hpo").gen_{ModelKey},

4c. packages/tabarena/pyproject.toml

The pyproject.toml defines a per-model extra for every supported model, plus three union extras built via self-references ("tabarena[<name>]"):

  • benchmark — the curated core set used for standard benchmarking. Stable and resolver-friendly. Do not add a new model here unless the user explicitly says it belongs in the core set.
  • extended — the layered set installed on top of benchmark for the broader model zoo. This is where most new models go.
  • all — experimental union of benchmark + extended + special-cased extras like probmetrics (which has conflict-prone deps and is excluded from extended on purpose). Updated automatically via tabarena[extended], so usually no manual edit needed unless the model is conflict-prone.

Always declare the pip spec exactly once in the per-model extra, then reference the model by name in the union(s). Never paste the raw {pip_package} into a union extra.

Step 1 — declare the per-model extra under [project.optional-dependencies]:

{ModelKey} = ["{pip_package}"]

Step 2 — add it to the right union via self-reference:

| Situation | Edit |
|---|---|
| Default: new extended model | Add "tabarena[{ModelKey}]" to the extended extra. |
| Core benchmark model (only if user explicitly says so) | Add "tabarena[{ModelKey}]" to the benchmark extra. |
| Model has known dependency conflicts (rare, like probmetrics) | Skip both benchmark and extended; add "tabarena[{ModelKey}]" to all only. |

After this, users can install the model alone (uv sync --extra benchmark --extra {ModelKey}), as part of the extended set (uv sync --extra benchmark --extra extended), or via --extra all.

Step 3 — verify the per-model extra matches info.py with the drift checker:

python -m tabarena.tools.sync_pyproject_extras

packages/tabarena/src/tabarena/tools/sync_pyproject_extras.py aggregates every ModelInfo.pip_extra from the registry and compares it against [project.optional-dependencies] in packages/tabarena/pyproject.toml, printing per-folder OK/DRIFT. Add --check to make it exit non-zero on drift (CI mode). Run it after editing either side so the two stay in sync.
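
Conceptually, the comparison looks like the sketch below. It is not the tool's implementation: `find_drift` is a hypothetical function, the data is made up, and it assumes each `pip_extra` is a list of pip spec strings.

```python
def find_drift(
    registry_extras: dict[str, list[str]],
    pyproject_extras: dict[str, list[str]],
) -> dict[str, str]:
    """Label each model key 'OK' or 'DRIFT' by comparing both sides as sets."""
    report = {}
    for key in sorted(registry_extras):
        expected = set(registry_extras[key])
        declared = set(pyproject_extras.get(key, []))
        report[key] = "OK" if declared == expected else "DRIFT"
    return report


# Made-up data: what info.py declares vs. what pyproject.toml declares.
registry = {"mymodel": ["mymodel-pkg>=1.0"], "othermodel": ["other-pkg>=2.0"]}
pyproject = {"mymodel": ["mymodel-pkg>=1.0"], "othermodel": ["other-pkg>=1.5"]}

report = find_drift(registry, pyproject)
for key, status in report.items():
    print(f"{key}: {status}")

# A CI-style check would turn any drift into a non-zero exit code.
exit_code = 1 if "DRIFT" in report.values() else 0
print(exit_code)
```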

Step 5: Auto-derived registries (no manual edit)

These pieces pick up the new model automatically once Step 3 lands — do not edit them by hand:

  • packages/tabarena/src/tabarena/models/_registry.py: discover_models() walks tabarena/models/*/info.py and collects every ModelInfo found. As long as info.py exports a top-level ModelInfo instance, the model joins MODEL_REGISTRY.
  • packages/tabarena/src/tabarena/benchmark/exec_models/registry.py: auto-derives tabarena_model_registry from get_model_registry(), so the new class becomes available through the AG registry on the next import.
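
The discovery idea, walking each model folder's info.py and collecting top-level ModelInfo instances, can be sketched with the standard library. This is an assumption-laden illustration, not the code in _registry.py: the `ModelInfo` dataclass here is a one-field stand-in, and `discover_models_sketch` is a hypothetical name.

```python
import importlib.util
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ModelInfo:
    """Hypothetical stand-in; the real ModelInfo carries more fields."""

    key: str


def discover_models_sketch(models_dir: Path) -> dict[str, ModelInfo]:
    """Load every <folder>/info.py and collect top-level ModelInfo instances."""
    registry = {}
    for info_path in sorted(models_dir.glob("*/info.py")):
        spec = importlib.util.spec_from_file_location(
            f"_sketch_{info_path.parent.name}_info", info_path
        )
        module = importlib.util.module_from_spec(spec)
        # Expose the stand-in class so the demo info.py needs no package import.
        module.ModelInfo = ModelInfo
        spec.loader.exec_module(module)
        for value in vars(module).values():
            if isinstance(value, ModelInfo):
                registry[value.key] = value
    return registry


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for key in ("knn", "mymodel"):
        (root / key).mkdir()
        (root / key / "info.py").write_text(f'{key}_info = ModelInfo(key="{key}")\n')
    (root / "no_info").mkdir()  # a folder without info.py is simply ignored
    found = discover_models_sketch(root)

print(sorted(found))
```

The sketch also shows why a folder whose info.py lacks a top-level ModelInfo instance would stay invisible: nothing in it matches the `isinstance` check.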

Step 6: Lint

Run ruff on the new files:

ruff check --fix packages/tabarena/src/tabarena/models/{ModelKey}/ tst/models/test_{ModelKey}.py

Fix any reported issues.

Step 7: Metadata artifact (optional — only if the model has been benchmarked)

If the model already has benchmark results to register in TabArena's artifact system, add a metadata entry to the dated batch file:

packages/tabarena/src/tabarena/nips2025_utils/artifacts/_tabarena_method_metadata_YYYY_MM_DD.py

Either add to the latest file or create a new dated file if the benchmarking run is new.

Each entry is a MethodMetadata(...) object (same class used in info.py, so the entry can be the {ModelKey}_method_metadata you already defined). Then import it in _tabarena_method_metadata.py:

from tabarena.nips2025_utils.artifacts._tabarena_method_metadata_YYYY_MM_DD import (
    {ModelKey}_metadata,
)

If the model has not been benchmarked yet, skip this step entirely. info.py already declares the metadata for the registry; the artifact entry is only needed when results files actually exist.

Step 8: Report

Summarize what was created/edited:

  • List new files created
  • List files edited and what was added
  • Note any TODOs left for the user (e.g., implementing _predict_proba if the library API is unclear, tuning ag_priority, adding a real search space later, registering benchmark artifacts after a real run)
Install via CLI
npx skills add https://github.com/autogluon/tabarena --skill add-model