name: configure-experiment
description: "Use when setting up a DerivaML experiment project, adding config groups, or understanding how experiments compose. Triggers on: 'set up experiment', 'config groups', 'project structure', 'hydra defaults', 'DerivaModelConfig', 'experiment preset', 'new project from template'. Auto-fires in the experiment-lifecycle Phase 2 (configuration) moment, the seam where the lifecycle hands off here. Do NOT use for per-config-file Python syntax (use write-hydra-config) or for the up-front design doc (use design-experiment)."
Configure ML Experiments with hydra-zen and DerivaML
This covers the structure of a DerivaML experiment project: config groups, how they compose, and project setup. The table below is the orientation map — which groups exist and which file each lives in. For the exhaustive per-group key rules (required keys, PITFALLs, the Python API patterns for each config type), see /deriva-ml:write-hydra-config → references/config-reference/rules-and-validation.md.
Config Groups
| Group | Purpose | File |
|---|---|---|
| `deriva_ml` | Catalog connection (host, catalog ID) | `configs/deriva.py` |
| `datasets` | Dataset RIDs and versions | `configs/datasets.py` |
| `assets` | Pre-trained weights, reference files | `configs/assets.py` |
| `workflow` | What the code does | `configs/workflow.py` |
| `model_config` | Hyperparameters and architecture | `configs/<model>.py` |
| `notebook` | Notebook-specific configs | `configs/<notebook>.py` |
| `experiment` | Named combinations of the above | `configs/experiments.py` |
| `multiruns` | Sweeps over experiments/parameters | `configs/multiruns.py` |
How Experiments Compose
Base config (defaults for every group)
+ Experiment overrides (swap specific groups)
+ CLI overrides (fine-tune individual parameters)
Example: uv run deriva-ml-run +experiment=cifar10_quick loads base defaults, then overrides model_config and datasets from the experiment preset.
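The three layers above can be sketched in plain Python. This is only an analogy for the precedence order, not how Hydra or DerivaML resolves configs internally; the registry contents, option names (`quick`, `small`), and parameter values are hypothetical.

```python
from copy import deepcopy

# Hypothetical registry: group -> option name -> parameters (illustration only).
REGISTRY = {
    "model_config": {
        "default_model": {"epochs": 10, "learning_rate": 0.001},
        "quick": {"epochs": 3, "learning_rate": 0.001},
    },
    "datasets": {
        "default_dataset": {"split": "full"},
        "small": {"split": "small"},
    },
}

# Layer 1: the base config names one default option per group.
BASE_DEFAULTS = {"model_config": "default_model", "datasets": "default_dataset"}


def resolve(experiment_overrides, cli_overrides):
    """Apply experiment group swaps, then dotted CLI parameter overrides."""
    # Layer 2: an experiment swaps whole group options on top of the defaults.
    selection = {**BASE_DEFAULTS, **experiment_overrides}
    config = {
        group: deepcopy(REGISTRY[group][option])
        for group, option in selection.items()
    }
    # Layer 3: CLI overrides set individual parameters, e.g. group.param=value.
    for dotted_key, value in cli_overrides.items():
        group, param = dotted_key.split(".", 1)
        config[group][param] = value
    return config


resolved = resolve(
    experiment_overrides={"model_config": "quick", "datasets": "small"},
    cli_overrides={"model_config.learning_rate": 0.01},
)
print(resolved)
```

The later a layer is applied, the higher its precedence: the CLI value for `learning_rate` wins over the one the experiment's model option carries.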
Critical Rules
- Every group needs a default: `default_deriva`, `default_dataset`, `default_asset`, `default_workflow`, `default_model`.
- Pin dataset versions: use `DatasetSpecConfig(rid="...", version="...")` for reproducibility.
- Use meaningful names: `resnet50_extended`, not `config2`.
- Inspect before running: three distinct commands (don't confuse them):
  - `uv run deriva-ml-run --list-configs`: the menu of registered `group=value` options (ignores overrides; deriva-ml-specific)
  - `uv run deriva-ml-run +experiment=X --cfg job`: the fully resolved config that experiment composes to, without executing (Hydra's native `--cfg`)
  - `uv run deriva-ml-run +experiment=X dry_run=true`: resolve and validate every referenced RID/term against the live catalog, then stop before training

  `deriva-ml-run` forwards every Hydra command-line flag to Hydra; see https://hydra.cc/docs/advanced/hydra-command-line-flags/ and the override grammar at https://hydra.cc/docs/advanced/override_grammar/basic/.
- Write goal-oriented experiment descriptions: the `description` field on experiments and multiruns should state what question the experiment answers or what hypothesis it tests, not just list technical parameters. Technical details belong in the config; the description explains why the experiment exists.
Good experiment descriptions:
- "Test whether dropout 0.25 reduces overfitting on the small labeled split compared to the unregularized baseline"
- "Sweep learning rates to find the optimal convergence/stability tradeoff for the 2-layer CNN"
- "Evaluate whether the extended architecture (64→128 channels) improves accuracy enough to justify 10x training time"
Bad experiment descriptions (just restating parameters):
- "50 epochs, 64->128 channels, dropout 0.25, weight decay 1e-4"
- "Quick CIFAR-10 training with batch size 128"
Setup Steps
The config implements an approved experiment-design doc. Before writing config groups, you should have a `docs/design/experiment/<slug>.md` at Approved (see `/deriva-ml:design-experiment`). As you fill the groups below, cross-check that every Requirement in that design (the datasets/versions, assets, vocabularies) is satisfied by a config entry. A requirement with no config home is a gap to close before running.

If you don't know whether a design doc exists for this work, look in `docs/design/experiment/` for one matching the experiment (by slug) and read it before configuring: its Requirements are the contract this config implements.
- Clone the model template or create a `configs/` directory.
- Configure each group in order: `deriva.py` → `datasets.py` → `assets.py` → `workflow.py` → `<model>.py` → `base.py` → `experiments.py`.
- Verify the config tree composes: `uv run deriva-ml-run --list-configs` (the menu of registered options), then `uv run deriva-ml-run +experiment=<name> --cfg job` to confirm a specific experiment resolves.
For the full project structure, base.py template, and setup walkthrough, read references/workflow.md.
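Before running the verification commands in the steps above, a quick file-presence check can catch a skipped step. The following is a minimal sketch, assuming the fixed file names from the setup order live directly under one directory. The helper name `missing_config_files` is hypothetical, and `<model>.py` is left out because its name varies per project.

```python
from pathlib import Path

# File names taken from the setup order above; <model>.py is project-specific
# and therefore not checked here.
EXPECTED_FILES = [
    "deriva.py",
    "datasets.py",
    "assets.py",
    "workflow.py",
    "base.py",
    "experiments.py",
]


def missing_config_files(configs_dir):
    """Return the expected config files that are absent, in setup order."""
    root = Path(configs_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Prints an empty list when every expected file is present.
print(missing_config_files("configs"))
```

This only confirms that the files exist. Whether they compose is still a question for `--list-configs` and `--cfg job`.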
Multiruns
A multirun runs multiple experiment configurations in a single command — parameter sweeps, model comparisons, or any combination. DerivaML creates a parent execution that links to one child execution per parameter combination, so results are grouped and traceable.
Two ways to define multiruns:
Named multiruns (multirun_config in configs/multiruns.py) — reproducible, documented sweeps:
from deriva_ml.execution import multirun_config

# Register a named sweep: four learning rates on top of the cifar10_quick experiment
multirun_config(
    "lr_sweep",
    overrides=[
        "+experiment=cifar10_quick",
        "model_config.learning_rate=0.0001,0.001,0.01,0.1",
    ],
    description="Learning rate sweep on small labeled split",
)

Run it with:

uv run deriva-ml-run +multirun=lr_sweep
Ad-hoc multiruns — comma-separated values on the CLI with --multirun:
uv run deriva-ml-run +experiment=quick,extended --multirun
Named multiruns are preferred because they're committed to the repo, self-documenting (the description appears on the parent execution), and don't require remembering the --multirun flag.
For the full multirun_config API, see the write-hydra-config skill.
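To see how many child executions a sweep will produce, it can help to expand the overrides by hand. The sketch below mimics the Cartesian-product behaviour of comma-separated sweep values for simple `key=a,b,c` overrides only. It does not handle quoting, ranges, or the rest of Hydra's override grammar, and `expand_sweep` is a hypothetical helper, not part of DerivaML or Hydra.

```python
from itertools import product


def expand_sweep(overrides):
    """Expand comma-separated override values into one override list per child run."""
    keys, choices = [], []
    for item in overrides:
        key, _, value = item.partition("=")
        keys.append(key)
        choices.append(value.split(","))
    # One child per combination of the listed values.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*choices)
    ]


children = expand_sweep([
    "+experiment=cifar10_quick",
    "model_config.learning_rate=0.0001,0.001,0.01,0.1",
])
for child in children:
    print(child)
```

For the `lr_sweep` overrides this yields four children, one per learning rate. Sweeping two parameters multiplies the counts, so the number of children grows quickly.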
Optional: Generate Experiments.md
For projects with many experiments, consider maintaining an Experiments.md file in the project root as a human-readable summary of all defined experiments. This is optional but helpful for discoverability.
- Read the config source: `experiments.py`, `multiruns.py`, and any model config files they reference.
- Extract each experiment's name, config group overrides, key parameters (epochs, lr, batch size, architecture), and purpose.
- Extract each multirun's name, overrides, sweep ranges, and description.
- Write `Experiments.md` with a quick-reference table, a multiruns table, and a detail section per experiment.
If maintained, include Experiments.md in the same commit as the config changes — it should travel with the code it describes.
Format
# Experiments
Human-readable registry of all defined experiments and multiruns.
Generated from `src/configs/experiments.py` and `src/configs/multiruns.py`.
## Experiments
| Experiment | Model Config | Dataset | Description |
|------------|-------------|---------|-------------|
| `name` | `model_config_name` | `dataset_name` | Brief purpose |
## Multiruns
| Multirun | Overrides | Description |
|----------|----------|-------------|
| `name` | override summary | Brief purpose |
## Experiment Details
### `experiment_name`
- **Config group overrides**: `model_config=X`, `datasets=Y`
- **Parameters**: epochs, channels, batch size, learning rate, etc.
- **Purpose**: Why this experiment exists
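Once the experiment details have been extracted, the quick-reference table in the format above is straightforward to render. This is a minimal sketch that assumes the extracted data is already available as a list of dicts. The dict keys and the sample row are hypothetical, and the extraction from `experiments.py` is not shown.

```python
def render_experiments_table(experiments):
    """Render the quick-reference table from already-extracted experiment records."""
    lines = [
        "| Experiment | Model Config | Dataset | Description |",
        "|------------|-------------|---------|-------------|",
    ]
    for exp in experiments:
        lines.append(
            f"| `{exp['name']}` | `{exp['model_config']}` "
            f"| `{exp['dataset']}` | {exp['description']} |"
        )
    return "\n".join(lines)


# Hypothetical record for illustration only.
sample = [
    {
        "name": "cifar10_quick",
        "model_config": "quick_model",
        "dataset": "small_split",
        "description": "Check that the pipeline trains end to end before longer runs",
    }
]
table = render_experiments_table(sample)
print(table)
```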
Configuring storage locations in configs/deriva.py
DerivaML uses two distinct storage locations — a working directory (per-execution inputs/outputs/logs, ephemeral) and a cache directory (downloaded dataset bags and assets that persist across executions). The defaults work for most users, but you can override both in your hydra-zen config when needed.
For the conceptual difference, the on-disk layout, and the management commands (cleanup, garbage-collection, incomplete-execution recovery), see /deriva-ml:manage-deriva-storage — that skill owns the storage surface.
This skill owns only the config-authorship side: how to set working_dir and cache_dir in configs/deriva.py.
Setting custom locations
from hydra_zen import store
from deriva_ml import DerivaMLConfig

deriva_store = store(group="deriva_ml")
deriva_store(
    DerivaMLConfig,
    name="production",
    hostname="ml.example.org",
    catalog_id="52",
    working_dir="/scratch/ml-work",  # Fast local SSD for computation
    cache_dir="/shared/ml-cache",    # Large shared NFS for cached data
)
When to set a custom working_dir:
- The default `~/.deriva-ml/<hostname>/<catalog_id>/` is on a small disk: redirect to a larger volume.
- Running on a compute cluster: use a local scratch disk for speed.
- Shared environment: use a per-user directory on shared storage.
When to set a custom cache_dir:
- Team sharing — point to a shared NFS or network mount so downloaded bags and large assets are reused across team members. When one person downloads a 15 GB dataset, everyone else gets a cache hit instead of re-downloading. This is the most common reason to customize the cache directory.
- Disk management — keep the cache on a large, cheap volume separate from fast compute storage.
- Cluster environments — use a shared filesystem visible to all compute nodes.
- If not set, defaults to `<working_dir>/cache/`.
Shared cache example:
# All team members point to the same shared cache
# (reuses deriva_store and DerivaMLConfig from the example above)
deriva_store(
    DerivaMLConfig,
    name="production",
    hostname="ml.example.org",
    catalog_id="52",
    working_dir="/scratch/$USER/ml-work",  # Per-user fast local disk
    cache_dir="/shared/team-ml-cache",     # Shared across team
)
When user A downloads dataset 28CT v0.9.0, the bag lands in /shared/team-ml-cache/. When user B runs an experiment referencing the same dataset and version, it's already there — no download needed.
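The per-user path above contains `$USER` inside a Python string, and Python itself does not expand it. Whether DerivaML expands environment variables in `working_dir` is not stated here, so treat that as unconfirmed. If it does not, one option is to expand the path before registering the config. A minimal sketch using only the standard library follows; `expand_path` is a hypothetical helper.

```python
import os


def expand_path(template):
    """Expand $VAR and ${VAR} references; fail loudly if any stay unexpanded."""
    expanded = os.path.expandvars(template)
    # os.path.expandvars leaves unknown variables in place, so check for leftovers.
    if "$" in expanded:
        raise ValueError(f"unset environment variable in path: {template!r}")
    return expanded
```

With this helper, `working_dir=expand_path("/scratch/$USER/ml-work")` resolves to a concrete per-user directory when the config module is imported, and raises instead of silently creating a directory literally named `$USER`.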
⚠️ The working directory must NOT be inside the cache directory
If the working directory is a subdirectory of the cache directory (or vice versa), execution cleanup can delete cached data, or cache cleanup can delete active execution files. Always keep them as independent directory trees.
Good:
working_dir="/scratch/ml-work" # Fast local disk
cache_dir="/data/ml-cache" # Large shared disk
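Bad (each line nests inside the other directory from the Good example):
working_dir="/data/ml-cache/work"     # ❌ Working dir INSIDE the cache dir (/data/ml-cache)
cache_dir="/scratch/ml-work/cache"    # ❌ Cache dir INSIDE the working dir (/scratch/ml-work)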
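The independence rule for an explicitly configured pair of directories can be checked mechanically before a config is committed. Below is a minimal sketch using `pathlib`. `dirs_are_independent` is a hypothetical helper, not a DerivaML function, and it compares resolved paths only, so the directories do not need to exist.

```python
from pathlib import Path


def dirs_are_independent(working_dir, cache_dir):
    """Return True when neither directory equals or contains the other."""
    work = Path(working_dir).expanduser().resolve()
    cache = Path(cache_dir).expanduser().resolve()
    nested = work == cache or cache in work.parents or work in cache.parents
    return not nested


# The Good pair from above passes; both Bad placements fail.
print(dirs_are_independent("/scratch/ml-work", "/data/ml-cache"))
print(dirs_are_independent("/data/ml-cache/work", "/data/ml-cache"))
print(dirs_are_independent("/scratch/ml-work", "/scratch/ml-work/cache"))
```

Resolving the paths first means that symlinks and `..` segments cannot hide a nested layout.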
Reference Resources
Every MCP tool below takes `hostname=` and `catalog_id=` arguments explicitly. Substitute your catalog's hostname (e.g., `"data.example.org"`) and catalog ID (e.g., `"1"`) wherever the examples show them.

- `deriva://config/experiment-template`: Experiment config template
- `deriva://config/multirun-template`: Multirun config template
- `deriva://catalog/{hostname}/{catalog_id}/deriva-ml/workflows`: Available workflows and types (or call `deriva_ml_list_workflows(hostname=..., catalog_id=...)`)
Related Skills
- `design-experiment`: Authors the `docs/design/experiment/<slug>.md` this config implements. Write the design first; the config satisfies its Requirements.
- `write-hydra-config`: Exact Python API patterns for each config type
- `execution-lifecycle`: Pre-flight checklist and CLI commands for running