wandb-runs

Standardize W&B run lifecycle and logging. Use when creating or updating experiment runs with consistent naming, tags, config snapshots, and comparable metrics across iterations.

By vivek100. Updated 3/2/2026

name: wandb-runs
description: Standardize W&B run lifecycle and logging. Use when creating or updating experiment runs with consistent naming, tags, config snapshots, and comparable metrics across iterations.

W&B Runs

Create comparable runs with stable naming and schema.

Execute

  1. Start each run with a deterministic name pattern (for example run_<n> plus optional slice metadata).
  2. Log immutable context at run start:
    • code version (git_sha)
    • prompt/tool version
    • dataset slice (offset, limit)
    • scorer version / dataset version
    • model identifier
  3. Log per-question metrics with explicit step indexing.
  4. Log run-level summary metrics at completion (accuracy, correct, total, error rates).
  5. Log prompt budget metrics when available (prompt_chars, prompt_tokens_est, budget status).
  6. Apply canonical tags (for example baseline, fix-batch, agent-vX).
  7. Keep key names stable between runs; avoid renaming metrics mid-series.
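The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the skill's own implementation: the helper names (`run_name`, `build_config`, `summarize`, `log_run`), the metric keys such as `question/correct`, the `my-project` project name, and the four-characters-per-token estimate are all assumptions made for the example. Only `wandb.init`, `run.log(..., step=...)`, `run.summary.update`, and `run.finish` are W&B SDK calls.

```python
from typing import Any, Dict, List, Optional


def run_name(n: int, offset: Optional[int] = None, limit: Optional[int] = None) -> str:
    """Deterministic name: run_<n>, plus slice metadata when both values are given."""
    name = f"run_{n}"
    if offset is not None and limit is not None:
        name += f"_o{offset}_l{limit}"  # hypothetical suffix format
    return name


def build_config(git_sha: str, prompt_version: str, dataset_version: str,
                 scorer_version: str, model: str, offset: int, limit: int) -> Dict[str, Any]:
    """Immutable context captured once at run start (key names are assumptions)."""
    return {
        "git_sha": git_sha,
        "prompt_version": prompt_version,
        "dataset_version": dataset_version,
        "scorer_version": scorer_version,
        "model": model,
        "slice": {"offset": offset, "limit": limit},
    }


def summarize(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Run-level summary metrics computed from per-question results."""
    total = len(results)
    correct = sum(1 for r in results if r["correct"])
    errors = sum(1 for r in results if r["error"])
    return {
        "accuracy": correct / total if total else 0.0,
        "correct": correct,
        "total": total,
        "error_rate": errors / total if total else 0.0,
    }


def log_run(n: int, config: Dict[str, Any], results: List[Dict[str, Any]],
            tags: List[str], project: str = "my-project") -> str:
    """Log one run end to end and return its W&B run id."""
    import wandb  # imported lazily so the helpers above work without the SDK

    sl = config["slice"]
    run = wandb.init(project=project, name=run_name(n, sl["offset"], sl["limit"]),
                     config=config, tags=list(tags))
    run_id = run.id
    try:
        for step, r in enumerate(results):
            run.log(
                {
                    "question/correct": int(r["correct"]),
                    "question/error": int(r["error"]),
                    "prompt_chars": r["prompt_chars"],
                    # Rough heuristic (about 4 characters per token), not a tokenizer count.
                    "prompt_tokens_est": r["prompt_chars"] // 4,
                },
                step=step,  # explicit step index, one per question
            )
        run.summary.update(summarize(results))
    finally:
        run.finish()
    return run_id
```

Keeping the name, config, and summary builders as plain functions means the same keys are produced for every run, which is what makes later runs comparable to earlier ones.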

Fallback Order

  1. Use the W&B SDK and MCP validation to sanity-check run metadata.
  2. If behavior differs from expectation, check the official W&B run logging docs.
  3. If it is still unclear, inspect how the SDK is used in the project codebase and, if needed, the SDK source.
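As one possible shape for the SDK-side sanity check in step 1, the sketch below reads a logged run back through the W&B public API and reports metadata problems. The required key list, the name pattern, and the function names are assumptions for illustration; the MCP validation path is not covered here.

```python
import re
from typing import Any, Dict, List

# Hypothetical set of context keys expected in every run's config.
REQUIRED_CONFIG_KEYS = ("git_sha", "prompt_version", "dataset_version",
                        "scorer_version", "model", "slice")


def metadata_problems(name: str, config: Dict[str, Any], tags: List[str]) -> List[str]:
    """Return human-readable problems; an empty list means the metadata looks sane."""
    problems = []
    if not re.fullmatch(r"run_\d+(_.+)?", name):
        problems.append(f"name does not match run_<n>: {name}")
    for key in REQUIRED_CONFIG_KEYS:
        if key not in config:
            problems.append(f"missing config key: {key}")
    if not tags:
        problems.append("no tags applied")
    return problems


def check_logged_run(path: str) -> List[str]:
    """Fetch a run by 'entity/project/run_id' and check its metadata."""
    import wandb  # imported lazily; requires network access and credentials

    run = wandb.Api().run(path)
    return metadata_problems(run.name, dict(run.config), list(run.tags))
```

Running `metadata_problems` on the values about to be passed to `wandb.init` catches the same issues before anything is logged.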

Output Contract

Return run metadata that can be joined to RCA/report pipelines:

{
  "run_id": "<wandb-run-id>",
  "run_name": "run_<n>",
  "git_sha": "<commit>",
  "slice": {"offset": 0, "limit": 100}
}
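A small helper can build and check this record before handing it to downstream pipelines. The sketch below is an assumption about how one might do that, not part of the skill; the function names and the sample values (`abc123`, `deadbeef`) are placeholders.

```python
import json
from typing import Any, Dict


def run_metadata(run_id: str, run_name: str, git_sha: str,
                 offset: int, limit: int) -> Dict[str, Any]:
    """Build the output-contract record with the four expected top-level keys."""
    return {
        "run_id": run_id,
        "run_name": run_name,
        "git_sha": git_sha,
        "slice": {"offset": offset, "limit": limit},
    }


def validate_run_metadata(record: Dict[str, Any]) -> None:
    """Raise ValueError if a key is missing or has an unexpected type."""
    expected = {"run_id": str, "run_name": str, "git_sha": str, "slice": dict}
    for key, typ in expected.items():
        if not isinstance(record.get(key), typ):
            raise ValueError(f"bad or missing field: {key}")
    for key in ("offset", "limit"):
        if not isinstance(record["slice"].get(key), int):
            raise ValueError(f"bad or missing field: slice.{key}")


# Placeholder values for illustration only.
record = run_metadata("abc123", "run_1", "deadbeef", 0, 100)
validate_run_metadata(record)
print(json.dumps(record))
```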
Install via CLI
npx skills add https://github.com/vivek100/jupyBot --skill wandb-runs
Repository Details
Branch: main
Path: SKILL.md