evaluating-machine-learning-models

Evaluate trained machine learning models with the right metrics and comparison logic. Use for benchmark review, threshold selection, calibration, validation, and model comparison; not for feature engineering or leakage auditing.

By foryourhealth111-pixel · Updated 4/28/2026

name: evaluating-machine-learning-models
description: |
  Evaluate trained machine learning models with the right metrics and comparison logic. Use for benchmark review, threshold selection, calibration, validation, and model comparison; not for feature engineering or leakage auditing.
allowed-tools: Read, Write, Edit, Grep, Glob, Bash(cmd:*)
version: 1.0.0
author: Jeremy Longshore jeremy@intentsolutions.io
license: MIT

Model Evaluation Suite

Use this skill when the model exists and the question is whether it is good enough.

Overview

This skill focuses on choosing and interpreting the right evaluation metrics for the problem, then comparing candidate models or thresholds.
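As an illustration of comparing thresholds, the sketch below sweeps a few decision thresholds over one model's validation scores and reports precision, recall, and F1 at each. It is a stdlib-only sketch, not part of the skill itself: the function names (`confusion_counts`, `precision_recall_f1`, `sweep_thresholds`) and the labels and scores are made up for the example, and undefined ratios are reported as 0.0 by assumption.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, FN, TN when predicting positive for score >= threshold."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1); undefined ratios are reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def sweep_thresholds(y_true, scores, thresholds):
    """Build one row of metrics per candidate threshold."""
    rows = []
    for t in thresholds:
        tp, fp, fn, _tn = confusion_counts(y_true, scores, t)
        precision, recall, f1 = precision_recall_f1(tp, fp, fn)
        rows.append({"threshold": t, "precision": precision,
                     "recall": recall, "f1": f1})
    return rows


# Hypothetical validation labels and scores, for illustration only.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

table = sweep_thresholds(y_true, scores, [0.3, 0.5, 0.7])
for row in table:
    print(f"t={row['threshold']:.2f}  P={row['precision']:.3f}  "
          f"R={row['recall']:.3f}  F1={row['f1']:.3f}")
```

On this toy data the lowest threshold gives full recall at the cost of precision. Which row is "good enough" depends on the relative cost of false positives and false negatives, which the metrics alone do not decide.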

When to Use This Skill

  • Comparing candidate models with consistent metrics
  • Reviewing precision/recall/F1/AUC, regression error, calibration, or ranking quality
  • Stress-testing validation strategy before deployment or publication
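To make "consistent metrics" concrete, the following sketch scores two candidate models against the same held-out labels and ranks them by ROC AUC, with F1 at a shared threshold alongside. It is a stdlib-only sketch under stated assumptions: `roc_auc`, `f1_at`, and `compare_models` are hypothetical helper names, the model names and scores are invented, and the pairwise AUC computation is quadratic, so it only suits small evaluation sets. With scikit-learn available, `sklearn.metrics.roc_auc_score` and `sklearn.metrics.f1_score` would be the more usual choice.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Runs in O(n_pos * n_neg) time.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_at(y_true, scores, threshold=0.5):
    """F1 when predicting positive for score >= threshold (0.0 if undefined)."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def compare_models(y_true, candidates, threshold=0.5):
    """Return (name, auc, f1) rows, best AUC first, all on the same labels."""
    rows = [(name, roc_auc(y_true, scores), f1_at(y_true, scores, threshold))
            for name, scores in candidates.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Hypothetical held-out labels and per-model scores, for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
candidates = {
    "model_a": [0.2, 0.6, 0.7, 0.9, 0.1, 0.4],
    "model_b": [0.3, 0.2, 0.8, 0.6, 0.4, 0.7],
}

rows = compare_models(y_true, candidates)
print(f"{'model':<10}{'auc':>8}{'f1@0.5':>8}")
for name, auc, f1 in rows:
    print(f"{name:<10}{auc:>8.3f}{f1:>8.3f}")
```

Every candidate is evaluated on identical labels and at an identical threshold, so differences in the table reflect the models rather than the evaluation setup. A six-example set is far too small to support a real decision; it only shows the shape of the comparison.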

Not For / Boundaries

  • Building the training pipeline itself: use scikit-learn for classical modeling or ml-pipeline-workflow for end-to-end workflow ownership
  • Engineering features: use preprocessing-data-with-automated-pipelines
  • Checking train/test contamination: use ml-data-leakage-guard

Typical Outputs

  • Metric suite recommendations
  • Model comparison tables
  • Notes on threshold tradeoffs, calibration, and validation weaknesses
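Calibration notes usually rest on a few simple numbers. The sketch below computes a Brier score, equal-width reliability bins, and an expected calibration error (ECE) for predicted probabilities. It is a stdlib-only illustration: the helper names, the choice of five equal-width bins, and the data are all assumptions made for the example. scikit-learn offers `sklearn.metrics.brier_score_loss` and `sklearn.calibration.calibration_curve` for the first two quantities.

```python
def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def reliability_bins(y_true, probs, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns (count, mean predicted probability, observed positive rate)
    for each non-empty bin, ordered from lowest to highest probability.
    """
    buckets = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        buckets[idx].append((y, p))
    rows = []
    for bucket in buckets:
        if bucket:
            n = len(bucket)
            mean_p = sum(p for _, p in bucket) / n
            rate = sum(y for y, _ in bucket) / n
            rows.append((n, mean_p, rate))
    return rows


def expected_calibration_error(y_true, probs, n_bins=5):
    """Size-weighted average gap between confidence and observed rate."""
    total = len(y_true)
    return sum(n / total * abs(mean_p - rate)
               for n, mean_p, rate in reliability_bins(y_true, probs, n_bins))


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 0, 1, 1]
probs = [0.1, 0.15, 0.3, 0.7, 0.85, 0.9]

brier = brier_score(y_true, probs)
bins = reliability_bins(y_true, probs)
ece = expected_calibration_error(y_true, probs)

print(f"Brier score: {brier:.4f}")
print(f"ECE (5 bins): {ece:.4f}")
for n, mean_p, rate in bins:
    print(f"n={n}  mean predicted={mean_p:.3f}  observed rate={rate:.3f}")
```

A well-calibrated model shows mean predicted probability close to the observed rate in every bin. ECE is sensitive to the binning scheme, so the bin count should be reported next to the number.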

Related Skills

  • scikit-learn for class-level error breakdowns and confusion matrices
  • scientific-reporting when the evaluation must become a deliverable

Install via CLI

npx skills add https://github.com/foryourhealth111-pixel/Vibe-Skills --skill evaluating-machine-learning-models

Repository Details

  • Stars: 2,311
  • Forks: 167
  • Branch: main
  • Path: SKILL.md