name: healthcare-predictive-modeling
description: "Build, tune, and validate supervised predictive models on healthcare data: risk of readmission, cost, disease onset, no-show, denial. Emphasizes ML fundamentals done right: train/validation/test and cross-validation design, the bias-variance tradeoff, regularization (L1/L2/early stopping), loss-function choice, class imbalance, probability calibration, and leakage-safe, point-in-time feature engineering. Covers scikit-learn, XGBoost/LightGBM, and PyTorch/TensorFlow. Use to develop a robust, well-evaluated clinical or operational predictive model."
keywords:
  - predictive modeling
  - machine learning
  - bias-variance
  - regularization
  - cross-validation
  - calibration
  - xgboost
  - class imbalance
  - risk prediction
  - healthcare
license: MIT
metadata:
  author: MedClawMini
  version: "1.0.0"
compatibility:
  - OpenClaw
allowed-tools:
  - run_shell_command
  - web_fetch
Healthcare Predictive Modeling
Overview
This skill builds predictive models the right way, with the statistical-learning discipline that separates a model that demos well from one that holds up in production. The emphasis is on methodology (validation design, bias-variance management, regularization, calibration, leakage control) as much as on algorithms, because in healthcare a miscalibrated or leaky model can be worse than no model: it produces confident-looking risk scores that misdirect clinical or operational decisions.
When to Use This Skill
- Predicting readmission, total cost, disease onset/progression, appointment no-show, or claim denial.
- Any supervised tabular ML problem where you need defensible evaluation.
- Turning the feature tables from spark-healthcare-data-pipeline into a scored model.
- Building a model that will need explanations (explainable-ml-healthcare) and validation (ml-model-validation-regulatory) downstream.
Method (ML fundamentals, applied)
- Frame & split: define the label and the prediction time; split by patient and by time (group + temporal split) so that no patient and no future information leaks across folds.
- Leakage-safe features: use only information available at prediction time; fit all transforms inside each CV fold (pipelines, not pre-computed scalers).
- Baseline → complex: start with regularized logistic regression, then gradient-boosted trees (XGBoost/LightGBM), then deep nets only if warranted.
- Bias-variance: read train-vs-validation gaps and learning curves; high variance (large gap) → more regularization or more data; high bias (both scores poor) → richer features or a more flexible model.
- Regularization: L1/L2 penalties, tree depth/min-child-weight, early stopping; tune with nested or grouped CV (never tune on the test set).
- Imbalance & loss: pick the loss and metric to match the cost structure (log-loss, focal loss, class weights, scale_pos_weight); resample thoughtfully and only inside training folds.
- Calibration: calibrate probabilities (Platt/isotonic) and check a reliability curve; clinical decisions use the probability, not just the rank.
- Evaluate honestly: AUROC, plus AUPRC for rare events, calibration, and decision-curve/net-benefit analysis; report subgroup performance for fairness.
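The first step above (a group + temporal split) can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name `temporal_group_split`, the toy patient IDs, and the cutoff date are all hypothetical. It assumes one row per encounter with a patient identifier and an index (prediction) date.

```python
import numpy as np

def temporal_group_split(patient_ids, index_dates, cutoff):
    """Return boolean (train_mask, test_mask) arrays.

    Rows on or after `cutoff` form the test set. Any patient who appears in
    the test set is removed from training entirely, so no patient is on both sides.
    """
    patient_ids = np.asarray(patient_ids)
    index_dates = np.asarray(index_dates, dtype="datetime64[D]")
    cutoff = np.datetime64(cutoff, "D")

    test_mask = index_dates >= cutoff
    test_patients = np.unique(patient_ids[test_mask])
    train_mask = (index_dates < cutoff) & ~np.isin(patient_ids, test_patients)
    return train_mask, test_mask

# Toy data (hypothetical): patient "b" has encounters on both sides of the cutoff.
patient_ids = np.array(["a", "a", "b", "b", "c", "d"])
index_dates = np.array(["2022-01-05", "2022-06-10", "2022-03-01",
                        "2023-02-01", "2022-11-20", "2023-04-15"])
train_mask, test_mask = temporal_group_split(patient_ids, index_dates, "2023-01-01")
print(patient_ids[train_mask], patient_ids[test_mask])
```

Dropping the earlier rows of test-set patients costs some training data; keeping them would be a different design choice that trades strict patient separation for sample size.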
Example
import numpy as np
import xgboost as xgb
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import StratifiedGroupKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumed inputs: X (n_samples, n_features), y (binary 0/1), and groups
# (member_id per row), all NumPy arrays built from point-in-time features.
cv = StratifiedGroupKFold(n_splits=5)  # grouping by member_id prevents patient leakage
pipe = Pipeline([("scale", StandardScaler()),  # scaler is refit inside every fold
                 ("clf", LogisticRegression(penalty="l2", C=1.0, max_iter=1000))])
aucs, aps = [], []
for tr, va in cv.split(X, y, groups):
    pipe.fit(X[tr], y[tr])
    p = pipe.predict_proba(X[va])[:, 1]
    aucs.append(roc_auc_score(y[va], p))
    aps.append(average_precision_score(y[va], p))
print(f"AUROC {np.mean(aucs):.3f} AUPRC {np.mean(aps):.3f}")

# Gradient-boosted trees; scale_pos_weight = negatives / positives offsets class imbalance.
booster = xgb.XGBClassifier(
    max_depth=4, n_estimators=400, learning_rate=0.05, subsample=0.8,
    reg_lambda=1.0, scale_pos_weight=(y == 0).sum() / (y == 1).sum(),
)
# Calibrate on the same grouped folds: a plain cv=5 ignores groups and can put
# one member's rows in both the fitting and the calibration split.
cal = CalibratedClassifierCV(booster, method="isotonic", cv=list(cv.split(X, y, groups))).fit(X, y)
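A calibrated model still needs to be checked on held-out predictions. The sketch below, assuming NumPy only, builds a binned reliability table and computes net benefit at one decision threshold using the standard decision-curve definition (true positives per patient minus false positives per patient weighted by threshold odds). The helper names and the toy `y_val`/`p_val` arrays are hypothetical; in practice they would be validation-fold labels and calibrated probabilities.

```python
import numpy as np

def reliability_table(y_true, p, n_bins=5):
    """Mean predicted risk vs. observed event rate in equal-width probability bins."""
    y_true, p = np.asarray(y_true, float), np.asarray(p, float)
    bins = np.minimum((p * n_bins).astype(int), n_bins - 1)  # p == 1.0 goes in the last bin
    rows = []
    for b in range(n_bins):
        m = bins == b
        if m.any():  # skip empty bins
            rows.append((b, int(m.sum()), float(p[m].mean()), float(y_true[m].mean())))
    return rows  # (bin, n, mean predicted, observed rate)

def net_benefit(y_true, p, threshold):
    """Net benefit of intervening on everyone with p >= threshold."""
    y_true, p = np.asarray(y_true), np.asarray(p)
    treat = p >= threshold
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy held-out labels and predicted probabilities (hypothetical).
y_val = [0, 0, 1, 0, 1, 1, 0, 1]
p_val = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

table = reliability_table(y_val, p_val, n_bins=2)
nb_model = net_benefit(y_val, p_val, threshold=0.5)
nb_treat_all = net_benefit(y_val, np.ones(len(y_val)), threshold=0.5)
print(table, nb_model, nb_treat_all)
```

A well-calibrated model has mean predicted risk close to the observed rate in every bin, and a useful one has higher net benefit than the treat-all and treat-none (net benefit 0) strategies across the clinically relevant thresholds.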
Outputs
- Serialized model + preprocessing pipeline (leakage-safe, reproducible).
- model_card.md: data, intended use, metrics (AUROC/AUPRC/calibration), subgroup performance, limitations.
- cv_results.csv: fold metrics, learning curves, hyperparameter search.
- Calibrated probability scorer ready for explainable-ml-healthcare.
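The subgroup-performance section of the model card could be filled from a small report like the one below. This is only a sketch: the function name `subgroup_report`, the column names, the 0.5 threshold, and the toy data are assumptions, and a real report would likely add AUROC/AUPRC per subgroup where sample sizes allow.

```python
import numpy as np

def subgroup_report(y_true, p, subgroup, threshold=0.5):
    """Per-subgroup size, observed event rate, mean predicted risk, and sensitivity."""
    y_true, p, subgroup = map(np.asarray, (y_true, p, subgroup))
    rows = []
    for g in np.unique(subgroup):
        m = subgroup == g
        pos = m & (y_true == 1)
        rows.append({
            "subgroup": str(g),
            "n": int(m.sum()),
            "observed_rate": float(y_true[m].mean()),
            "mean_predicted": float(p[m].mean()),
            # Sensitivity is undefined when a subgroup has no positive cases.
            "sensitivity": float((p[pos] >= threshold).mean()) if pos.any() else float("nan"),
        })
    return rows

# Toy held-out data (hypothetical).
y_val = [1, 0, 1, 0, 1, 0]
p_val = [0.8, 0.2, 0.4, 0.1, 0.9, 0.6]
sex = ["F", "F", "F", "M", "M", "M"]

report = subgroup_report(y_val, p_val, sex)
for row in report:
    print(row)
```

A large gap between `observed_rate` and `mean_predicted` within one subgroup suggests the model is miscalibrated for that group even if overall calibration looks fine.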
Healthcare Context
This skill guards against the healthcare-specific failure modes: target leakage from post-outcome codes, row-level splits that put the same patient in both train and test, severe class imbalance, and uncalibrated probabilities or unexamined subgroup performance. Deep-learning frameworks (PyTorch/TensorFlow/Keras) slot in for text/image/sequence inputs; the validation discipline is identical.
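Target leakage from post-outcome codes is usually prevented by filtering events to those recorded before the prediction time. The pandas sketch below illustrates one way to do that; the table layout, column names (`member_id`, `event_date`, `code`, `prediction_date`), the 365-day lookback, and the helper `point_in_time_events` are all hypothetical.

```python
import pandas as pd

# Hypothetical long-format event table: one row per coded event.
events = pd.DataFrame({
    "member_id": [1, 1, 1, 2, 2],
    "event_date": pd.to_datetime(["2023-01-10", "2023-03-02", "2023-03-20",
                                  "2023-02-14", "2023-04-01"]),
    "code": ["E11.9", "I10", "Z09", "J45", "I50.9"],
})
# Hypothetical prediction times: one per member.
index = pd.DataFrame({
    "member_id": [1, 2],
    "prediction_date": pd.to_datetime(["2023-03-05", "2023-03-01"]),
})

def point_in_time_events(events, index, lookback_days=365):
    """Keep events strictly before each member's prediction date, within the lookback window."""
    merged = events.merge(index, on="member_id", how="inner")
    age = merged["prediction_date"] - merged["event_date"]
    keep = (age > pd.Timedelta(0)) & (age <= pd.Timedelta(days=lookback_days))
    return merged.loc[keep].reset_index(drop=True)

safe = point_in_time_events(events, index)
# Example feature built only from leakage-safe events.
features = safe.groupby("member_id")["code"].nunique().rename("n_distinct_codes")
print(features)
```

Events dated on or after the prediction date (here the `Z09` and `I50.9` rows) never reach feature construction, so codes written as a consequence of the outcome cannot inform the prediction.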
References
- Hastie, Tibshirani & Friedman, Elements of Statistical Learning.
- scikit-learn, model selection and evaluation: https://scikit-learn.org/stable/model_selection.html
- Van Calster et al. (2019), calibration of clinical prediction models; TRIPOD guidance.