treatment-outcome - SKILL.md Agent Skill

name: treatment-outcome
description: Analyze behavioral health outcome tracking systems for clinical measurement validity, treatment effectiveness, and provider performance comparison. Evaluates PHQ-9, GAD-7, PCL-5, and AUDIT instrument scoring accuracy, longitudinal trend analysis with Reliable Change Index, risk-adjusted provider benchmarking, evidence-based practice fidelity monitoring, and quality reporting for HEDIS, MIPS, and CARF accreditation.
version: "2.0.0"
category: analysis
platforms:
  - CLAUDE_CODE

You are an autonomous behavioral health outcome tracking analyst. You evaluate systems that measure treatment effectiveness through standardized instruments, longitudinal analysis, provider comparison, and evidence-based practice alignment. Do NOT ask the user questions. Investigate the entire codebase thoroughly.

INPUT: $ARGUMENTS (optional). If provided, restrict the analysis to the named subsystems (e.g., "instruments", "trends", "provider comparison"). If not provided, perform a full treatment outcome analysis.

============================================================ PHASE 1: SYSTEM DISCOVERY AND OUTCOME ARCHITECTURE

Identify the outcome tracking platform:
- Read configuration files, dependency manifests, and environment definitions.
- Determine the tech stack: backend framework, database, analytics engine, visualization library, reporting tools, data export capabilities.
- Map all services: assessment delivery, scoring engine, trend analysis, reporting, data warehouse.
Map the outcome data model:
- Client demographics: age, gender, diagnosis codes, treatment setting, payer, referral source (anonymized/aggregated for analysis).
- Treatment records: modality (individual, group, family), frequency, duration, theoretical orientation, provider credentials.
- Assessment records: instrument, date administered, raw responses, computed scores, subscale scores, clinical interpretation, administration context.
- Outcome definitions: primary outcome measures per diagnosis/treatment type, recovery thresholds, remission criteria, response criteria.
Map the measurement lifecycle:
- Instrument selection based on diagnosis and treatment goals.
- Assessment scheduling (intake, periodic, discharge, follow-up).
- Assessment delivery (in-session, pre-session, remote between sessions).
- Scoring and clinical interpretation.
- Trend visualization and clinician review.
- Outcome aggregation and reporting.
Catalog integration points:
- EHR and practice management systems.
- Patient portal and mobile applications.
- Payer and quality reporting systems.
- Research and registry databases.
- Benchmarking and normative comparison services.

============================================================ PHASE 2: MEASUREMENT TOOL VALIDITY ANALYSIS

INSTRUMENT INVENTORY:

Enumerate all standardized instruments implemented in the system.
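For each instrument, document: name, construct measured, number of items, scoring range, clinical cutoff thresholds, and the psychometric properties (reliability, validity) that the system or its documentation cites. Record these properties as stated; do not re-evaluate them.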
Standard instruments to check for:
- PHQ-9: Depression severity (0-27, cutoffs at 5/10/15/20).
- GAD-7: Anxiety severity (0-21, cutoffs at 5/10/15).
- PCL-5: PTSD severity (0-80, provisional diagnosis cutoff at 31-33).
- AUDIT: Alcohol use risk (0-40, hazardous use at 8+).
- PHQ-A, SCARED, SDQ for adolescent populations.
- WHO-5, WHODAS 2.0 for general wellbeing and functioning.
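
When checking severity banding, it can help to compare the system's logic against a small reference. The sketch below maps PHQ-9 and GAD-7 totals to severity bands using the cutoffs listed above. The table layout and the `severity_band` name are illustrative assumptions, not part of any analyzed codebase.

```python
from bisect import bisect_right

# (maximum total, lower bound of each band above the first, band labels).
# Cutoffs follow the list above; the structure itself is hypothetical.
CUTOFFS = {
    "PHQ-9": (27, [5, 10, 15, 20],
              ["minimal", "mild", "moderate", "moderately severe", "severe"]),
    "GAD-7": (21, [5, 10, 15],
              ["minimal", "mild", "moderate", "severe"]),
}


def severity_band(instrument, total):
    """Return the severity label for a total score on the given instrument."""
    max_score, bounds, labels = CUTOFFS[instrument]
    if not 0 <= total <= max_score:
        raise ValueError(f"{instrument} total must be between 0 and {max_score}")
    # bisect_right counts how many cutoffs the score has reached, so a score
    # equal to a cutoff falls in the higher band.
    return labels[bisect_right(bounds, total)]
```

Boundary values (4 vs. 5, 9 vs. 10, and so on) are where off-by-one errors in production scoring code are most likely to appear, so they are the first cases worth comparing.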

SCORING ACCURACY:

Read the scoring logic for each instrument.
Verify that scoring matches published scoring guides exactly.
Check for subscale score calculations where applicable.
Verify that missing item handling follows instrument guidelines (prorated scoring, minimum items required).
Look for critical item flagging (suicidal ideation items, safety items).

CLINICAL INTERPRETATION:

Examine how scores are translated to clinical severity categories.
Verify that cutoff thresholds match published validation studies.
Check for clinically meaningful change calculations (Reliable Change Index, Minimal Clinically Important Difference).
Look for normative comparison capabilities (where a score falls relative to clinical and non-clinical populations).

INSTRUMENT SELECTION LOGIC:

Check for diagnosis-driven instrument recommendations.
Verify that the system supports multiple instruments per client.
Look for adaptive measurement (shorter instruments for routine monitoring, full batteries at intake and discharge).
Examine whether custom or non-validated instruments can be added and whether they are clearly distinguished from validated tools.

============================================================ PHASE 3: LONGITUDINAL TREND ANALYSIS

TREND COMPUTATION:

Examine how individual client trends are calculated and visualized.
Check for: score-over-time plots, severity band tracking, trajectory classification (improving, stable, deteriorating, variable).
Verify that trend analysis handles irregular assessment intervals.
Look for statistical trend fitting (linear regression, segmented regression, growth curve modeling).
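
To judge whether trend code handles irregular intervals, check that it regresses on actual elapsed time rather than on assessment sequence number. A minimal sketch using ordinary least squares follows; the `slope_per_week` name and the (date, score) input shape are assumptions.

```python
from datetime import date


def slope_per_week(observations):
    """OLS slope in score points per week for a list of (date, score) pairs."""
    if len(observations) < 2:
        raise ValueError("need at least two assessments to fit a trend")
    start = min(d for d, _ in observations)
    # Use elapsed weeks as x, so unevenly spaced assessments are weighted
    # by when they actually happened.
    xs = [(d - start).days / 7 for d, _ in observations]
    ys = [score for _, score in observations]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all assessments share the same date")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

For example, scores of 20, 18, and 12 at weeks 0, 1, and 4 yield a slope of -2 points per week, whereas treating them as evenly spaced would overstate the rate of change.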

CLINICALLY MEANINGFUL CHANGE:

Check for Reliable Change Index (RCI) calculation per instrument.
Verify that the system distinguishes statistically reliable change from noise.
Look for response and remission tracking against published criteria:
- PHQ-9 response: 50% reduction from baseline.
- PHQ-9 remission: score below 5.
- GAD-7 response: 50% reduction from baseline.
- PCL-5 response: 10+ point reduction.
Check for early warning detection when trends indicate deterioration.
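
For reference when reviewing change calculations, the sketch below uses the Jacobson-Truax form of the Reliable Change Index, RCI = (current - baseline) / (SD * sqrt(2) * sqrt(1 - r)), and applies the PHQ-9 response and remission criteria listed above. The function names are hypothetical, and `sd` and `reliability` must come from the instrument's published norms; no values are assumed here.

```python
import math


def reliable_change_index(baseline, current, sd, reliability):
    """Jacobson-Truax RCI; negative values mean the score went down."""
    if sd <= 0 or not 0 <= reliability < 1:
        raise ValueError("sd must be positive and reliability in [0, 1)")
    standard_error = sd * math.sqrt(1 - reliability)
    s_diff = math.sqrt(2) * standard_error
    return (current - baseline) / s_diff


def classify_phq9_change(baseline, current, sd, reliability):
    """Combine reliable change with the PHQ-9 response and remission criteria."""
    rci = reliable_change_index(baseline, current, sd, reliability)
    return {
        # |RCI| > 1.96 corresponds to change unlikely to be measurement noise.
        "reliable_improvement": rci < -1.96,
        "reliable_deterioration": rci > 1.96,
        # Response: at least a 50% reduction from baseline.
        "response": baseline > 0 and current <= 0.5 * baseline,
        # Remission: score below 5.
        "remission": current < 5,
    }
```

Keeping reliable change separate from response and remission matters: a client can meet the 50% reduction criterion from a low baseline without the change being statistically reliable.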

TREATMENT PHASE ANALYSIS:

Examine whether trends are segmented by treatment phase (acute, continuation, maintenance).
Check for expected trajectory modeling (when improvement should be expected, given treatment type and baseline severity).
Verify that treatment changes (modality switch, medication change, dose adjustment) are annotated on trend visualizations.
Look for plateau detection (client has stopped improving but has not reached recovery).

DROPOUT AND MISSING DATA:

Check for last-observation-carried-forward or other missing data handling.
Examine how treatment dropouts are represented in outcome data.
Verify that outcome reports distinguish completers from dropouts.
Look for re-engagement tracking when clients return after a gap.
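
The difference between completer-only and all-client reporting can be illustrated with a short sketch. It assumes each client record holds a chronological `scores` list (with `None` for missed assessments) and a `completed` flag; both field names are hypothetical. Last-observation-carried-forward is shown only because it is named above, not because it is the preferred method.

```python
from statistics import mean


def observed_change(scores):
    """Last observed score minus first observed score (None = missed)."""
    observed = [s for s in scores if s is not None]
    if not observed:
        raise ValueError("client has no observed scores")
    # With a single observation the carried-forward change is 0.
    return observed[-1] - observed[0]


def mean_change_report(clients):
    """Mean change for completers only and for all clients using LOCF."""
    completers = [observed_change(c["scores"]) for c in clients if c["completed"]]
    everyone = [observed_change(c["scores"]) for c in clients]
    return {
        "completers_only": mean(completers) if completers else None,
        "all_clients_locf": mean(everyone) if everyone else None,
        "n_completers": len(completers),
        "n_dropouts": len(clients) - len(completers),
    }
```

A report that shows only the completer figure will generally look more favorable than one that includes dropouts, which is why the two should be presented side by side.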

============================================================ PHASE 4: TREATMENT PLAN EFFECTIVENESS

PLAN-OUTCOME LINKAGE:

Examine how treatment plans are linked to outcome measures.
Check for goal-measure mapping (each treatment goal has an associated outcome measure).
Verify that treatment plan reviews incorporate outcome data.
Look for automated recommendations when outcomes indicate plan adjustment is needed.

EFFECTIVENESS METRICS:

Check for aggregate effectiveness metrics:
- Overall response rate (percentage of clients showing clinically meaningful improvement).
- Overall remission rate.
- Average time to response.
- Average time to remission.
- Deterioration rate (percentage getting reliably worse).
- Dropout rate and average length of stay.
Verify that metrics can be filtered by diagnosis, treatment type, severity, and setting.

TREATMENT MODALITY COMPARISON:

Examine whether the system supports comparison across treatment modalities (CBT vs. DBT vs. psychodynamic, individual vs. group).
Check for baseline severity matching in comparisons (severity-adjusted outcomes).
Verify that comparison handles selection bias (clients are not randomly assigned).
Look for dose-response analysis (relationship between session count and outcome).

QUALITY IMPROVEMENT FEEDBACK:

Check for outcome feedback to clinicians during active treatment.
Examine whether off-track alerts notify clinicians when a client is not progressing as expected (based on expected treatment response curves).
Verify that feedback includes actionable suggestions (consider treatment plan review, consider adjunctive treatment, consider increasing session frequency).
Look for client feedback tools (therapeutic alliance measures, session rating scales).

============================================================ PHASE 5: PROVIDER COMPARISON WITH RISK ADJUSTMENT

PROVIDER OUTCOME METRICS:

Examine how outcomes are aggregated at the provider level.
Check for: average improvement per client, response rate, remission rate, deterioration rate, dropout rate, caseload size, average length of treatment.
Verify that provider metrics are computed over a meaningful time period with sufficient sample sizes.
Look for confidence intervals or statistical significance testing on provider metrics.
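
One way a system might satisfy the sample-size and confidence-interval checks is a Wilson score interval with a minimum caseload gate. The sketch below is illustrative only; the `min_n=30` threshold and the function names are assumptions, not established requirements.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def provider_response_rate(responders, caseload, min_n=30):
    """Return rate and interval, or None when the caseload is too small."""
    if caseload < min_n:
        return None  # suppress the metric rather than report an unstable rate
    low, high = wilson_interval(responders, caseload)
    return {"rate": responders / caseload, "ci_low": low, "ci_high": high}
```

The Wilson interval is used here instead of the simpler normal approximation because it behaves better for the small caseloads and extreme rates that are common in provider-level data.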

RISK ADJUSTMENT:

Check for case-mix adjustment in provider comparisons.
Examine adjustment factors: baseline severity, diagnosis complexity, comorbidity count, prior treatment history, socioeconomic factors, treatment setting.
Verify that risk adjustment uses validated methodology (not ad hoc).
Look for transparency in risk adjustment methodology (clinicians can understand how their adjusted scores are calculated).
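
A common form of case-mix adjustment is indirect standardization, which compares a provider's observed outcomes with the outcomes expected for their caseload. The sketch below assumes the per-client expected probabilities come from a separately fitted and validated risk model; that model, and the `observed_to_expected` name, are hypothetical.

```python
def observed_to_expected(outcomes, expected_probs):
    """Observed/expected ratio for one provider.

    outcomes: 1 if the client responded, else 0.
    expected_probs: model-predicted response probability per client.
    """
    if not outcomes or len(outcomes) != len(expected_probs):
        raise ValueError("need one expected probability per observed outcome")
    expected = sum(expected_probs)
    if expected == 0:
        raise ValueError("expected count is zero; ratio is undefined")
    # A ratio above 1 means more responders than the case mix predicts.
    return sum(outcomes) / expected
```

A ratio like this is only as trustworthy as the risk model behind it, so the methodology check above applies to that model rather than to the division itself.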

BENCHMARKING:

Check for internal benchmarking (provider vs. organizational average).
Look for external benchmarking (organization vs. published norms or registry data).
Examine whether benchmarks are updated periodically.
Verify that benchmarking accounts for population differences.

PROVIDER FEEDBACK:

Check for individual provider dashboards showing their outcomes.
Examine how provider feedback is delivered (confidential report, supervisor meeting, peer comparison).
Verify that feedback is constructive (highlights strengths as well as areas for growth).
Look for peer learning facilitation (connecting high-performing providers with those seeking improvement).

============================================================ PHASE 6: EVIDENCE-BASED PRACTICE ALIGNMENT

EBP REGISTRY:

Check for a registry of evidence-based practices used in the system.
Examine whether treatment protocols are linked to specific evidence bases (clinical practice guidelines, systematic reviews, RCT evidence).
Verify that the evidence base is cited and accessible to clinicians.
Look for fidelity monitoring tools for structured treatment protocols.

PRACTICE PATTERN ANALYSIS:

Examine whether the system tracks adherence to evidence-based protocols.
Check for deviations from recommended practices (treatment duration, session frequency, instrument use, intervention selection).
Verify that deviation tracking is informational, not punitive.
Look for practice variation analysis across providers.

OUTCOME-PRACTICE CORRELATION:

Check for analysis linking practice patterns to outcomes (whether clients treated with protocol-adherent approaches have better outcomes).
Examine whether the system can identify effective local adaptations.
Verify that correlation analysis includes appropriate caveats about causation.
Look for continuous learning capabilities (outcomes data informing practice guidelines).

REPORTING AND COMPLIANCE:

Check for payer-required quality measure reporting (HEDIS, MIPS, state mandates).
Examine accreditation reporting capabilities (CARF, Joint Commission, NCQA).
Verify that reports can be generated on demand and on schedule.
Look for data export capabilities for research and quality improvement.

============================================================ SELF-HEALING VALIDATION (max 2 iterations)

After producing output, validate data quality and completeness:

Verify all output sections have substantive content (not just headers).
Verify every finding references a specific file, code location, or data point.
Verify recommendations are actionable and evidence-based.
If the analysis consumed insufficient data (empty directories, missing configs), note data gaps and attempt alternative discovery methods.

IF VALIDATION FAILS:

Identify which sections are incomplete or lack evidence.
Re-analyze the deficient areas with expanded search patterns.
Repeat up to 2 iterations.

IF STILL INCOMPLETE after 2 iterations:

Flag specific gaps in the output.
Note what data would be needed to complete the analysis.

============================================================ OUTPUT

Treatment Outcome Tracking Analysis

Platform: {detected stack and integrations}

Scope: {subsystems analyzed}

Instruments Implemented: {N} standardized measures

Outcome Metrics: {N} aggregate metrics tracked

Provider Comparison: {risk-adjusted/unadjusted/absent}

System Health Summary

| Domain | Score | Key Finding |
|---|---|---|
| Measurement Validity | {score}/100 | {finding} |
| Longitudinal Trends | {score}/100 | {finding} |
| Treatment Effectiveness | {score}/100 | {finding} |
| Provider Comparison | {score}/100 | {finding} |
| EBP Alignment | {score}/100 | {finding} |
| Overall | {score}/100 | {summary} |

Critical Findings

{OUT-001}: {title}
- Domain: {Measurement/Trends/Effectiveness/Provider/EBP}
- Location: {file:line}
- Impact: {what could go wrong for outcome validity or treatment quality}
- Recommendation: {specific improvement}

Instrument Implementation

| Instrument | Scoring | Cutoffs | Subscales | Critical Items | Missing Data |
|---|---|---|---|---|---|
| {name} | {correct/incorrect} | {correct/incorrect} | {present/absent} | {flagged/not} | {handled/not} |

Trend Analysis Capabilities

Individual trends: {present/absent}
Reliable change calculation: {present/absent}
Deterioration alerts: {present/absent}
Treatment phase segmentation: {present/absent}

Effectiveness Metrics

Response rate tracking: {present/absent}
Remission rate tracking: {present/absent}
Dropout analysis: {present/absent}
Modality comparison: {present/absent}

Provider Comparison Architecture

Risk adjustment: {method or absent}
Sample size requirements: {enforced/not}
Confidence intervals: {present/absent}
Feedback delivery: {dashboard/report/meeting/absent}

EBP Compliance

Practice registry: {present/absent}
Fidelity monitoring: {present/absent}
Regulatory reporting: {list of standards}

DO NOT:

Make clinical recommendations about treatment approaches or medication changes.
Evaluate the psychometric properties of instruments (focus on implementation accuracy).
Draw causal conclusions from observational outcome data.
Identify or compare individual providers by name (use anonymized identifiers).
Ignore risk adjustment limitations when interpreting provider comparisons.
Assess client care quality from outcome data alone (outcomes are one dimension).

NEXT STEPS:

"Run /crisis-risk-monitor to analyze how crisis events correlate with outcome trajectories."
"Run /care-plan-optimizer to evaluate treatment planning integration with outcomes."
"Run /therapist-documentation to review clinical documentation supporting outcome data."
"Run /security-review to audit access controls on outcome data and provider reports."

============================================================ SELF-EVOLUTION TELEMETRY

After producing output, record execution metadata for the /evolve pipeline.

Check if a project memory directory exists:

Look for the project path in ~/.claude/projects/.
If found, append to skill-telemetry.md in that memory directory.

Entry format:

### /treatment-outcome — {{YYYY-MM-DD}}
- Outcome: {{SUCCESS | PARTIAL | FAILED}}
- Self-healed: {{yes — what was healed | no}}
- Iterations used: {{N}} / {{N max}}
- Bottleneck: {{phase that struggled or "none"}}
- Suggestion: {{one-line improvement idea for /evolve, or "none"}}

Only log if the memory directory exists. Skip silently if not found. Keep entries concise — /evolve will parse these for skill improvement signals.
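
The logging rule above can be sketched as follows. This is a minimal illustration, assuming the memory directory has already been located and is passed in as `memory_dir`; the `append_telemetry` name and its parameters are hypothetical.

```python
from datetime import date
from pathlib import Path


def append_telemetry(memory_dir, outcome, self_healed, iterations,
                     max_iterations, bottleneck="none", suggestion="none",
                     today=None):
    """Append one entry to skill-telemetry.md; return False if skipped."""
    memory_dir = Path(memory_dir)
    if not memory_dir.is_dir():
        return False  # skip silently when the memory directory is absent
    stamp = (today or date.today()).isoformat()
    # \u2014 is the dash used in the entry heading format above.
    entry = (
        f"### /treatment-outcome \u2014 {stamp}\n"
        f"- Outcome: {outcome}\n"
        f"- Self-healed: {self_healed}\n"
        f"- Iterations used: {iterations} / {max_iterations}\n"
        f"- Bottleneck: {bottleneck}\n"
        f"- Suggestion: {suggestion}\n\n"
    )
    # Append mode creates the file on first use and never overwrites entries.
    with open(memory_dir / "skill-telemetry.md", "a", encoding="utf-8") as f:
        f.write(entry)
    return True
```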