stem-ai - SKILL.md Agent Skill

name: stem-ai
description: "Deterministic Evidence-Surface Scanner for Bio/Medical AI Repositories. Audits and reviews open-source bio/medical AI repositories for repository evidence-surface triage using a rubric-based 3-stage evaluation protocol with governance overlay. Produces scored review-priority reports with evidence chains. Supports 4 execution modes: LOCAL_ANALYSIS (AI CLI + local clone), FULL (web search + fetch), SEARCH_ONLY, and MANUAL. Use when asked to evaluate, audit, review, or assess evidence signals for any bio-AI, medical AI, or clinical-adjacent repository."
version: "1.8.4"
author: "Flamehaven"
license: "Apache-2.0"
platforms: ["claude-code", "codex", "gemini-cli", "cursor", "copilot", "antigravity", "universal"]

STEM BIO-AI -- Deterministic Evidence-Surface Scanner for Bio/Medical AI Repositories

Version: 1.8.4
Codename: Hippocratic_Code_Engine_Unified
Runtime: LLM-Native + AI CLI (Universal)

"Code works. But does the author care about the patient? Governance without evidence is theater. Evidence without accountability is still not trust. Measurement beats interpretation."

When to Use This Skill

Evaluating repository evidence signals for a bio-AI or medical AI repository
Auditing open-source clinical-adjacent tools before procurement or pilot
Assessing governance maturity of repositories handling patient data
Generating structured audit reports with evidence chains and scores
Comparing repository review-priority tiers across an ecosystem
Producing institutional-grade documentation (Claim Matrix, Evidence Ledger)

What This Skill Produces

STEM BIO-AI Audit Report -- scored repository evidence-surface triage (T0-T4 review-priority tier)
Executive Summary -- 1-page institutional decision support
Claim Matrix -- line-level evidence anchors for every finding
Evidence Ledger -- snapshot provenance and artifact tracking
Code Integrity Report -- C1-C6 findings (LOCAL_ANALYSIS only)

Audit Layering

STEM BIO-AI sits on top of a technical audit. It does not replace one.

Technical audit determines what the repository actually does.
STEM BIO-AI determines whether the observable artifact surface is sufficient for institutional review triage.

Use this skill after or alongside technical inspection, not instead of it.

Quick Start

To audit a repository, provide:

GitHub URL or README text
(Optional) CHANGELOG, social media activity, CI/CD status
(Optional) Governance overlay materials

The skill will:

Detect execution mode (LOCAL_ANALYSIS / FULL / SEARCH_ONLY / MANUAL)
Run 3-stage evaluation (README Evidence Signal, Repo-Local Consistency, Code/Bio Responsibility)
Score with the fixed rubric (cross-LLM target: scores agree within +/-10 points)
Evaluate governance overlay if artifacts present
Generate multi-file output package
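The cross-LLM target above can be checked mechanically once several runs of the same audit exist. The following is a minimal sketch, assuming the target means that the gap between the highest and lowest final score for one repository snapshot stays within 10 points. The function name and the sample scores are hypothetical, and the rubric under spec/ remains the authority on how the target is defined.

```python
def within_cross_llm_target(scores, tolerance=10):
    """Return True when every score lies within `tolerance` points of the others.

    Assumption: agreement is measured as max(scores) - min(scores).
    """
    if not scores:
        raise ValueError("at least one score is required")
    return max(scores) - min(scores) <= tolerance


# Hypothetical final scores from three runs on the same snapshot.
runs = {"model_a": 72, "model_b": 66, "model_c": 78}
print(within_cross_llm_target(list(runs.values())))
```

If the check fails, the discrimination/ example pairs are the natural place to look for the criterion on which the runs disagreed.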

Skill Architecture

stem-ai/
  SKILL.md                    <-- You are here (entry point)
  memory/                     <-- MICA v0.2.4 memory layer (load first)
    mica.yaml                 <-- composition contract
    stem-ai.mica.v1.8.4.json  <-- active archive (selected by mica.yaml)
    stem-ai-playbook.v1.8.4.md <-- active session protocol (selected by mica.yaml)
    stem-ai-lessons.v1.8.4.md  <-- active lessons history (selected by mica.yaml)
  spec/                       <-- Core rubric, scoring, execution rules
  discrimination/             <-- YES/NO example pairs for scoring consistency
  templates/                  <-- Output templates (report, claim matrix, etc.)
  scripts/                    <-- Automation scripts (scans, provenance)
  references/                 <-- Lookup tables (tiers, triggers, taxonomy)
  examples/                   <-- Real audit examples

Instructions

When activated, load files in this order:

Load MICA memory layer first (before any audit work):
- Load memory/mica.yaml -- verify package structure and mode
- Load the archive file referenced by memory/mica.yaml -- activate 18 IMMUTABLE rules as design_invariants
- Load the playbook file referenced by memory/mica.yaml -- session protocol and rubric drift guard
- Run python tools/mica_pct.py . -- verify PCT-001 through PCT-011. Halt on PCT-001/002/003/004 failure.
- Run python tools/mica_runtime.py . --format text
- Report: [MICA READY] stem-ai-bio v1.8.4 | mode: protocol_evolution | invariants: 18 active | pct: CLOSED
Always load next: spec/STEM-AI_v1.1.2_CORE.md. This is the canonical rubric and execution instruction.
Load on demand during Stage 1:
- discrimination/H1-H6_examples.md
- references/clinical_adjacent_triggers.md
Load on demand during Stage 3:
- discrimination/T2_examples.md
- discrimination/B3_COI_guide.md
- discrimination/CA_severity_examples.md
Load if governance overlay detected:
- discrimination/G1-G5_examples.md
Load for output generation:
- templates/audit_report.md
- templates/claim_matrix.md
- templates/executive_summary.md
- templates/evidence_ledger.md
Run in LOCAL_ANALYSIS mode:
- scripts/local_analysis_scan.sh
- scripts/ca_detection_scan.sh
- scripts/snapshot_provenance.sh
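The load order above can be written down as data, which makes it easier to see what a given stage needs. This is a rough sketch only: the file paths and the halting PCT identifiers are transcribed from the instructions, while the stage keys, function names, and dictionary layout are hypothetical. The archive and playbook are left out because their paths are resolved through memory/mica.yaml.

```python
# Files loaded on every activation, in order.
ALWAYS = ["memory/mica.yaml", "spec/STEM-AI_v1.1.2_CORE.md"]

# On-demand files keyed by a hypothetical stage label.
ON_DEMAND = {
    "stage1": [
        "discrimination/H1-H6_examples.md",
        "references/clinical_adjacent_triggers.md",
    ],
    "stage3": [
        "discrimination/T2_examples.md",
        "discrimination/B3_COI_guide.md",
        "discrimination/CA_severity_examples.md",
    ],
    "output": [
        "templates/audit_report.md",
        "templates/claim_matrix.md",
        "templates/executive_summary.md",
        "templates/evidence_ledger.md",
    ],
}
GOVERNANCE = ["discrimination/G1-G5_examples.md"]
LOCAL_SCRIPTS = [
    "scripts/local_analysis_scan.sh",
    "scripts/ca_detection_scan.sh",
    "scripts/snapshot_provenance.sh",
]

# PCT failures that stop the session before any audit work.
HALTING_PCT = {"PCT-001", "PCT-002", "PCT-003", "PCT-004"}


def must_halt(failed_checks):
    """True if any failed check is one of the four halting checks."""
    return bool(HALTING_PCT & set(failed_checks))


def load_plan(stage, governance_overlay=False):
    """Ordered list of files to load for the given stage label."""
    plan = ALWAYS + ON_DEMAND.get(stage, [])
    if governance_overlay:
        plan += GOVERNANCE
    return plan


def scripts_to_run(mode):
    """The scan scripts apply only in LOCAL_ANALYSIS mode."""
    return list(LOCAL_SCRIPTS) if mode == "LOCAL_ANALYSIS" else []


print(load_plan("stage1"))
print(must_halt(["PCT-007"]))
```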

Execution Modes

Mode	Environment	Evidence Quality	C1-C4
LOCAL_ANALYSIS	AI CLI + local clone	CODE_PATH (measurement)	Active
FULL	Online LLM + web tools	TEXT_PATH + web fetch	N/A
SEARCH_ONLY	Online LLM + search only	TEXT_PATH + search	N/A
MANUAL	Online LLM, no tools	TEXT_PATH only	N/A
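The table reads as a decision over what tooling the session has. Below is a minimal sketch of that decision, assuming three boolean capability flags (hypothetical names) and assuming the mode with the strongest evidence path is preferred when several are possible. FULL is taken to require both search and fetch, following the skill description. The detection rules in the spec may differ.

```python
# Evidence quality per mode, transcribed from the table.
EVIDENCE_QUALITY = {
    "LOCAL_ANALYSIS": "CODE_PATH (measurement)",
    "FULL": "TEXT_PATH + web fetch",
    "SEARCH_ONLY": "TEXT_PATH + search",
    "MANUAL": "TEXT_PATH only",
}


def detect_mode(has_local_clone, has_web_search, has_web_fetch):
    """Pick an execution mode from the available capabilities."""
    if has_local_clone:
        return "LOCAL_ANALYSIS"
    if has_web_search and has_web_fetch:
        return "FULL"
    if has_web_search:
        return "SEARCH_ONLY"
    # Assumption: fetch without search is not a listed mode, so it falls back here.
    return "MANUAL"


mode = detect_mode(has_local_clone=False, has_web_search=True, has_web_fetch=False)
print(mode, "->", EVIDENCE_QUALITY[mode])  # SEARCH_ONLY -> TEXT_PATH + search
```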

Tier Definitions

Tier	Score	Meaning
T0 Rejected	0-39	Trust not established -- clinical use prohibited
T1 Quarantine	40-54	High risk -- independent verification required
T2 Caution	55-69	Research reference only -- clinical automation forbidden
T3 Review	70-84	Supervised clinical pilot eligible -- oversight mandatory
T4 Candidate	85-100	Strongest structural audit-readiness signal -- clinical deployment still requires independent validation
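The score bands translate directly into a lookup. A minimal sketch follows, with the band boundaries taken from the table; the function name is hypothetical, and scores are assumed to be numbers in the range 0 to 100.

```python
# Lower bound of each band, highest first, from the tier table.
TIER_FLOORS = [
    (85, "T4 Candidate"),
    (70, "T3 Review"),
    (55, "T2 Caution"),
    (40, "T1 Quarantine"),
    (0, "T0 Rejected"),
]


def tier_for(score):
    """Map a final score to its review-priority tier."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for floor, name in TIER_FLOORS:
        if score >= floor:
            return name


print(tier_for(69))  # T2 Caution
print(tier_for(70))  # T3 Review
```

The tier only names a review priority. Even T4 leaves independent validation as a precondition for clinical deployment, as the table states.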