name: stem-ai
description: "Deterministic Evidence-Surface Scanner for Bio/Medical AI Repositories. Audits and reviews open-source bio/medical AI repositories for repository evidence-surface triage using a rubric-based 3-stage evaluation protocol with governance overlay. Produces scored review-priority reports with evidence chains. Supports 4 execution modes: LOCAL_ANALYSIS (AI CLI + local clone), FULL (web search + fetch), SEARCH_ONLY, and MANUAL. Use when asked to evaluate, audit, review, or assess evidence signals for any bio-AI, medical AI, or clinical-adjacent repository."
version: "1.8.4"
author: "Flamehaven"
license: "Apache-2.0"
platforms: ["claude-code", "codex", "gemini-cli", "cursor", "copilot", "antigravity", "universal"]
STEM BIO-AI -- Deterministic Evidence-Surface Scanner for Bio/Medical AI Repositories
Version: 1.8.4 Codename: Hippocratic_Code_Engine_Unified Runtime: LLM-Native + AI CLI (Universal)
"Code works. But does the author care about the patient? Governance without evidence is theater. Evidence without accountability is still not trust. Measurement beats interpretation."
When to Use This Skill
- Evaluating repository evidence signals for a bio-AI or medical AI repository
- Auditing open-source clinical-adjacent tools before procurement or pilot
- Assessing governance maturity of repositories handling patient data
- Generating structured audit reports with evidence chains and scores
- Comparing repository review-priority tiers across an ecosystem
- Producing institutional-grade documentation (Claim Matrix, Evidence Ledger)
What This Skill Produces
- STEM BIO-AI Audit Report -- scored repository evidence-surface triage (T0-T4 review-priority tier)
- Executive Summary -- 1-page institutional decision support
- Claim Matrix -- line-level evidence anchors for every finding
- Evidence Ledger -- snapshot provenance and artifact tracking
- Code Integrity Report -- C1-C6 findings (LOCAL_ANALYSIS only)
Audit Layering
STEM BIO-AI sits on top of a technical audit. It does not replace one.
- Technical audit determines what the repository actually does.
- STEM BIO-AI determines whether the observable artifact surface is sufficient for institutional review triage.
Use this skill after or alongside technical inspection, not instead of it.
Quick Start
To audit a repository, provide:
- GitHub URL or README text
- (Optional) CHANGELOG, social media activity, CI/CD status
- (Optional) Governance overlay materials
The skill will:
- Detect execution mode (LOCAL_ANALYSIS / FULL / SEARCH_ONLY / MANUAL)
- Run 3-stage evaluation (README Evidence Signal, Repo-Local Consistency, Code/Bio Responsibility)
- Score with fixed rubric (cross-LLM target: +/-10 points)
- Evaluate governance overlay if artifacts present
- Generate multi-file output package
Skill Architecture
stem-ai/
  SKILL.md                          <-- You are here (entry point)
  memory/                           <-- MICA v0.2.4 memory layer (load first)
    mica.yaml                       <-- composition contract
    stem-ai.mica.v1.8.4.json        <-- active archive (selected by mica.yaml)
    stem-ai-playbook.v1.8.4.md      <-- active session protocol (selected by mica.yaml)
    stem-ai-lessons.v1.8.4.md       <-- active lessons history (selected by mica.yaml)
  spec/                             <-- Core rubric, scoring, execution rules
  discrimination/                   <-- YES/NO example pairs for scoring consistency
  templates/                        <-- Output templates (report, claim matrix, etc.)
  scripts/                          <-- Automation scripts (scans, provenance)
  references/                       <-- Lookup tables (tiers, triggers, taxonomy)
  examples/                         <-- Real audit examples
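
The layout above can be checked before an audit starts. The following is a minimal sketch, not part of the skill itself: the entry list is copied from the listing, while the `missing_entries` helper and the choice to check only top-level directories plus `memory/mica.yaml` are assumptions made for illustration.

```python
from pathlib import Path

# Entries taken from the layout listed above; the check itself is illustrative.
EXPECTED_ENTRIES = [
    "SKILL.md",
    "memory/mica.yaml",
    "spec",
    "discrimination",
    "templates",
    "scripts",
    "references",
    "examples",
]


def missing_entries(skill_root):
    """Return the expected entries that do not exist under skill_root."""
    root = Path(skill_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

Calling `missing_entries(".")` from the package root returns an empty list when every listed entry is present, and otherwise names what is missing.
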
Instructions
When activated, load files in this order:
Load MICA memory layer first (before any audit work):
- Load `memory/mica.yaml` -- verify package structure and mode
- Load the archive file referenced by `memory/mica.yaml` -- activate 18 IMMUTABLE rules as design_invariants
- Load the playbook file referenced by `memory/mica.yaml` -- session protocol and rubric drift guard
- Run `python tools/mica_pct.py .` -- verify PCT-001 through PCT-011. Halt on PCT-001/002/003/004 failure.
- Run `python tools/mica_runtime.py . --format text`
- Report: `[MICA READY] stem-ai-bio v1.8.4 | mode: protocol_evolution | invariants: 18 active | pct: CLOSED`
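
The halt rule for the PCT checks can be expressed as a small gate. This is a sketch under stated assumptions: the output format of `tools/mica_pct.py` is not described here, so the `{check_id: passed}` dictionary and the `preflight_decision` helper are hypothetical. Treating failures outside PCT-001 through PCT-004 as reportable but non-halting is also an assumption.

```python
# Checks named as halting in the load steps above.
HALTING_CHECKS = ("PCT-001", "PCT-002", "PCT-003", "PCT-004")


def preflight_decision(results):
    """Decide whether to halt, given a {check_id: passed} mapping.

    The mapping shape is hypothetical; it is not the documented output
    of tools/mica_pct.py.
    """
    failed = sorted(check for check, passed in results.items() if not passed)
    halting = [check for check in failed if check in HALTING_CHECKS]
    return {"halt": bool(halting), "halting_failures": halting, "failed": failed}
```

For example, with PCT-001 through PCT-011 all passing except PCT-003, the returned `halt` flag is `True` and `halting_failures` is `["PCT-003"]`.
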
Always load next:
- `spec/STEM-AI_v1.1.2_CORE.md`
This is the canonical rubric and execution instruction.
Load on demand during Stage 1:
- `discrimination/H1-H6_examples.md`
- `references/clinical_adjacent_triggers.md`
Load on demand during Stage 3:
- `discrimination/T2_examples.md`
- `discrimination/B3_COI_guide.md`
- `discrimination/CA_severity_examples.md`
Load if governance overlay detected:
- `discrimination/G1-G5_examples.md`
Load for output generation:
- `templates/audit_report.md`
- `templates/claim_matrix.md`
- `templates/executive_summary.md`
- `templates/evidence_ledger.md`
Run in LOCAL_ANALYSIS mode:
- `scripts/local_analysis_scan.sh`
- `scripts/ca_detection_scan.sh`
- `scripts/snapshot_provenance.sh`
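
The conditional loading above amounts to a lookup from triggers to file groups. The sketch below is illustrative only: the file paths are those listed above, but the trigger names (`stage1`, `stage3`, `governance`, `output`, `local_analysis`) and the `load_plan` helper are hypothetical. It covers only what follows the MICA memory layer, which is loaded first.

```python
# Loaded on every run, after the MICA memory layer.
ALWAYS = ["spec/STEM-AI_v1.1.2_CORE.md"]

# Trigger names are hypothetical labels for the conditions listed above.
ON_DEMAND = {
    "stage1": [
        "discrimination/H1-H6_examples.md",
        "references/clinical_adjacent_triggers.md",
    ],
    "stage3": [
        "discrimination/T2_examples.md",
        "discrimination/B3_COI_guide.md",
        "discrimination/CA_severity_examples.md",
    ],
    "governance": ["discrimination/G1-G5_examples.md"],
    "output": [
        "templates/audit_report.md",
        "templates/claim_matrix.md",
        "templates/executive_summary.md",
        "templates/evidence_ledger.md",
    ],
    "local_analysis": [
        "scripts/local_analysis_scan.sh",
        "scripts/ca_detection_scan.sh",
        "scripts/snapshot_provenance.sh",
    ],
}


def load_plan(triggers):
    """Return the core rubric followed by every triggered group, in listing order."""
    plan = list(ALWAYS)
    for name, files in ON_DEMAND.items():  # dicts preserve insertion order
        if name in triggers:
            plan.extend(files)
    return plan
```

A run with only a governance overlay detected, `load_plan({"governance"})`, yields the core rubric followed by `discrimination/G1-G5_examples.md`.
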
Execution Modes
| Mode | Environment | Evidence Quality | C1-C4 |
|---|---|---|---|
| LOCAL_ANALYSIS | AI CLI + local clone | CODE_PATH (measurement) | Active |
| FULL | Online LLM + web tools | TEXT_PATH + web fetch | N/A |
| SEARCH_ONLY | Online LLM + search only | TEXT_PATH + search | N/A |
| MANUAL | Online LLM, no tools | TEXT_PATH only | N/A |
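
Mode selection from the table can be sketched as a capability check. The three boolean inputs and both function names are hypothetical, and the precedence (local clone first, then web tools) is an assumption; the table itself does not state how conflicts are resolved. In this sketch, fetch without search falls through to MANUAL.

```python
def detect_mode(has_local_clone, has_web_search, has_web_fetch):
    """Pick an execution mode from the available capabilities (illustrative)."""
    if has_local_clone:
        return "LOCAL_ANALYSIS"  # AI CLI + local clone
    if has_web_search and has_web_fetch:
        return "FULL"  # web search + fetch
    if has_web_search:
        return "SEARCH_ONLY"  # search without fetch
    return "MANUAL"  # no tools


def code_checks_active(mode):
    """Per the table, the C1-C4 column is Active only in LOCAL_ANALYSIS."""
    return mode == "LOCAL_ANALYSIS"
```

For instance, `detect_mode(False, True, False)` returns `"SEARCH_ONLY"`, and `code_checks_active("FULL")` returns `False`.
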
Tier Definitions
| Tier | Score | Meaning |
|---|---|---|
| T0 Rejected | 0-39 | Trust not established -- clinical use prohibited |
| T1 Quarantine | 40-54 | High risk -- independent verification required |
| T2 Caution | 55-69 | Research reference only -- clinical automation forbidden |
| T3 Review | 70-84 | Supervised clinical pilot eligible -- oversight mandatory |
| T4 Candidate | 85-100 | Strongest structural audit-readiness signal -- clinical deployment still requires independent validation |
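
The score bands in the table map to tiers as in the sketch below. The `tier_for` helper is hypothetical. The table lists integer bands, so how a fractional score is handled is an assumption: here it falls into the band whose lower bound it has reached (69.5 maps to T2 Caution).

```python
# (lower bound, tier) pairs from the table above, highest band first.
TIERS = [
    (85, "T4 Candidate"),
    (70, "T3 Review"),
    (55, "T2 Caution"),
    (40, "T1 Quarantine"),
    (0, "T0 Rejected"),
]


def tier_for(score):
    """Map a 0-100 score to its review-priority tier."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, tier in TIERS:
        if score >= lower_bound:
            return tier
```

For example, `tier_for(69)` returns `"T2 Caution"` and `tier_for(70)` returns `"T3 Review"`.
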