clinical-text-summarization - SKILL.md Agent Skill

name: clinical-text-summarization
description: Summarize long clinical and biomedical text (discharge summaries, progress-note bundles, radiology/pathology reports, and literature) using extractive and abstractive methods. Covers transformer summarizers (BART/PEGASUS/T5, clinical/long-document variants), chunking for long inputs, and faithfulness/hallucination checking against the source. Use to produce a concise problem-oriented summary, a "one-liner" hospital course, or a literature digest while guarding against fabricated facts.
keywords:
  - summarization
  - clinical nlp
  - abstractive
  - extractive
  - discharge summary
  - transformers
  - faithfulness
  - text mining
license: MIT
metadata:
  author: MedClawMini
  version: "1.0.0"
compatibility:
  - OpenClaw
allowed-tools:
  - run_shell_command
  - web_fetch

Clinical Text Summarization

Overview

Clinicians and analysts drown in text. This skill condenses long clinical documents into faithful, structured summaries. Because hallucination is unacceptable in healthcare, the skill pairs generation with an explicit faithfulness check that verifies summary claims against the source.

When to Use This Skill

Turning a multi-day note bundle into a problem-oriented hospital course one-liner.
Summarizing radiology/pathology reports into impressions.
Producing a literature digest from many abstracts (pairs with pubmed-search).
Compressing context before downstream extraction or QA.

Methods

Extractive (safe default): rank and select source sentences (TextRank, or embedding-centroid selection). No fabricated wording, because every sentence is verbatim, although a sentence lifted away from its context (a negation, a date) can still mislead; best when faithfulness dominates.
Abstractive: generate new phrasing with a transformer (BART/PEGASUS/T5; long-input variants such as Longformer-Encoder-Decoder for inputs beyond roughly 1k tokens; clinical-tuned models where licensing allows).
Hybrid: extract salient sentences, then abstractively smooth them (extractive grounding reduces hallucination).
Long documents: chunk → summarize chunks → summarize the summaries (map-reduce / refine), preserving section structure.
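The extractive path can be sketched without any model at all. The block below is a minimal illustration of centroid selection, assuming plain term-frequency vectors in place of sentence embeddings and a naive regex sentence splitter; the function names (`split_sentences`, `tf_vector`, `extractive_summary`) are hypothetical and not part of this skill.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter for illustration; abbreviations such as "Dr." will break it.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tf_vector(sentence):
    # Lowercased word counts stand in for a sentence embedding (an assumption).
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extractive_summary(text, k=2):
    """Return the k sentences closest to the document centroid, in document order."""
    sentences = split_sentences(text)
    vectors = [tf_vector(s) for s in sentences]
    centroid = Counter()
    for vec in vectors:
        centroid.update(vec)
    # Rank by similarity to the centroid, then restore the original order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: cosine(vectors[i], centroid),
                    reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Every returned sentence is a verbatim substring of the input, which is the property that makes this path the safe default. Swapping `tf_vector` for a real sentence encoder would change the ranking but not that guarantee.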

Example

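from transformers import pipeline

summ = pipeline("summarization", model="facebook/bart-large-cnn")
tok = summ.tokenizer

def summarize_long(text, chunk=900):
    # Chunk by model tokens, not words: BART accepts at most 1024 tokens,
    # and 900 words of clinical text usually tokenizes to more than that.
    ids = tok(text, add_special_tokens=False)["input_ids"]
    parts = [tok.decode(ids[i:i + chunk]) for i in range(0, len(ids), chunk)]
    partials = [summ(p, max_length=130, min_length=30, truncation=True)[0]["summary_text"] for p in parts]
    if len(partials) == 1:  # short input: a second pass would only re-summarize the summary
        return partials[0]
    # Reduce step; truncation guards against the joined partials exceeding the limit.
    return summ(" ".join(partials), max_length=160, min_length=40, truncation=True)[0]["summary_text"]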

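# Faithfulness guard: every summary entity must appear in the source
import medspacy
from medspacy.ner import TargetRule
nlp = medspacy.load()
# medspacy's default pipeline ships no target rules, so .ents is empty and the guard
# passes vacuously until rules are added. The rule below is illustrative only; the
# pipe name is "medspacy_target_matcher" in recent medspacy releases.
nlp.get_pipe("medspacy_target_matcher").add([TargetRule("pneumonia", "PROBLEM")])
def unsupported_entities(summary, source):
    src = {e.text.lower() for e in nlp(source).ents}
    return [e.text for e in nlp(summary).ents if e.text.lower() not in src]
# non-empty list => possible hallucination => fall back to extractive
# (string match only: it does not catch a flipped negation or a changed dose)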
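
The chunker above splits at fixed offsets, so a chunk can straddle two note sections. One way to preserve section structure, as the Methods list asks, is to split on section headers first and chunk inside each section. The sketch below assumes headers sit on their own line and end with a colon (for example `Hospital Course:`); that pattern and the names `SECTION_HEADER`, `split_sections`, and `chunk_sections` are hypothetical and would need adapting to the note templates actually in use.

```python
import re

# Hypothetical header pattern: a short capitalized label alone on a line, ending in a colon.
SECTION_HEADER = re.compile(r"^(?P<name>[A-Z][A-Za-z /&-]{1,40}):[ \t]*$", re.MULTILINE)


def split_sections(note):
    """Return (section_name, body) pairs in document order."""
    matches = list(SECTION_HEADER.finditer(note))
    if not matches:
        return [("", note.strip())]
    sections = []
    preamble = note[:matches[0].start()].strip()
    if preamble:
        sections.append(("", preamble))  # text before the first header
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following else len(note)
        sections.append((current.group("name"), note[current.end():end].strip()))
    return sections


def chunk_sections(note, max_words=300):
    """Split each section into word-bounded chunks that never cross a section boundary."""
    chunks = []
    for name, body in split_sections(note):
        words = body.split()
        for i in range(0, len(words), max_words):
            chunks.append((name, " ".join(words[i:i + max_words])))
    return chunks
```

Each `(section_name, chunk)` pair can then be summarized separately and the partial summaries regrouped by section before the reduce step. Word counts are used here only to keep the sketch dependency-free; a token-based limit is safer for a model with a hard input cap.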

Evaluation

Report ROUGE and BERTScore for overlap with reference summaries, but treat them as necessary, not sufficient: they do not detect hallucination. Add a faithfulness/factual-consistency metric (the entity-overlap check above, or an NLI/QA-based check) and a clinician spot-review for high-stakes use. Prefer the extractive path whenever a hallucinated fact would be dangerous.
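
To make the "not sufficient" point concrete, the sketch below implements a simplified ROUGE-1 F1 (clipped unigram overlap, no stemming, so scores will differ slightly from standard ROUGE tooling) and applies it to two invented candidate summaries. The sentences are made-up illustrations, not data from any evaluation.

```python
import re
from collections import Counter


def rouge1_f1(candidate, reference):
    """Simplified ROUGE-1 F1: clipped unigram overlap, lowercase, no stemming."""
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    overlap = sum((cand & ref).values())  # Counter intersection clips counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Invented example sentences for illustration only.
reference = "Patient started on warfarin 5 mg daily for atrial fibrillation."
faithful = "Warfarin 5 mg daily was started for atrial fibrillation."
wrong_dose = "Patient started on warfarin 50 mg daily for atrial fibrillation."

print("faithful paraphrase:", round(rouge1_f1(faithful, reference), 2))
print("tenfold dose error: ", round(rouge1_f1(wrong_dose, reference), 2))
```

The summary with the tenfold dose error differs from the reference by a single token and scores higher than the faithful paraphrase, which is why an overlap metric needs a separate faithfulness check alongside it.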

Outputs

summaries.parquet: doc_id, summary, method, ROUGE/BERTScore, faithfulness flag.
flagged_summaries.csv: summaries with unsupported claims for review.
summary_report.md: method comparison and quality metrics.
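
As a rough sketch of how the review file might be produced, the block below writes only the rows that carry unsupported claims, using the standard-library `csv` module. The column names in `FIELDS`, the `unsupported` key, and the `write_flagged` helper are assumptions for illustration; the skill does not specify the CSV schema.

```python
import csv
import io

# Hypothetical column layout; the actual schema is not specified by the skill.
FIELDS = ["doc_id", "summary", "method", "unsupported"]


def write_flagged(rows, handle):
    """Write rows with at least one unsupported claim; return how many were written."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    flagged = 0
    for row in rows:
        if row["unsupported"]:
            # Join the list so each flagged summary stays on one CSV row.
            writer.writerow({**row, "unsupported": "; ".join(row["unsupported"])})
            flagged += 1
    return flagged


rows = [
    {"doc_id": "d1", "summary": "Treated for pneumonia.", "method": "abstractive", "unsupported": []},
    {"doc_id": "d2", "summary": "Started on warfarin.", "method": "abstractive", "unsupported": ["warfarin"]},
]
buffer = io.StringIO()
count = write_flagged(rows, buffer)
```

In practice `handle` would be a file opened with `newline=""`, for example `open("flagged_summaries.csv", "w", newline="")`, which is what the `csv` module expects.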

Healthcare Context

Designed for the long, repetitive, template-heavy nature of clinical notes. De-identify PHI first. The faithfulness guard reflects the clinical-safety bar: a fluent-but-wrong summary is worse than a plain extractive one. Complements clinical-nlp-entity-extraction and clinical-text-search-elk.

References

Hugging Face summarization: https://huggingface.co/docs/transformers/tasks/summarization
BART (Lewis et al. 2020); PEGASUS (Zhang et al. 2020); BERTScore (Zhang et al. 2020).
Survey: faithfulness/hallucination in clinical summarization.