---
name: clinical-text-summarization
description: Summarize long clinical and biomedical text (discharge summaries, progress-note bundles, radiology/pathology reports, and literature) using extractive and abstractive methods. Covers transformer summarizers (BART/PEGASUS/T5, clinical/long-document variants), chunking for long inputs, and faithfulness/hallucination checking against the source. Use to produce a concise problem-oriented summary, a "one-liner" hospital course, or a literature digest while guarding against fabricated facts.
keywords:
  - summarization
  - clinical nlp
  - abstractive
  - extractive
  - discharge summary
  - transformers
  - faithfulness
  - text mining
license: MIT
metadata:
  author: MedClawMini
  version: "1.0.0"
compatibility:
  - OpenClaw
allowed-tools:
  - run_shell_command
  - web_fetch
---
Clinical Text Summarization
Overview
Clinicians and analysts drown in text. This skill condenses long clinical documents into faithful, structured summaries. Because hallucination is unacceptable in healthcare, the skill pairs generation with an explicit faithfulness check that verifies summary claims against the source.
When to Use This Skill
- Turning a multi-day note bundle into a problem-oriented hospital course one-liner.
- Summarizing radiology/pathology reports into impressions.
- Producing a literature digest from many abstracts (pairs with pubmed-search).
- Compressing context before downstream extraction or QA.
Methods
- Extractive (safe default): rank and select source sentences (TextRank or embedding-centroid selection). No fabricated text, because every sentence is verbatim, though a sentence lifted out of context can still mislead; best when faithfulness dominates.
- Abstractive: generate new phrasing with a transformer (BART/PEGASUS/T5; long-input variants such as Longformer-Encoder-Decoder for inputs beyond roughly 1k tokens; clinically tuned models where licensing allows).
- Hybrid: extract salient sentences, then abstractively smooth them (extractive grounding reduces hallucination).
- Long documents: chunk → summarize chunks → summarize the summaries (map-reduce / refine), preserving section structure.
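As an illustration of the extractive path, here is a minimal centroid-selection sketch. It swaps the embeddings mentioned above for plain term-frequency vectors so that it runs on the standard library alone. `split_sentences`, `tokenize`, `cosine`, and `centroid_extract` are hypothetical helpers, and the regex sentence splitter is an assumption that real clinical notes will often break.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive split after ., ! or ? followed by whitespace (assumption: clinical
    # abbreviations such as "Dr." would need a dedicated sentence segmenter).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid_extract(text, k=2):
    """Return the k sentences closest to the document centroid, in source order."""
    sentences = split_sentences(text)
    vectors = [Counter(tokenize(s)) for s in sentences]
    centroid = Counter()
    for v in vectors:
        centroid.update(v)  # term-frequency centroid (stand-in for an embedding mean)
    ranked = sorted(range(len(sentences)),
                    key=lambda i: cosine(vectors[i], centroid),
                    reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Because the output is a list of verbatim source sentences, it can serve as the fallback whenever an abstractive summary fails a faithfulness check.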
Example
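```python
from transformers import pipeline

summ = pipeline("summarization", model="facebook/bart-large-cnn")

def summarize_long(text, chunk=900):
    """Map-reduce: summarize fixed-size word chunks, then summarize the partials."""
    words = text.split()
    parts = [" ".join(words[i:i + chunk]) for i in range(0, len(words), chunk)]
    # truncation=True keeps chunks within BART's 1024-token input limit; 900 words
    # of clinical text can tokenize to more, so lower `chunk` if content is being cut.
    partials = [summ(p, max_length=130, min_length=30, truncation=True)[0]["summary_text"]
                for p in parts]
    if len(partials) <= 1:
        # Nothing to reduce: return the single partial (or "" for empty input).
        return partials[0] if partials else ""
    return summ(" ".join(partials), max_length=160, min_length=40,
                truncation=True)[0]["summary_text"]

# Faithfulness guard: every summary entity must appear in the source.
# Note: medspacy's default pipeline ships without target rules, so add TargetRule
# entries (or load a model with an NER component) before relying on doc.ents.
import medspacy
nlp = medspacy.load()

def unsupported_entities(summary, source):
    src = {e.text.lower() for e in nlp(source).ents}
    return [e.text for e in nlp(summary).ents if e.text.lower() not in src]

# A non-empty list means possible hallucination: fall back to extractive.
```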
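The long-document method calls for preserving section structure, which a fixed word window does not do. The sketch below shows one way to chunk on section boundaries instead. The header pattern (an all-caps line ending in a colon) and the helper names `split_sections` and `chunk_sections` are assumptions for illustration; real note templates vary by site.

```python
import re

# Assumed header convention: an all-caps line ending in a colon, e.g. "HOSPITAL COURSE:".
HEADER = re.compile(r"^([A-Z][A-Z /]+):[ \t]*$", re.MULTILINE)


def split_sections(note):
    """Return (header, body) pairs; text before the first header gets header ''."""
    matches = list(HEADER.finditer(note))
    sections = []
    preamble = note[:matches[0].start()] if matches else note
    if preamble.strip():
        sections.append(("", preamble.strip()))
    for m, nxt in zip(matches, matches[1:] + [None]):
        end = nxt.start() if nxt else len(note)
        sections.append((m.group(1), note[m.end():end].strip()))
    return sections


def chunk_sections(sections, max_words=600):
    """Yield (header, chunk_text); a section is split only if it exceeds max_words."""
    for header, body in sections:
        words = body.split()
        for i in range(0, len(words), max_words):
            yield header, " ".join(words[i:i + max_words])
```

Each chunk can then be summarized on its own and the header carried alongside the partial summary, so the reduce step can keep the sections labeled.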
Evaluation
Report ROUGE and BERTScore for overlap with reference summaries, but treat them as necessary but not sufficient: they do not detect hallucination. Add a faithfulness/factual-consistency metric (the entity-overlap check above, or an NLI/QA-based check) and a clinician spot-review for high-stakes use. Prefer the extractive path whenever a hallucinated fact would be dangerous.
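To make the blind spot concrete, the sketch below computes a simplified ROUGE-1 F1 by hand (lowercased unigrams, no stemming, so scores will not match an official ROUGE implementation exactly). `rouge1_f1` is a hypothetical helper and the three sentences are invented examples.

```python
import re
from collections import Counter


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1: a simplified ROUGE-1 without stemming."""
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "Patient has no evidence of pulmonary embolism."
faithful = "No evidence of pulmonary embolism."
hallucinated = "Patient has evidence of pulmonary embolism."  # negation dropped

print(rouge1_f1(faithful, reference))
print(rouge1_f1(hallucinated, reference))
```

The summary that drops the negation scores higher (about 0.92 versus 0.83) because it shares more words with the reference. An entity-overlap check would also pass it, since "pulmonary embolism" appears in both texts, which is why a check that models negation or entailment is worth adding for high-stakes use.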
Outputs
- summaries.parquet: doc_id, summary, method, ROUGE/BERTScore, faithfulness flag.
- flagged_summaries.csv: summaries with unsupported claims for review.
- summary_report.md: method comparison and quality metrics.
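A minimal sketch of how the review file might be written, using only the standard library. The column names in `FIELDS` and the helper `write_flagged` are hypothetical: the skill lists the fields but not their exact spelling or the writer it uses.

```python
import csv
import io

# Hypothetical column names; adjust to whatever schema the pipeline settles on.
FIELDS = ["doc_id", "summary", "method", "unsupported_entities"]


def write_flagged(rows, handle):
    """Write only rows with at least one unsupported entity; return the count written."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    flagged = [r for r in rows if r["unsupported_entities"]]
    for r in flagged:
        # Join the entity list into one cell so the CSV stays flat.
        writer.writerow({**r, "unsupported_entities": "; ".join(r["unsupported_entities"])})
    return len(flagged)


rows = [
    {"doc_id": "d1", "summary": "Treated for sepsis.", "method": "abstractive",
     "unsupported_entities": ["sepsis"]},
    {"doc_id": "d2", "summary": "Treated for pneumonia.", "method": "extractive",
     "unsupported_entities": []},
]
buffer = io.StringIO()  # stand-in for open("flagged_summaries.csv", "w", newline="")
count = write_flagged(rows, buffer)
```

Only the flagged row reaches the file, so reviewers see just the summaries that need a second look.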
Healthcare Context
Designed for the long, repetitive, template-heavy nature of clinical notes. De-identify PHI
first. The faithfulness guard reflects the clinical-safety bar: a fluent-but-wrong summary
is worse than a plain extractive one. Complements clinical-nlp-entity-extraction and
clinical-text-search-elk.
References
- Hugging Face summarization: https://huggingface.co/docs/transformers/tasks/summarization
- BART (Lewis et al. 2020); PEGASUS (Zhang et al. 2020); BERTScore (Zhang et al. 2020).
- Survey: faithfulness/hallucination in clinical summarization.