name: research-source-dedup
description: Use when deduplicating overlapping research materials, AI notes, excerpts, transcripts, and source packs.
Source deduplication workflow
Use this skill when multiple materials may overlap, repeat the same source, summarize the same source differently, or contain duplicated excerpts.
Steps
- Inventory all materials and assign stable source identifiers.
- Group likely duplicates and near-duplicates by title, URL, author, date, filename, excerpt overlap, and topic.
- Identify the canonical source in each duplicate group when possible.
- Separate primary sources from summaries, AI-generated notes, commentary, and derivative materials.
- Preserve unique details even when sources overlap.
- Flag materials with missing metadata.
- Return a deduplication map before synthesis.
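
The inventory, grouping, and metadata-flagging steps above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the `Material` fields, the `src-` ID format, the 4-word shingle size, and the 0.5 overlap threshold are all assumptions chosen for the example. It only matches on identical URLs and excerpt overlap; title, author, date, and topic matching are left out.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Material:
    # Hypothetical record for one item in the inventory.
    filename: str
    text: str
    title: Optional[str] = None
    url: Optional[str] = None
    author: Optional[str] = None
    date: Optional[str] = None


def source_id(m: Material) -> str:
    """Derive a stable ID from filename and content, so reruns give the same ID."""
    digest = hashlib.sha256(f"{m.filename}\n{m.text}".encode("utf-8")).hexdigest()
    return f"src-{digest[:10]}"


def shingles(text: str, n: int = 4) -> set:
    """Return the set of overlapping n-word sequences in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(max(len(words) - n + 1, 0))}


def overlap(a: str, b: str) -> float:
    """Jaccard similarity of word shingles; 0.0 if either text is too short."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def likely_duplicates(a: Material, b: Material, threshold: float = 0.5) -> bool:
    """Same URL, or excerpt overlap at or above the (assumed) threshold."""
    if a.url and a.url == b.url:
        return True
    return overlap(a.text, b.text) >= threshold


def group_duplicates(materials: List[Material], threshold: float = 0.5) -> List[List[str]]:
    """Group materials transitively; returns lists of source IDs."""
    groups: List[List[Material]] = []
    for m in materials:
        merged, rest = [m], []
        for g in groups:
            if any(likely_duplicates(m, other, threshold) for other in g):
                merged = g + merged  # m links this group to any others it matches
            else:
                rest.append(g)
        groups = rest + [merged]
    return [[source_id(m) for m in g] for g in groups]


def missing_metadata(m: Material) -> List[str]:
    """List absent fields so they can be flagged, never filled in by guesswork."""
    return [f for f in ("title", "url", "author", "date") if not getattr(m, f)]
```

Nothing here deletes or rewrites the originals: the functions only read the materials and return IDs and flags, which keeps the grouping reviewable before any synthesis happens.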
Do not
- Delete or overwrite original materials.
- Treat a summary as a primary source when the primary source exists.
- Merge conflicting versions without noting the conflict.
- Invent missing metadata.
Output
Return:
- Source inventory
- Duplicate groups
- Canonical source candidates
- Unique materials
- Missing metadata
- Recommended source set for synthesis
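
One possible shape for the returned deduplication map is sketched below. The key names, the `kind` field, the `group-N` labels, and the rule of preferring materials marked `primary` are assumptions made for illustration; the skill itself does not fix a schema.

```python
import json
from typing import Dict, List

REQUIRED_FIELDS = ("title", "url", "author", "date")  # assumed metadata fields


def build_dedup_map(inventory: Dict[str, dict], groups: List[List[str]]) -> dict:
    """Assemble the six output sections from an inventory and grouped source IDs.

    inventory maps each source ID to its metadata dict (hypothetical layout);
    groups is a list of lists of source IDs, one list per duplicate group.
    """
    duplicate_groups = {}
    canonical = {}
    unique = []
    recommended = set()

    for g in groups:
        if len(g) == 1:
            unique.append(g[0])
            recommended.add(g[0])
            continue
        label = f"group-{len(duplicate_groups) + 1}"
        duplicate_groups[label] = list(g)
        # Candidates only: a summary is never promoted over an existing primary.
        primaries = [s for s in g if inventory[s].get("kind") == "primary"]
        canonical[label] = primaries
        # With no primary in the group, keep every member so no detail is lost.
        recommended.update(primaries or g)

    missing = {}
    for sid, meta in inventory.items():
        absent = [f for f in REQUIRED_FIELDS if not meta.get(f)]
        if absent:
            missing[sid] = absent  # flagged, not filled in

    return {
        "source_inventory": inventory,
        "duplicate_groups": duplicate_groups,
        "canonical_source_candidates": canonical,
        "unique_materials": unique,
        "missing_metadata": missing,
        "recommended_source_set_for_synthesis": sorted(recommended),
    }


def to_json(dedup_map: dict) -> str:
    """Serialize the map for review before synthesis."""
    return json.dumps(dedup_map, indent=2)
```

The recommended set produced this way is only a starting point: a derivative note that carries unique details, or a group whose versions conflict, still needs to be noted by hand alongside the canonical candidate.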