doc-condensor

star 1

Create accurate and concise documentation for a library/package by compressing documentation of the library/package into a single .md file.

sidmanale643 By sidmanale643 schedule Updated 1/24/2026

name: doc-condensor version: 1.0.5 description: | Create accurate and concise documentation for a library/package by compressing documentation of the library/package into a single .md file.

allowed-tools:

  • Read
  • Write
  • Edit
  • Grep
  • Glob
  • Bash
  • Web Search
  • mcp-get-docs

Doc Condensor Skill

Purpose

This skill helps Claude create condensed, accurate documentation summaries for libraries and packages. Instead of coding agents fetching documentation repeatedly, they can reference a single, well-organized markdown file containing all essential information.

When to Use This Skill

Use this skill when:

  • A user requests documentation summary for a specific library/package
  • A coding workflow needs quick reference documentation
  • Full documentation is too verbose for efficient agent usage
  • Latest documentation needs to be captured and condensed

Prerequisites

Before starting, ensure the mcp-get-docs MCP server is available:

Installation

If the MCP tools are not available, install the server:

npx -y @smithery/cli@latest install @cdugo/mcp-get-docs --client claude-code

Verification

After installation, verify the following tools are accessible:

  • fetch-url-docs
  • fetch-package-docs
  • fetch-library-docs
  • fetch-multilingual-docs

If tools are still unavailable after installation, inform the user and suggest:

  1. Restarting Claude Code
  2. Checking MCP server configuration in settings
  3. Manually verifying the installation with npx @smithery/cli list

Context Management - Streaming Writes

Large documentation sources can exhaust context windows, cause output truncation, and degrade quality on earlier sections. Use streaming writes to mitigate these issues.

When to Use Streaming Writes

Documentation Size Approach
< 40k tokens Standard single-pass processing
40k - 100k tokens Streaming writes (this section)
> 100k tokens Streaming writes + multi-file output

Streaming Write Protocol

Principle: Process one section at a time, write immediately, don't hold full content in memory.

Step 1: Initialize Output File

Create the file with metadata and a TOC placeholder:

# [Package Name] Documentation Summary

## Metadata
- **Version**: [version]
- **Last Updated**: [date]
- **Source**: [url]
- **Language/Ecosystem**: [language]

## Table of Contents
<!-- TOC will be updated at end -->

---

Step 2: Process Sections Incrementally

For each major section of the source documentation:

  1. Fetch/read only that section from the source
  2. Condense the section following quality guidelines
  3. Append to output file immediately using Edit tool
  4. Move on - do not retain the full section content in working memory

Appending technique using Edit:

oldString: [last line or marker of current file content]
newString: [last line or marker]

## [New Section Title]
[condensed content for this section]

Step 3: Review Selectively

When you need to check prior content:

  • Use Read with offset and limit parameters to read specific line ranges
  • Avoid re-reading the entire output file
  • Example: Read(filePath, offset=50, limit=30) to check lines 50-80

Step 4: Finalize

After all sections are written:

  1. Read just the headers to build final TOC
  2. Update the TOC placeholder with actual links
  3. Perform verification checks on specific sections (not full file)

Section Processing Order

Process documentation in this order to maximize coherence:

  1. Metadata + Overview (write first, establishes context)
  2. Quick Reference (high-value, frequently accessed)
  3. Core Concepts (foundational knowledge)
  4. Main Components (process each component as sub-iteration)
  5. API Reference (can be processed module-by-module)
  6. Common Use Cases
  7. Configuration & Setup
  8. Troubleshooting

Memory Management Tips

  • After writing a section, mentally "release" its details
  • Use the output file as external memory - read back specific parts if needed
  • For API references with many endpoints/functions, process in batches of 10-15
  • If source docs have a clear structure, match your processing to their sections

Markers for Append Operations

Use consistent end-of-section markers to make appending reliable:

---
<!-- END SECTION: [section-name] -->

This makes it easy to find the insertion point for the next section.

Web Search Tool Guide

Before invoking mcp-get-docs MCP server, use the Web Search tool to find the correct documentation URL.

Search Query Templates

Use these search patterns to find official documentation:

  • "[package name] official documentation"
  • "[package name] docs site:[official-domain]" (if you know the domain)
  • "[package name] [language] API reference"

Examples:

  • "FastAPI official documentation" → fastapi.tiangolo.com
  • "React docs site:react.dev" → react.dev
  • "pandas python API reference" → pandas.pydata.org

URL Validation Criteria

Before using a URL, verify it meets these criteria:

Check How to Verify
Official source Domain matches known official sites (e.g., .io, .dev, GitHub pages)
Current/maintained Check for recent updates, not archived or deprecated
Correct package Verify the package name matches exactly (watch for forks/alternatives)
Correct version URL points to desired version (not outdated major version)

When Multiple Sources Exist

If search returns multiple documentation sources:

  1. Prefer official project websites over third-party aggregators
  2. Prefer versioned docs over "latest" if user specified a version
  3. Ask user to confirm if genuinely ambiguous (e.g., React: react.dev vs legacy reactjs.org)

Usage

The mcp-get-docs MCP server provides several tools for fetching documentation. Choose the appropriate tool based on what information you have:

fetch-url-docs

When to use: You have the direct URL to the documentation website

Parameters:

  • url (required, string): Full URL of the library documentation
    • Example: https://docs.python-requests.org/en/latest/

Best for: Official documentation sites, custom documentation URLs, or when package name resolution might be ambiguous


fetch-package-docs

When to use: You know the package name and programming language

Parameters:

  • packageName (required, string): Exact package name as it appears in the package registry
    • Example: express, pandas, spring-boot
  • language (required, string): Programming language or ecosystem
    • Supported: javascript, python, java, dotnet, ruby, go, rust, php

Best for: Popular packages with standard naming in package registries (npm, PyPI, Maven, NuGet, etc.)


fetch-library-docs

When to use: Flexible option when you have either a package name or URL

Parameters:

  • library (required, string): Package name OR documentation URL
    • Example: react or https://react.dev
  • language (optional, string): Programming language (if providing package name)
    • Same options as fetch-package-docs

Best for: When you want a single tool that handles both cases, or when the source type is determined at runtime


fetch-multilingual-docs

When to use: The same package exists across multiple programming languages

Parameters:

  • packageName (required, string): Package name to search for
    • Example: grpc (exists in Python, JavaScript, Java, etc.)
  • languages (required, array of strings): List of languages to check
    • Example: ["python", "javascript", "java"]

Best for: Cross-platform libraries, protocol implementations, or when creating comparative documentation


Tool Selection Guide

Do you have the docs URL?
├─ Yes → Use fetch-url-docs
└─ No
   ├─ Single language?
   │  ├─ Yes → Use fetch-package-docs
   │  └─ Flexible → Use fetch-library-docs
   └─ Multiple languages?
      └─ Yes → Use fetch-multilingual-docs

Process

0. Verify Prerequisites

Before starting:

  • Confirm mcp-get-docs tools are available (see Prerequisites section)
  • Check if condensed docs already exist: ./compressed_docs/[language]/[package]-*-docs.md
  • If existing docs found, ask user: update existing or create new version?

1. Fetch URL of Documentation

Use the Web Search tool to find the correct documentation URL:

  • Search using query templates from the Web Search Tool Guide
  • Validate the URL using the criteria table
  • If multiple sources exist, confirm with user before proceeding

1.5. Gather Project Context (Optional)

If the user is working within a codebase:

  • Search for existing imports/usage of the package (use Grep)
  • Identify which modules/features are actively used
  • Note any version constraints from package.json, requirements.txt, etc.
  • Use this context to prioritize relevant sections in the summary

2. Fetch Documentation & Assess Size

  • Use the mcp-get-docs MCP server to retrieve the latest documentation
  • Verify the package/library name is correct
  • Estimate documentation size from the response
  • Choose processing mode:
    • < 40k tokens: Continue with standard single-pass (Steps 3-6)
    • = 40k tokens: Switch to Streaming Write Protocol (see Context Management section)

When using streaming writes, replace Steps 3-6 with the streaming protocol. The quality checks and output format remain the same.

3. Analyze Structure

  • Identify the high-level architecture of the library
  • Map out main modules, classes, and functions
  • Note key relationships and dependencies
  • Understand the intended use cases

4. Create Executive Summary

At the top of the output file, include:

  • Package name and version
  • Purpose: One-sentence description
  • Core concepts: 2-3 key concepts users should understand
  • Navigation guide: Where to find specific functionality
    • Example: "Authentication methods → Security section"
    • Example: "Data models → Core Classes section"

5. Organize Content

Structure the documentation summary as follows:

# [Package Name] Documentation Summary

## Metadata
- **Version**: [Package version number]
- **Last Updated**: [Date this summary was created]
- **Source**: [Official docs URL or package registry]
- **Language/Ecosystem**: [e.g., Python, JavaScript/Node.js, Java]

## Project Context
[How the package/library is being used in the existing codebase context - if applicable]

## Overview
[Brief description, installation, basic usage]

## Quick Reference
[Most commonly used functions/classes with minimal examples]

## Core Concepts
[Key ideas, architecture, design patterns]

## Main Components

### [Component/Module Name]
- **Purpose**: What it does
- **Key classes/functions**: List with brief descriptions
- **Common patterns**: How it's typically used

## API Reference (Condensed)
[Organized by module/category, including:]
- Function/method signatures
- Essential parameters
- Return types
- Brief usage notes

## Common Use Cases
[Practical examples for frequent scenarios]

## Configuration & Setup
[Important configuration options, environment variables]

## Troubleshooting
[Common issues and solutions if documented]

6. Quality Checks

Ensure the summary:

  • Preserves accuracy: All technical details are correct
  • Maintains completeness: No critical functionality is omitted
  • Achieves conciseness: Removes redundancy while keeping clarity
  • Enables quick lookup: Well-organized with clear headers
  • Includes examples: Code snippets for common operations
  • Uses consistent formatting: Uniform style throughout

Best Practices

Do:

  • Keep code examples minimal but functional
  • Use tables for parameter lists when appropriate
  • Include version information if available
  • Cross-reference related components
  • Highlight breaking changes or deprecations
  • Use collapsible sections for lengthy details (if the format supports it)

Don't:

  • Remove important warnings or security notes
  • Oversimplify complex concepts to the point of inaccuracy
  • Include marketing language from the original docs
  • Duplicate information across sections
  • Omit error handling in examples

Clarification Triggers

Ask the user before proceeding when:

Situation Why Ask What to Ask
Multiple official doc sources exist Avoid using outdated or wrong source "Found docs at X and Y. Which should I use?"
Package name is ambiguous Same name exists in multiple ecosystems "Did you mean [package] for Python or JavaScript?"
Version is critical and undetected User may need specific version docs "Couldn't detect version. Which version are you using?"
Documentation exceeds 100k tokens Need to prioritize sections "Docs are very large. Which sections are most important?"
Package appears internal/private May require different approach "This appears to be a private package. Can you provide the docs?"
Existing condensed docs found Avoid duplicate work "Found existing docs from [date]. Update or create new?"

Handling Existing Documentation

Before creating a new file, check for existing condensed docs:

Detection

ls ./compressed_docs/[language]/[package-name]-*-docs.md

If Found

  1. Compare versions: Is the existing doc for the same version?
  2. Check freshness: When was it created? (check metadata)
  3. Ask user:
    • "Update existing?" → Diff and update in place
    • "Create new version?" → New file with version suffix
    • "Replace entirely?" → Overwrite existing file

Update vs. New File

  • Update: Same major version, minor changes
  • New file: Different major version, breaking changes
  • Replace: User explicitly requests fresh start

Version Information

Version Detection Priority

When determining version, use this priority order (highest first):

Priority Source Example
1 User-specified "Create docs for pandas 2.1.0"
2 Package manifest package.json, requirements.txt, pom.xml
3 MCP server response Version in fetched metadata
4 Documentation content Version badge, header, URL path
5 Registry lookup Current latest from npm/PyPI
6 Fallback Use "latest" with disclaimer

Conflict Resolution

If multiple sources provide different versions:

  1. Prefer user-specified over all other sources
  2. Prefer project manifest over documentation (user's actual dependency)
  3. Note discrepancy in metadata if significant difference found
  4. Ask user if versions differ by major version

Example metadata note:

- **Version**: 2.1.0 (from requirements.txt)
- *Note: Documentation source shows v2.2.0. Some features may differ.*

Version Formatting in Output

Filename:

  • With version: [package-name]-v[version]-docs.md
    • Example: fastapi-v0.109.0-docs.md
  • Without version: [package-name]-latest-docs.md
    • Example: express-latest-docs.md

In Document:

## Metadata
- **Version**: 0.109.0
- **Last Updated**: January 22, 2026
- **Source**: https://fastapi.tiangolo.com/

Version Considerations:

  • Major versions: Note breaking changes if comparing versions
  • Beta/RC versions: Explicitly label as v2.0.0-beta.1
  • Date-based versions: Use as provided (e.g., 2024.01.15)
  • Multiple versions: If documenting multiple versions, create separate files or clearly section them

Output

File Location and Naming

Directory: ./compressed_docs/[language]/

  • Create the directory structure if it doesn't exist
  • Examples:
    • ./compressed_docs/python/pandas-v2.1.0-docs.md
    • ./compressed_docs/javascript/react-v18.2.0-docs.md
    • ./compressed_docs/java/spring-boot-v3.2.0-docs.md

Naming Convention:

[package-name]-v[version]-docs.md
  • Use lowercase, hyphen-separated package names
  • Include version with 'v' prefix
  • Use 'latest' if version is unknown
  • Examples:
    • tensorflow-v2.15.0-docs.md
    • express-latest-docs.md
    • fastapi-v0.109.0-docs.md

File Requirements

The output file should:

  1. Be immediately useful as a reference

    • Clear table of contents (auto-generated from headers)
    • Searchable with standard text search
    • Logical navigation flow
  2. Achieve optimal compression

    • Target: 60-80% less content than original
    • Retain: 100% of essential information
    • Remove: Redundancy, verbose explanations, marketing content
  3. Include proper metadata

    • Version number (when available)
    • Last updated timestamp
    • Source documentation URL
    • Language/ecosystem identifier
  4. Maintain quality standards

    • All code examples are syntax-highlighted with language tags
    • Cross-references use markdown links
    • Tables used for structured data (parameters, return values)
    • Consistent heading hierarchy (H1 → H2 → H3)

Output Verification Steps

Before finalizing, verify each item:

  1. Directory: File is in ./compressed_docs/[language]/
  2. Filename: Follows [package]-v[version]-docs.md convention
  3. Metadata: Section includes version, date, source URL, language
  4. Code blocks: All have language identifiers (python, bash, etc.)
  5. Navigation: Complex libraries have a navigation guide in overview
  6. Version: Accurate and matches source/user specification
  7. Source: Original documentation URL is included
  8. Compression: File is significantly smaller than original (~60-80% reduction)
  9. Completeness: No critical functionality was omitted

Multi-File Output (for large libraries)

If the library is extremely large (>100k tokens of documentation), consider:

Option 1: Modular Structure

./compressed_docs/python/django/
├── django-v5.0-overview.md
├── django-v5.0-models.md
├── django-v5.0-views.md
├── django-v5.0-templates.md
└── django-v5.0-orm.md

Option 2: Master + Modules

./compressed_docs/python/
├── django-v5.0-docs.md (master with navigation)
└── django-v5.0-modules/
    ├── models.md
    ├── views.md
    └── orm.md

Always create a master file that links to the modules and provides overall navigation.

Example Workflow

This example demonstrates the complete process for creating condensed documentation.

User Request

"Condense the FastAPI documentation for my Python project"

Step 0: Verify Prerequisites

Checking for mcp-get-docs tools... Available.
Checking for existing docs: ./compressed_docs/python/fastapi-*-docs.md... Not found.

Step 1: Find Documentation URL

Web Search: "FastAPI official documentation"

Results evaluated:

URL Official? Current? Selected
fastapi.tiangolo.com Yes (author's domain) Yes Yes
fastapi.readthedocs.io Redirect N/A No

Step 1.5: Gather Project Context

Scanning codebase for FastAPI usage...
Found: from fastapi import FastAPI, Depends, HTTPException
Found: requirements.txt shows fastapi==0.109.0
Key modules used: routing, dependency injection, exception handling

Step 2: Fetch Documentation

fetch-url-docs(url="https://fastapi.tiangolo.com/")

Response: 85,000 tokens of documentation fetched.

Size Assessment: 85k tokens exceeds 40k threshold → Use Streaming Write Protocol

Step 3: Initialize Output File (Streaming)

Write: ./compressed_docs/python/fastapi-v0.109.0-docs.md

Initial content:

# FastAPI Documentation Summary

## Metadata
- **Version**: 0.109.0
- **Last Updated**: January 23, 2026
- **Source**: https://fastapi.tiangolo.com/
- **Language/Ecosystem**: Python

## Table of Contents
<!-- TOC: Update after all sections complete -->

---
<!-- END SECTION: metadata -->

Step 4: Process Sections Incrementally (Streaming)

Processing: Overview section
  → Condensed to ~800 tokens
  → Appended to file using Edit tool
  → Section written, moving on

Processing: Quick Reference section
  → Condensed to ~600 tokens
  → Appended to file
  → Section written, moving on

Processing: Core Concepts (Path Operations)
  → Condensed to ~1,200 tokens
  → Appended to file
  ...

Processing: API Reference (batch 1: routing endpoints)
  → Condensed 15 endpoints to ~2,000 tokens
  → Appended to file

Processing: API Reference (batch 2: request handling)
  → Condensed to ~1,800 tokens
  → Appended to file
  ...

Step 5: Finalize (Streaming)

Read file headers (offset=0, limit=50) to build TOC
Update TOC placeholder with actual section links
Verify key sections present using targeted reads

Step 6: Verify

Created: ./compressed_docs/python/fastapi-v0.109.0-docs.md
Size: 12,000 tokens (86% reduction)
Context usage: Stayed under 40k tokens throughout processing

All checklist items confirmed.


Sample Output

Below is a truncated example of what the first section of a well-structured condensed doc looks like:

# FastAPI Documentation Summary

## Metadata
- **Version**: 0.109.0
- **Last Updated**: January 23, 2026
- **Source**: https://fastapi.tiangolo.com/
- **Language/Ecosystem**: Python

## Project Context
This summary prioritizes routing, dependency injection, and exception handling
based on current project usage patterns.

## Overview
FastAPI is a modern, high-performance web framework for building APIs with Python 3.8+
based on standard Python type hints.

**Installation**:
\`\`\`bash
pip install fastapi[all]
\`\`\`

**Minimal Example**:
\`\`\`python
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return {"Hello": "World"}
\`\`\`

## Quick Reference

| Operation | Code | Notes |
|-----------|------|-------|
| GET endpoint | `@app.get("/path")` | Most common read operation |
| POST endpoint | `@app.post("/path")` | Create resources |
| Path parameter | `@app.get("/items/{id}")` | Captured in function arg |
| Query parameter | `def read(skip: int = 0)` | Optional with defaults |
| Request body | `def create(item: Item)` | Pydantic model validation |

## Core Concepts

### Path Operations
Functions decorated with `@app.get()`, `@app.post()`, etc. that handle HTTP requests.

### Dependency Injection
Use `Depends()` to inject shared logic (auth, DB sessions, etc.):
\`\`\`python
from fastapi import Depends

def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

@app.get("/users/")
def read_users(db: Session = Depends(get_db)):
    return db.query(User).all()
\`\`\`

[... continues with Main Components, API Reference, etc. ...]

Notes

  • If the MCP server returns partial documentation, note this limitation in the output file's metadata
  • For very large libraries, consider creating multiple summary files by module (see Multi-File Output section)
  • Always preserve code examples from the original documentation when they illustrate key concepts
  • Include links to the full documentation for deep dives

Error Handling

If documentation fetch fails:

  1. Inform the user which package/library failed
  2. Suggest alternatives:
    • Verify package name spelling
    • Try using fetch-url-docs with direct URL
    • Check if package exists in the specified language ecosystem
  3. Offer to summarize from user-provided documentation files

If documentation returns 403 Forbidden or is blocked:

Many documentation sites use bot protection (Cloudflare, etc.) that blocks automated requests.

IMPORTANT: NEVER USE THE zai built in webreader tool

Immediate workarounds:

  1. Try fetch-package-docs instead of URL-based fetch:
    fetch-package-docs(packageName: "pymilvus", language: "python")
    
  2. Try alternative URLs:
    • ReadTheDocs: Use Web Search with query "[package-name] ReadTheDocs" to find the correct URL (project names often differ from package names)
    • GitHub README: https://github.com/[org]/[repo] or search "[package-name] GitHub"
    • Specific doc page instead of root: /docs/quickstart or /docs/overview.md
  3. Try the package registry docs:
    • PyPI: https://pypi.org/project/[package]/
    • npm: https://www.npmjs.com/package/[package]

If all automated fetching fails:

  1. Inform user: "The documentation site is blocking automated access."
  2. Ask user to provide one of:
    • Raw markdown/text documentation files
    • Copy-pasted documentation content
    • Alternative documentation source URL
  3. Proceed with user-provided content using standard Read/analysis tools

Common blocked sites (known to have strong bot protection):

  • Sites behind Cloudflare with aggressive settings
  • Enterprise documentation requiring authentication
  • Some .io domains with rate limiting

If documentation is unusually large (>100k tokens):

  1. Ask user which sections are most important
  2. Create a focused summary on priority sections first
  3. Provide a table of contents for the full documentation
  4. Consider multi-file output structure

If version cannot be determined:

  1. Use "latest" in filename: [package-name]-latest-docs.md
  2. In metadata, mark as: **Version**: Latest (unspecified)
  3. Add note: *Version not detected from source. This summary may not reflect a specific release.*

If documentation source is ambiguous:

  1. Ask user to clarify or provide URL
  2. Default to most popular/official source if multiple exist
  3. Document which source was used in metadata

Install via CLI
npx skills add https://github.com/sidmanale643/doc-condensor --skill doc-condensor
Repository Details
star Stars 1
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator
sidmanale643
sidmanale643 Explore all skills →