pdf-processing


Extract text and tables from PDF files.

By pcingola. Updated 1/24/2026

name: pdf-processing
description: Extract text and tables from PDF files.
compatibility: Requires PyMuPDF library.

PDF Processing

When to use this skill

Use this skill when the task involves reading, extracting, or transforming content from PDF documents.

How to use

For standard extraction, run the bundled script:

python scripts/extract.py <file.pdf>

The script reads the PDF and outputs structured content with page numbers and detected tables.
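The contents of scripts/extract.py are not shown here. As a rough sketch of what page-by-page extraction with PyMuPDF could look like, the following assumes PyMuPDF 1.23.0 or later (for find_tables). The function names extract_pdf and clean_table, and the shape of the returned records, are hypothetical and not taken from the bundled script.

```python
def clean_table(rows):
    """Replace missing (None) cells with empty strings so every cell is a str."""
    return [["" if cell is None else str(cell) for cell in row] for row in rows]


def extract_pdf(path):
    """Return one record per page: 1-based page number, text, detected tables.

    Hypothetical helper; the bundled script may be organized differently.
    """
    # Imported lazily so this module still loads when PyMuPDF is not installed.
    import fitz  # PyMuPDF

    pages = []
    with fitz.open(path) as doc:
        for index, page in enumerate(doc):
            # find_tables() requires PyMuPDF 1.23.0 or later.
            tables = [clean_table(t.extract()) for t in page.find_tables().tables]
            pages.append({
                "page": index + 1,  # report 1-based page numbers
                "text": page.get_text().strip(),
                "tables": tables,
            })
    return pages
```

Calling extract_pdf("file.pdf") would then yield a list of per-page records that can be printed or serialized as needed.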

Output

Return extracted text with page numbers and any detected tables in a structured format.
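The skill does not prescribe an exact output format. One possible way to render it, assuming each page is a dict with "page", "text", and "tables" keys (a hypothetical record shape, not defined by this skill), is sketched below. Both render_text and render_json are illustrative names.

```python
import json


def render_text(pages):
    """Render page records as plain text with a header line per page."""
    lines = []
    for record in pages:
        lines.append(f"--- Page {record['page']} ---")
        if record["text"]:
            lines.append(record["text"])
        # Number tables per page and print each row with " | " between cells.
        for number, table in enumerate(record["tables"], start=1):
            lines.append(f"[Table {number}]")
            for row in table:
                lines.append(" | ".join(row))
    return "\n".join(lines)


def render_json(pages):
    """Render the same records as JSON for programmatic consumers."""
    return json.dumps({"pages": pages}, indent=2, ensure_ascii=False)
```

The text form is easier to read in a chat response, while the JSON form keeps table rows and page numbers machine-readable.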

Notes

Scanned PDFs and complex layouts may yield little or no extractable text. In those cases, OCR processing may be required.
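A simple heuristic for flagging pages that probably need OCR is sketched below. It assumes that a scanned page shows up as one or more embedded images with almost no extractable text. The function name needs_ocr and the min_chars threshold of 20 are illustrative assumptions, not part of this skill.

```python
def needs_ocr(text, image_count, min_chars=20):
    """Guess whether a page is a scan: it has images but almost no text.

    text        -- text extracted from the page
    image_count -- number of images embedded in the page
    min_chars   -- assumed threshold; tune it for your documents
    """
    # Count only non-whitespace characters so blank padding is ignored.
    visible_chars = sum(1 for ch in text if not ch.isspace())
    return image_count > 0 and visible_chars < min_chars
```

With PyMuPDF, the two inputs could come from page.get_text() and len(page.get_images()). A page with neither text nor images is treated as blank rather than as a scan.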

Install via CLI
npx skills add https://github.com/pcingola/book_agentic_patterns --skill pdf-processing
Repository Details
Branch: main
Path: SKILL.md