pdf-extractor

Stars: 149

Extract and convert PDF documents using Python scripts

By maxvaega. Updated 12/1/2025

name: pdf-extractor
description: Extract and convert PDF documents using Python scripts
version: 1.0.0
allowed-tools:
  - Bash
  - Read
  - Write

PDF Extractor Skill

This skill provides tools for extracting text and metadata from PDF documents and converting them to different formats.

Available Scripts

extract.py

Extracts text and metadata from PDF files.

Input:

{
  "file_path": "/path/to/document.pdf",
  "pages": "all" | [1, 2, 3]
}

Output:

{
  "text": "Extracted text content...",
  "metadata": {
    "title": "Document Title",
    "author": "Author Name",
    "pages": 10
  }
}
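
The input shape above allows `pages` to be either the string "all" or a list of page numbers. A minimal sketch of a helper that builds and checks that input before it is passed to the script might look like the following. The function name `build_extract_args` is hypothetical and not part of the skill, and the rule that page numbers are 1-based positive integers is an assumption drawn from the `[1, 2, 3]` example.

```python
def build_extract_args(file_path, pages="all"):
    """Build the input dict for extract.py (hypothetical helper, not part of the skill)."""
    if pages != "all":
        # Assumption: an explicit selection is a non-empty list of page numbers.
        if not isinstance(pages, list) or not pages:
            raise ValueError('pages must be "all" or a non-empty list of page numbers')
        for page in pages:
            # Assumption: pages are 1-based integers; bool is rejected explicitly
            # because it is a subclass of int in Python.
            if isinstance(page, bool) or not isinstance(page, int) or page < 1:
                raise ValueError(f"invalid page number: {page!r}")
    return {"file_path": file_path, "pages": pages}


all_pages = build_extract_args("/path/to/document.pdf")
some_pages = build_extract_args("/path/to/document.pdf", [1, 2, 3])
```

Validating on the caller's side is a design choice, not something the skill requires; the script may apply its own checks.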

convert.sh

Converts PDF files to other formats: plain text (txt), Markdown (md), or HTML (html).

Input:

{
  "input_file": "/path/to/input.pdf",
  "output_format": "txt" | "md" | "html"
}
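
Since `output_format` accepts only a few values, it can help to reject anything else before calling the script. The sketch below is illustrative: `build_convert_args` and `ALLOWED_FORMATS` are hypothetical names, and treating the three formats shown above as the complete set is an assumption.

```python
# Assumption: the three formats shown in the input schema are the full set.
ALLOWED_FORMATS = ("txt", "md", "html")


def build_convert_args(input_file, output_format):
    """Build the input dict for convert.sh (hypothetical helper, not part of the skill)."""
    # Normalize common variants such as "MD" or ".html" to the documented spelling.
    fmt = output_format.strip().lower().lstrip(".")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(
            f"unsupported output_format {output_format!r}; expected one of {ALLOWED_FORMATS}"
        )
    return {"input_file": input_file, "output_format": fmt}


convert_args = build_convert_args("/path/to/input.pdf", ".MD")
```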

parse.py

Parses structured data from PDF forms and tables.

Input:

{
  "file_path": "/path/to/form.pdf",
  "extract_tables": true,
  "extract_forms": true
}

Usage Example

from skillkit import SkillManager

manager = SkillManager()
result = manager.execute_skill_script(
    skill_name="pdf-extractor",
    script_name="extract",
    arguments={"file_path": "document.pdf", "pages": "all"}
)

if result.success:
    print(result.stdout)
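
The usage example above prints the raw output. If `result.stdout` carries the JSON document described under extract.py (an assumption; the source does not say how the output is delivered), it could be decoded along these lines. `read_extract_result` is a hypothetical helper, and a `SimpleNamespace` stands in for the real result object so the sketch runs without skillkit installed; only the `success` and `stdout` attributes shown in the usage example are relied on.

```python
import json
from types import SimpleNamespace


def read_extract_result(result):
    """Return (text, metadata) from an extract.py result (hypothetical helper)."""
    if not result.success:
        raise RuntimeError("extract script reported failure")
    # Assumption: stdout holds the JSON output documented for extract.py.
    data = json.loads(result.stdout)
    return data["text"], data.get("metadata", {})


# Stand-in for the object returned by manager.execute_skill_script(...).
fake_result = SimpleNamespace(
    success=True,
    stdout='{"text": "Hello", "metadata": {"title": "Demo", "author": "A", "pages": 1}}',
)
text, metadata = read_extract_result(fake_result)
```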

Install via CLI

npx skills add https://github.com/maxvaega/skillkit --skill pdf-extractor

Repository Details

Stars: 149
Forks: 20
Branch: main
Path: SKILL.md