add-pdf-reader


Add PDF reading to ClaudeClaw agents. Extracts text from PDFs via pdftotext CLI. Handles WhatsApp attachments, URLs, and local files.

By sbusso. Updated 3/22/2026

name: add-pdf-reader
description: Add PDF reading to ClaudeClaw agents. Extracts text from PDFs via pdftotext CLI. Handles WhatsApp attachments, URLs, and local files.

Add PDF Reader

Adds PDF reading capability to all container agents using poppler-utils (pdftotext/pdfinfo). PDFs sent as WhatsApp attachments are auto-downloaded to the group workspace.

Phase 1: Pre-flight

  1. Check whether agent/skills/pdf-reader/pdf-reader exists. If it does, the skill is already applied; skip to Phase 3.
  2. Confirm that WhatsApp is installed first (skill/whatsapp merged). This skill modifies the WhatsApp channel files.
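The first pre-flight check can be scripted. The following is a minimal sketch, assuming it is run from the project root; the function name `pdf_reader_applied` is hypothetical and not part of the skill.

```shell
# Hypothetical helper: succeeds if the pdf-reader CLI script is already
# present under the given project root (defaults to the current directory).
pdf_reader_applied() {
  root="${1:-.}"
  [ -f "$root/agent/skills/pdf-reader/pdf-reader" ]
}

# Report which phase to continue with.
if pdf_reader_applied; then
  echo "pdf-reader already applied, skip to Phase 3"
else
  echo "pdf-reader not found, continue with Phase 2"
fi
```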

Phase 2: Apply Code Changes

Ensure WhatsApp fork remote

git remote -v

If whatsapp is missing, add it:

git remote add whatsapp https://github.com/qwibitai/claudeclaw-whatsapp.git
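If this step is scripted, the check and the add can be combined so that re-running it is harmless. This is a sketch only; the helper names `has_remote` and `ensure_whatsapp_remote` are hypothetical, and the URL is the one given above.

```shell
# Hypothetical helper: reads remote names (one per line, as printed by
# `git remote`) on stdin and succeeds if the named remote is among them.
has_remote() {
  grep -qxF -- "$1"
}

# Adds the whatsapp remote only when it is not configured yet.
ensure_whatsapp_remote() {
  if git remote | has_remote whatsapp; then
    echo "whatsapp remote already configured"
  else
    git remote add whatsapp https://github.com/qwibitai/claudeclaw-whatsapp.git
  fi
}
```

Calling `ensure_whatsapp_remote` from inside the repository performs the check; defining the functions alone changes nothing.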

Merge the skill branch

git fetch whatsapp skill/pdf-reader
git merge whatsapp/skill/pdf-reader || {
  git checkout --theirs package-lock.json
  git add package-lock.json
  git merge --continue
}

This merges in:

  • agent/skills/pdf-reader/SKILL.md (agent-facing documentation)
  • agent/skills/pdf-reader/pdf-reader (CLI script)
  • poppler-utils in src/runtimes/docker/Dockerfile
  • PDF attachment download in src/channels/whatsapp.ts
  • PDF tests in src/channels/whatsapp.test.ts

If the merge reports conflicts, resolve them by reading the conflicted files and understanding the intent of both sides.
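To see which files still need attention while a merge is stopped, the unmerged entries can be filtered out of `git status --porcelain`. The sketch below assumes the short status format, where the two-letter codes DD, AU, UD, UA, DU, AA and UU mark unmerged paths; the helper name `conflicted_paths` is hypothetical.

```shell
# Hypothetical helper: filters `git status --porcelain` output (read from
# stdin) down to the paths whose two-letter status marks them as unmerged.
conflicted_paths() {
  awk 'substr($0, 1, 2) ~ /^(DD|AU|UD|UA|DU|AA|UU)$/ { print substr($0, 4) }'
}

# Usage during a stopped merge:
#   git status --porcelain | conflicted_paths
```

Paths containing spaces or special characters are quoted by git in this format, so the output is meant for reading rather than for piping into other commands.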

Validate

npm run build
npx vitest run src/channels/whatsapp.test.ts

Rebuild container

./src/runtimes/docker/build.sh

Service name: derived from the directory name, as com.claudeclaw.<dirname> on macOS or claudeclaw-<dirname> on Linux. For example, if the current working directory is my-assistant, the macOS service is com.claudeclaw.my-assistant. Determine the correct service name before running the service commands below.
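The naming rule can be expressed as a small function. This is a sketch under the assumption that `uname -s` reports `Darwin` on macOS and that every other value should get the Linux form; the function name `service_name` is hypothetical.

```shell
# Hypothetical helper: prints the service name for a project directory and
# an OS name as reported by `uname -s` (Darwin means macOS; any other value
# is treated as Linux here).
service_name() {
  dir_name=$(basename "$1")
  case "$2" in
    Darwin) printf 'com.claudeclaw.%s\n' "$dir_name" ;;
    *)      printf 'claudeclaw-%s\n' "$dir_name" ;;
  esac
}

# Service name for the current directory on this machine.
service_name "$PWD" "$(uname -s)"
```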

Restart service

launchctl kickstart -k gui/$(id -u)/com.claudeclaw.<dirname>  # macOS
# Linux: systemctl --user restart claudeclaw-<dirname>

Phase 3: Verify

Test PDF extraction

Send a PDF file in any registered WhatsApp chat. The agent should:

  1. Download the PDF to attachments/
  2. Respond acknowledging the PDF
  3. Be able to extract text when asked
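Extraction can also be checked by hand with the poppler tools, independently of the agent. The sketch below parses the `Pages:` line that `pdfinfo` prints; the helper name `page_count` and the file name in the usage comments are hypothetical.

```shell
# Hypothetical helper: extracts the page count from `pdfinfo` output
# read on stdin.
page_count() {
  awk '$1 == "Pages:" { print $2; exit }'
}

# Usage, assuming a downloaded file named attachments/example.pdf:
#   pdfinfo attachments/example.pdf | page_count
#   pdftotext -layout attachments/example.pdf - | head
```

In the second command, `-` as the output file makes `pdftotext` write to standard output, and `-layout` asks it to preserve the physical layout of the page.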

Test URL fetching

Ask the agent to read a PDF from a URL. It should use pdf-reader fetch <url>.

Check logs if needed

tail -f logs/claudeclaw.log | grep -i pdf

Look for:

  • Downloaded PDF attachment — successful download
  • Failed to download PDF attachment — media download issue
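For a quick overview rather than a live tail, the two messages can be counted. This is a sketch that assumes only the two message strings listed above; the rest of each log line's format is unknown, and the function name `pdf_log_summary` is hypothetical.

```shell
# Hypothetical helper: counts PDF download successes and failures in a
# log file. `|| true` keeps the count at 0 without failing when grep
# finds no match.
pdf_log_summary() {
  log_file="$1"
  ok=$(grep -c 'Downloaded PDF attachment' "$log_file" || true)
  failed=$(grep -c 'Failed to download PDF attachment' "$log_file" || true)
  printf 'downloaded=%s failed=%s\n' "$ok" "$failed"
}

# Usage from the project root:
#   pdf_log_summary logs/claudeclaw.log
```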

Troubleshooting

Agent says pdf-reader command not found

The container needs rebuilding. Run ./src/runtimes/docker/build.sh, then restart the service.

PDF text extraction is empty

The PDF may be scanned (image-based). pdftotext only handles text-based PDFs. Consider using the agent-browser to open the PDF visually instead.
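One way to confirm that a PDF has no text layer is to test whether `pdftotext` produces anything other than whitespace. This is a rough heuristic, not a definitive scan detector; the helper name `is_blank_text` and the file name in the usage comment are hypothetical.

```shell
# Hypothetical helper: succeeds when stdin contains no non-whitespace
# characters (form feeds emitted between pages count as whitespace).
is_blank_text() {
  ! grep -q '[^[:space:]]'
}

# Usage, assuming a file named attachments/scan.pdf:
#   pdftotext attachments/scan.pdf - | is_blank_text \
#     && echo "no extractable text, the PDF is probably image-based"
```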

WhatsApp PDF not detected

Verify the message has documentMessage with mimetype: application/pdf. Some file-sharing apps send PDFs as generic files without the correct mimetype.
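If a file did arrive but was not treated as a PDF, checking its first bytes helps separate a mimetype problem from a file that is not a PDF at all. The sketch assumes the file starts with the usual `%PDF-` header, which holds for most but not all PDFs; the helper name `looks_like_pdf` and the file name in the usage comment are hypothetical.

```shell
# Hypothetical helper: succeeds if the file starts with the "%PDF-" header.
looks_like_pdf() {
  [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}

# Usage, assuming a file named attachments/unknown.bin:
#   looks_like_pdf attachments/unknown.bin && echo "content is a PDF"
```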

Install via CLI
npx skills add https://github.com/sbusso/claudeclaw --skill add-pdf-reader