processing-invoices

Extracts, validates, and structures data from PDF invoices with automated validation and error correction loops. Use when processing invoice PDFs, extracting billing data, batch processing invoices, or when accuracy is critical.

By letsgotoplay, updated 2/5/2026

Invoice Processing

Workflow

Invoice Processing:
- [ ] Step 1: Log start time
- [ ] Step 2: Extract PDF text
- [ ] Step 3: Parse invoice fields
- [ ] Step 4: Validate (run validate_invoice.py)
- [ ] Step 5: Fix errors and re-validate if needed
- [ ] Step 6: Save final output AND eval log

Step 1: Log start time

Record the start time for eval tracking:

from datetime import datetime
start_time = datetime.now().isoformat()

Step 2: Extract text

from pypdf import PdfReader

reader = PdfReader("invoice.pdf")
text = ""
for page in reader.pages:
    page_text = page.extract_text()
    if page_text:
        text += page_text + "\n"

Step 3: Parse fields

Extract from text:

  • vendor: Company name (usually at the top of the document; plain-text extraction drops font size, so rely on position rather than typography)
  • invoice_number: Look for "Invoice #", "Invoice No.", "INV-", "#"
  • date: Invoice/billing date -> convert to YYYY-MM-DD
  • total: Final amount ("Total:", "Amount Due:", "Balance:")
  • currency: Default USD if not specified
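The field hints above can be sketched as a small regex-based parser. This is only an illustration: the patterns, the `parse_invoice` and `normalize_date` names, the accepted date formats (including the US month/day order), the symbol-to-currency mapping, and the "first non-empty line is the vendor" rule are all assumptions, not part of the skill.

```python
import re
from datetime import datetime

# Hypothetical patterns built from the label hints listed above.
INVOICE_NO_RE = re.compile(r"Invoice\s*(?:#|No\.?)\s*:?\s*([A-Za-z0-9-]+)", re.IGNORECASE)
INV_FALLBACK_RE = re.compile(r"\b(INV-[A-Za-z0-9-]+)")
DATE_RE = re.compile(r"Date\s*:?\s*(.+)", re.IGNORECASE)
# \b keeps "Subtotal" from matching as "Total".
TOTAL_RE = re.compile(
    r"\b(?:Total|Amount Due|Balance)\s*:?\s*([$€£]?)\s*([\d,]+(?:\.\d{1,2})?)",
    re.IGNORECASE,
)

# Assumed input formats; "%m/%d/%Y" presumes US ordering.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %B %Y")
# Assumed mapping; "$" is treated as USD here.
CURRENCY_SYMBOLS = {"$": "USD", "€": "EUR", "£": "GBP"}


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    raw = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def parse_invoice(text):
    """Extract the five fields from plain invoice text (best effort)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    vendor = lines[0] if lines else None  # assumption: vendor is the first line

    number_match = INVOICE_NO_RE.search(text) or INV_FALLBACK_RE.search(text)
    date_match = DATE_RE.search(text)

    total, currency = None, "USD"  # USD is the documented default
    totals = TOTAL_RE.findall(text)
    if totals:
        symbol, amount = totals[-1]  # the last labelled amount is taken as final
        total = float(amount.replace(",", ""))
        currency = CURRENCY_SYMBOLS.get(symbol, "USD")

    return {
        "vendor": vendor,
        "invoice_number": number_match.group(1) if number_match else None,
        "date": normalize_date(date_match.group(1)) if date_match else None,
        "total": total,
        "currency": currency,
    }
```

Any field left as `None` is a signal to re-read the extracted text by hand rather than trust the patterns.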

Step 4: Validate

Run: python scripts/validate_invoice.py output.json
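When the validator is driven from Python rather than a shell, a thin wrapper like the one below may help. It assumes, without confirmation from the skill, that `validate_invoice.py` exits with status 0 on success and non-zero on failure, and that it prints its error messages; `run_validator` is a hypothetical helper name.

```python
import subprocess
import sys


def run_validator(output_path, script="scripts/validate_invoice.py"):
    """Run the validator script and return (passed, message).

    Assumption: exit status 0 means the invoice passed validation.
    """
    proc = subprocess.run(
        [sys.executable, script, output_path],
        capture_output=True,
        text=True,
    )
    message = (proc.stdout + proc.stderr).strip()
    return proc.returncode == 0, message
```

The returned message is what Step 5 would read to decide which field to re-examine.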

Step 5: Fix and re-validate

If validation fails:

  1. Read the specific error message
  2. Re-examine PDF text for that field
  3. Update JSON with corrected value
  4. Run validation again
  5. Repeat until validation passes
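The loop above might look like this in code. The `validate` and `fix` callables are placeholders for whatever performs validation and correction, and the `max_attempts` cap is an added assumption so that a field that cannot be repaired does not loop forever; the skill itself only says to repeat until validation passes.

```python
def validate_with_retries(invoice, validate, fix, max_attempts=3):
    """Validate, fix, and re-validate until clean or attempts run out.

    validate(invoice) -> list of error strings (empty list means valid)
    fix(invoice, errors) -> corrected invoice dict
    Returns (invoice, attempts_used).
    """
    errors = []
    for attempt in range(1, max_attempts + 1):
        errors = validate(invoice)
        if not errors:
            return invoice, attempt
        if attempt < max_attempts:
            # Correct only what the error messages point at, then retry.
            invoice = fix(invoice, errors)
    raise RuntimeError(
        f"validation still failing after {max_attempts} attempts: {errors}"
    )
```

The attempt count returned here is the kind of detail that fits in the eval log notes, for example "validation passed on first attempt".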

Common issues: See TROUBLESHOOTING.md

Step 6: Save results

Save two files:

  1. Output file (requested by user):
{
  "vendor": "...",
  "invoice_number": "...",
  "date": "YYYY-MM-DD",
  "total": 0.00
}
  2. Eval log (always append to eval_results/all_evals.jsonl):
python scripts/collect_eval.py "<task_id>" "<original_task_prompt>" "<output_file>" "<notes>"

Example:

python scripts/collect_eval.py "invoice-basic" "Extract invoice data from invoice.pdf" "output.json" "validation passed on first attempt"

The log holds one JSON object per line. If eval_results/all_evals.jsonl already exists, append to it rather than overwriting it.
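For readers unfamiliar with the JSONL convention, here is a minimal sketch of appending and reading back one-object-per-line records. It is not the implementation of `collect_eval.py`, whose record schema is not shown here; the helper names and the record fields in the usage note are hypothetical, and this version creates the file when it is missing, which may differ from what the script does.

```python
import json
import os


def append_eval_record(log_path, record):
    """Append one record as a single JSON line, never overwriting earlier ones."""
    parent = os.path.dirname(log_path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # assumption: create the directory if absent
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_eval_records(log_path):
    """Load every record from the log, skipping blank lines."""
    with open(log_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A call such as `append_eval_record("eval_results/all_evals.jsonl", {"task_id": "invoice-basic", "notes": "validation passed on first attempt"})` would add one line per run.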

Output format

{
  "vendor": "Company Name",
  "invoice_number": "INV-2025-001",
  "date": "2025-01-15",
  "total": 1250.00,
  "currency": "USD",
  "line_items": []
}
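Before running the real validator, a quick structural check against the format above can catch obvious slips. The sketch below is not the rule set in VALIDATION.md; the required-field list, the error wording, and the `precheck_invoice` name are assumptions drawn only from the example JSON.

```python
import re
from datetime import datetime

# Assumed required fields, taken from the shorter example in Step 6.
REQUIRED_FIELDS = ("vendor", "invoice_number", "date", "total")


def precheck_invoice(invoice):
    """Return a list of problems found in an invoice dict (empty if none)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if invoice.get(field) in (None, ""):
            errors.append(f"{field}: missing")

    date = invoice.get("date")
    if date:
        ok = isinstance(date, str) and re.fullmatch(r"\d{4}-\d{2}-\d{2}", date)
        if ok:
            try:
                datetime.strptime(date, "%Y-%m-%d")  # rejects dates like 2025-02-30
            except ValueError:
                ok = False
        if not ok:
            errors.append("date: expected YYYY-MM-DD")

    total = invoice.get("total")
    if total not in (None, "") and (
        isinstance(total, bool) or not isinstance(total, (int, float))
    ):
        errors.append("total: expected a number")

    if not isinstance(invoice.get("line_items", []), list):
        errors.append("line_items: expected a list")
    return errors
```

An empty list means the structure looks plausible; it does not replace running `scripts/validate_invoice.py`.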

Validation rules

See VALIDATION.md for complete rules.

Install via CLI
npx skills add https://github.com/letsgotoplay/invoice-skill-eval --skill processing-invoices