project-docling-engineer

Engineer robust Python document-processing systems for this repository. Use when tasks involve Docling-first analysis, OCR/table fallback design, editable DOCX generation, RTL/LTR mixed text handling, project scaffolding, CI/tooling setup, or implementation planning and execution that must prioritize correctness, maintainability, and practical delivery.

By qcdeveloper3-cmd, updated 2/21/2026

Project Docling Engineer

Overview

Plan and implement production-grade changes for Python + Docling + DOCX workflows in this repo. Prioritize architecture clarity, testability, and staged delivery over quick but fragile code.

Workflow

  1. Confirm target outcomes and acceptance criteria before writing code.
  2. Inspect the current state first: directory layout (tree), code search (rg), config files, tests, and CI.
  3. Propose minimal viable architecture changes with explicit tradeoffs.
  4. Implement in thin vertical slices:
    • keep stage boundaries clean (preprocess, analyze, render-docx, validate)
    • use interfaces/adapters for engines and fallbacks
    • keep IR stable and version-aware
  5. Add verification with every slice:
    • unit tests for pure logic and schema
    • CLI smoke tests for orchestration
    • artifact checks for deterministic outputs
  6. Report residual risks and clear next steps.
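
The adapter and fallback idea from step 4 can be sketched as follows. This is a minimal illustration, not code from this repository: `AnalysisResult`, `AnalysisEngine`, `FallbackAnalyzer`, the two stub engines, and the `IR_VERSION` tag are all hypothetical names, and a real Docling-backed engine would sit behind the same interface.

```python
import logging
from dataclasses import dataclass, field
from typing import Protocol

logger = logging.getLogger("analyze")

IR_VERSION = "1.0"  # hypothetical schema version tag carried by every result


@dataclass
class AnalysisResult:
    """Minimal stand-in for the IR: text plus provenance metadata."""
    text: str
    ir_version: str = IR_VERSION
    metadata: dict[str, str] = field(default_factory=dict)


class AnalysisEngine(Protocol):
    """Interface every engine adapter is assumed to implement."""
    name: str

    def analyze(self, path: str) -> AnalysisResult: ...


class FallbackAnalyzer:
    """Tries engines in order and records which one produced the result."""

    def __init__(self, engines: list[AnalysisEngine]) -> None:
        if not engines:
            raise ValueError("at least one engine is required")
        self._engines = engines

    def analyze(self, path: str) -> AnalysisResult:
        errors: list[str] = []
        for engine in self._engines:
            try:
                result = engine.analyze(path)
            except Exception as exc:  # broad on purpose: any failure triggers fallback
                logger.warning("engine %s failed on %s: %s", engine.name, path, exc)
                errors.append(f"{engine.name}: {exc}")
                continue
            # Make the fallback decision observable in the result itself.
            result.metadata["engine"] = engine.name
            result.metadata["fallback_used"] = str(bool(errors)).lower()
            return result
        raise RuntimeError("all engines failed: " + "; ".join(errors))


class FailingEngine:
    """Stub that simulates a primary engine being unavailable."""
    name = "primary"

    def analyze(self, path: str) -> AnalysisResult:
        raise RuntimeError("model unavailable")


class EchoEngine:
    """Stub fallback engine that always succeeds."""
    name = "ocr-fallback"

    def analyze(self, path: str) -> AnalysisResult:
        return AnalysisResult(text=f"text from {path}")


result = FallbackAnalyzer([FailingEngine(), EchoEngine()]).analyze("sample.jpg")
print(result.metadata["engine"])  # ocr-fallback
```

Because the stages depend only on the protocol, each engine can be unit-tested in isolation and the orchestration can be smoke-tested with stubs like the ones above.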

Implementation Rules

  • Keep modules cohesive; avoid monolithic stage files.
  • Preserve editability in DOCX output: prefer native paragraphs/tables/checkbox-like symbols before raster overlays.
  • Preserve geometry explicitly in IR; avoid lossy implicit conversions.
  • Treat mixed-direction text as first-class: store direction metadata on lines/spans/cells.
  • Make fallback behavior explicit and observable in logs/metadata.
  • Avoid hidden global state; pass context/config through stage boundaries.
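
As one possible reading of the geometry and direction rules above, the sketch below stores a bounding box and a direction tag on each span. The `BBox` and `Span` types and the `detect_direction` helper are hypothetical, the coordinate unit and origin are left as an assumption, and the first-strong-character heuristic is a simplification of the full Unicode Bidirectional Algorithm.

```python
import unicodedata
from dataclasses import dataclass
from typing import Literal

Direction = Literal["ltr", "rtl", "neutral"]


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in page coordinates (unit and origin are assumptions)."""
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass(frozen=True)
class Span:
    """A run of text with explicit geometry and direction metadata."""
    text: str
    bbox: BBox
    direction: Direction


def detect_direction(text: str) -> Direction:
    """Return the direction of the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-type or Arabic-type letter
            return "rtl"
        if bidi_class == "L":  # left-to-right letter
            return "ltr"
    return "neutral"  # digits, punctuation, or whitespace only


def make_span(text: str, bbox: BBox) -> Span:
    """Build a span, computing its direction once at construction time."""
    return Span(text=text, bbox=bbox, direction=detect_direction(text))


span = make_span("שלום world", BBox(10.0, 20.0, 110.0, 32.0))
print(span.direction)  # rtl
```

Keeping the direction on the span, rather than recomputing it in the renderer, lets the DOCX stage make its bidi decisions from IR data alone.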

Quality Gates

  • Run lint/format/type/test before finalizing:
    • ruff check src tests
    • ruff format --check src tests
    • mypy src
    • pytest
  • Run CLI smoke checks:
    • python -m docmirror --help
    • python -m docmirror run-all <sample.jpg> -o out
  • Validate that logs and debug artifacts are generated in configured paths.
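
The last check can be automated with a small helper along these lines. It is a sketch under assumptions: `check_artifacts` is a hypothetical function, and the artifact names in the usage example are placeholders rather than paths this project is known to produce.

```python
import hashlib
import tempfile
from pathlib import Path


def check_artifacts(out_dir: Path, expected: list[str]) -> dict[str, str]:
    """Return {name: sha256 hex digest} for each expected file.

    Raises FileNotFoundError if any expected file is missing or empty.
    """
    digests: dict[str, str] = {}
    missing: list[str] = []
    for name in expected:
        path = out_dir / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
            continue
        digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    if missing:
        raise FileNotFoundError(f"missing or empty artifacts in {out_dir}: {missing}")
    return digests


# Usage with a throwaway directory and a placeholder artifact name.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "run.log").write_text("ok")
    print(sorted(check_artifacts(out, ["run.log"])))  # ['run.log']
```

Comparing the returned digests across two runs gives a simple determinism check for byte-stable artifacts such as JSON dumps. DOCX files are zip containers and may embed timestamps, so they may need normalization before a byte-level comparison is meaningful.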

Use Bundled References

  • For Docling integration details and option selection: read references/docling-implementation-guide.md.
  • For engineering behavior and delivery checks: read references/engineering-checklist.md.
  • For DOCX/RTL implementation notes: read references/docx-rtl-notes.md.

Use Bundled Script

  • scripts/run_quality_gate.py runs the standard local quality checks in one command.
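
The contents of the bundled script are not reproduced here. As a hypothetical sketch of what such a runner might look like, the code below executes the four commands from the Quality Gates list and reports a combined exit code; the function names and the injectable `runner` parameter are assumptions made for testability, not the script's actual interface.

```python
import subprocess
from typing import Callable, Sequence

# Commands copied from the Quality Gates list.
GATE_COMMANDS: list[list[str]] = [
    ["ruff", "check", "src", "tests"],
    ["ruff", "format", "--check", "src", "tests"],
    ["mypy", "src"],
    ["pytest"],
]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code (127 if the tool is not installed)."""
    try:
        return subprocess.run(list(cmd), check=False).returncode
    except FileNotFoundError:
        return 127


def run_quality_gate(
    commands: Sequence[Sequence[str]] = GATE_COMMANDS,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> list[tuple[str, int]]:
    """Run every command without short-circuiting; return (command, exit_code) pairs."""
    return [(" ".join(cmd), runner(cmd)) for cmd in commands]


def overall_exit_code(results: Sequence[tuple[str, int]]) -> int:
    """Return 0 only if every gate passed, otherwise 1."""
    return 0 if all(code == 0 for _, code in results) else 1


def format_report(results: Sequence[tuple[str, int]]) -> str:
    """Render one PASS/FAIL line per command."""
    return "\n".join(
        f"{'PASS' if code == 0 else 'FAIL'}  {name}" for name, code in results
    )
```

A script entry point would call `run_quality_gate()`, print `format_report(...)`, and exit with `overall_exit_code(...)`. Running all gates instead of stopping at the first failure surfaces every problem in a single pass.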

Output Expectations

  • Deliver concrete code changes, validation evidence, and a short risk list.
  • Do not stop at abstract advice when implementation is feasible.

Install via CLI

npx skills add https://github.com/qcdeveloper3-cmd/new-chapter --skill project-docling-engineer

Repository Details

  • Branch: main
  • Path: SKILL.md