docling-cli

Convert documents to structured formats with OCR, table extraction, and AI-powered features. Use when working with PDF/DOCX/PPTX/images to Markdown/JSON/HTML conversion, table extraction, OCR on scanned PDFs, vision models for better extraction, or audio transcription.

By mengbo. Updated 3/11/2026

Docling CLI

Convert documents to structured formats with intelligent parsing, OCR, and AI enhancement.

Installation

# Quote the extras so shells such as zsh do not treat the brackets as a glob
uv tool install "docling[asr,vlm]"

Directory Setup

Create working directories in your project root:

mkdir -p import export
  • import/ - Source files to convert
  • export/ - Converted output files

Note: You can specify any output directory with --output, but examples below use export/ for consistency.

Quick Start

List all available options before first use:

docling --help

Basic conversion (most parameters have sensible defaults):

# Convert PDF to Markdown
docling --output export/ import/document.pdf

# Convert to JSON
docling --to json --output export/ import/document.pdf

# Batch convert entire directory
docling --output export/ import/

Default behavior:

  • Input format: auto-detected (PDF, DOCX, PPTX, images, etc.)
  • Output format: Markdown
  • Output directory: current directory
  • OCR: enabled
  • Table extraction: enabled
  • Image export: embedded

Image Export Mode

Choose image export mode based on your needs:

| Mode | Description | When to use |
|------|-------------|-------------|
| placeholder | Only mark image positions | When user doesn't need images |
| embedded | Base64-encoded images (default) | When user needs images but wants single-file output |
| referenced | Export as PNG files, reference in document | When user needs separate image files |

docling --to json --image-export-mode referenced --output export/ import/document.pdf
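
The three modes above can be guarded with a small shell helper that rejects unknown mode names before docling is invoked. This is a minimal sketch: the function name `docling_image_args` is hypothetical, and it assumes that only the three mode names listed above are valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of docling): validate an image export mode
# and print the matching command-line flags.
docling_image_args() {
  local mode="${1:-embedded}"   # embedded is the documented default
  case "$mode" in
    placeholder|embedded|referenced)
      printf '%s %s\n' '--image-export-mode' "$mode"
      ;;
    *)
      echo "unknown image export mode: $mode" >&2
      return 1
      ;;
  esac
}
```

A call could then look like `docling $(docling_image_args referenced) --output export/ import/document.pdf`, with the substitution left unquoted so the flag and its value are passed as two separate words.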

Platform Optimization

macOS / Apple Silicon

On Apple Silicon, the VLM pipeline is recommended for better performance, especially for scanned PDFs or documents with images:

docling --pipeline vlm --output export/ import/document.pdf

The VLM pipeline with its default model also works well for documents whose images contain Chinese content.
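
The platform recommendation can be applied automatically by checking what `uname` reports. The sketch below assumes that Apple Silicon shows up as `Darwin` with architecture `arm64` (a shell running under Rosetta reports `x86_64` instead), and the function name `docling_pipeline_args` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of docling): choose pipeline flags from the
# OS name and CPU architecture. Both arguments default to what uname reports
# on the current machine, so they can be overridden for testing.
docling_pipeline_args() {
  local os="${1:-$(uname -s)}"
  local arch="${2:-$(uname -m)}"
  if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
    printf '%s %s\n' '--pipeline' 'vlm'
  fi
  # On any other platform nothing is printed, so docling keeps its default pipeline.
}
```

Used as `docling $(docling_pipeline_args) --output export/ import/document.pdf`, the same command line works on every platform and only adds `--pipeline vlm` on Apple Silicon.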

Audio Transcription

Convert audio files to text using ASR models:

# Use default whisper_tiny model
docling --pipeline asr --output export/ import/audio.mp3

# For better accuracy with Chinese
docling --pipeline asr --asr-model whisper_large --output export/ import/audio.mp3
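
When import/ holds a mix of documents and audio, the ASR pipeline can be selected per file by extension. This is a sketch under stated assumptions: the helper names `docling_args_for` and `convert_imports` are hypothetical, and the list of audio extensions is an assumption, since only .mp3 appears in the examples above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of docling): print extra flags for one input
# file based on its extension. The audio extension list is an assumption.
docling_args_for() {
  case "$1" in
    *.mp3|*.MP3|*.wav|*.WAV)
      printf '%s %s\n' '--pipeline' 'asr'
      ;;
    *)
      : # documents and images need no extra flags
      ;;
  esac
}

# Hypothetical wrapper: convert every regular file in import/ into export/.
convert_imports() {
  local f
  for f in import/*; do
    [ -f "$f" ] || continue
    # The substitution is intentionally unquoted so that the flag and its
    # value are split into separate arguments (or vanish when empty).
    docling $(docling_args_for "$f") --output export/ "$f"
  done
}
```

Calling `convert_imports` from the project root processes the files one at a time, which trades the speed of a single batch invocation for per-file control over the pipeline.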