name: convert-to-markdown
description: >
  Convert documents to Markdown via the MinerU MCP server (mineru-converter). Supports both URLs and local file paths. Supported formats: PDF, DOC, DOCX, PPT, PPTX, PNG, JPG, JPEG, HTML. Use when the user wants to: (1) convert a file or URL to Markdown, (2) extract text/tables/formulas from a document, (3) parse or read a PDF/DOC/PPT/image, (4) analyze document content, or (5) OCR an image or scanned PDF. This skill handles both local files and URLs. For URL-only conversion via the Smithery-deployed TypeScript server, use the mineru-convert skill instead.
Document to Markdown Conversion
Convert documents to Markdown using the mineru-converter MCP tools. Supports URLs and local file paths. The server auto-detects file type and configures optimal settings (model, OCR, etc.).
MCP Tools Reference
Primary tool — use this by default
convert_to_markdown: complete workflow (submit → poll → download). Parameters:
- url (required): URL or local file path
- output_path (required): local path to save the result zip (e.g., ./temp/report.zip)
- model_version (optional, default "vlm"): auto-detected; "MinerU-HTML" for HTML files
- max_wait_seconds (optional, default 300): increase for large documents (e.g., 600 for 100+ page PDFs)
- poll_interval (optional, default 10)

convert_pdf_to_markdown is an identical alias.
Step-by-step tools — use when finer control is needed
create_parse_task: submit a parsing task. Extra parameters:
- is_ocr (default false): force OCR on; auto-enabled for images
- enable_formula (default true): formula recognition
- enable_table (default true): table recognition
- Returns task_id (URL input) or batch_id (local file input)

get_task_status: poll task progress. Pass task_id or batch_id.

download_result: download the result zip. Parameters: zip_url, output_path.
Use step-by-step tools when the user needs to: disable formula/table recognition, force OCR, submit multiple tasks in parallel, or check a previously submitted task.
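As a rough illustration of that decision, the sketch below picks a tool and builds its argument payload. The helper name plan_tool_call is hypothetical, and it assumes create_parse_task accepts the same url parameter as convert_to_markdown, which the reference above implies but does not state outright.

```python
from typing import Any, Dict, Tuple


def plan_tool_call(
    source: str,
    output_path: str,
    force_ocr: bool = False,
    enable_formula: bool = True,
    enable_table: bool = True,
) -> Tuple[str, Dict[str, Any]]:
    """Pick the MCP tool and arguments for one conversion request.

    Hypothetical helper: only the tool and parameter names come from
    the reference above; the function itself is illustrative.
    """
    needs_control = force_ocr or not enable_formula or not enable_table
    if not needs_control:
        # Default path: one call handles submit, poll, and download.
        return "convert_to_markdown", {"url": source, "output_path": output_path}
    # Finer control: submit only, then poll and download separately.
    # Assumption: create_parse_task takes `url` like the primary tool.
    return "create_parse_task", {
        "url": source,
        "is_ocr": force_ocr,
        "enable_formula": enable_formula,
        "enable_table": enable_table,
    }
```

With default flags the helper stays on the primary tool; changing any flag switches to the step-by-step path, after which get_task_status and download_result would be called separately.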
Workflow
1. Determine output path
- If the user provided an output path, use it (ensure it ends in .zip).
- Otherwise, derive from the filename: ./temp/<filename_without_ext>.zip
  - URL example: https://example.com/report.pdf → ./temp/report.zip
  - Local example: C:\docs\slides.pptx → ./temp/slides.zip
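The derivation rule above can be sketched in Python as follows. The function name derive_output_path and the out_dir parameter are illustrative assumptions, not part of the MCP server.

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import urlparse


def derive_output_path(source: str, out_dir: str = "./temp") -> str:
    """Map a URL or local path to ./temp/<filename_without_ext>.zip.

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        # Use only the URL path so query strings are ignored.
        name = PurePosixPath(parsed.path).name
    else:
        # PureWindowsPath splits on both "\" and "/", so it handles
        # Windows-style and POSIX-style local paths alike.
        name = PureWindowsPath(source).name
    stem = PureWindowsPath(name).stem
    return f"{out_dir}/{stem}.zip"
```

For the two examples above, this returns ./temp/report.zip and ./temp/slides.zip respectively.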
2. Call convert_to_markdown
Pass url (the URL or local file path as-is) and output_path. The server handles:
- Local file upload via batch API automatically
- HTML → MinerU-HTML model, images → OCR enabled
- Large PDFs: >600 pages uses page ranges; >200MB splits the file into chunks
- Polling until completion or timeout
For large documents (100+ pages), set max_wait_seconds to 600 or higher.
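A minimal sketch of assembling the call arguments, assuming the page count is known in advance (the skill does not say how it would be obtained). The helper build_convert_args and its page_count parameter are hypothetical; the keys url, output_path, and max_wait_seconds are the tool parameters listed earlier.

```python
from typing import Any, Dict, Optional


def build_convert_args(
    source: str, output_path: str, page_count: Optional[int] = None
) -> Dict[str, Any]:
    """Build arguments for convert_to_markdown (illustrative helper)."""
    # Pass the source as-is; the server decides between URL and upload.
    args: Dict[str, Any] = {"url": source, "output_path": output_path}
    if page_count is not None and page_count >= 100:
        # Large documents: raise the timeout above the 300 s default.
        args["max_wait_seconds"] = 600
    return args
```

Leaving max_wait_seconds out for small documents lets the server default apply.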
3. Extract and read
After the zip downloads:
- Unzip to a sibling directory (same name without .zip): unzip -o ./temp/report.zip -d ./temp/report
- Find .md files inside the extracted directory.
- Read the Markdown content.
- Note any files the Markdown references; ignore all other files (JSON, content_list, etc.).
- Do NOT delete the zip file.
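Where a shell is unavailable, the same extraction can be sketched with the Python standard library. The function name extract_and_find_markdown is an illustrative assumption; the behavior mirrors the steps above (sibling directory, Markdown files only, zip left in place).

```python
import zipfile
from pathlib import Path
from typing import List


def extract_and_find_markdown(zip_path: str) -> List[Path]:
    """Extract a result zip next to itself and list its .md files."""
    zip_file = Path(zip_path)
    # ./temp/report.zip -> ./temp/report (sibling directory)
    target_dir = zip_file.with_suffix("")
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_file) as archive:
        # Overwrites existing files, like `unzip -o`.
        archive.extractall(target_dir)
    # The zip itself is intentionally not deleted.
    return sorted(target_dir.rglob("*.md"))
```

The returned paths can then be read with Path.read_text(encoding="utf-8").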
4. Respond
- Analysis requested: Read the Markdown and answer the user's questions or provide a summary.
- Output path specified: Confirm save location and show the exact path of the .md file(s).
- No specific request: Show the extracted Markdown path and offer to analyze the content.
Rules
- Always use relative paths in shell commands. Never cd to an absolute path.
- Always auto-extract zip files after download. Do not delete the zip.
- Pass local file paths directly to the MCP tool — do not attempt to upload manually.
Error Handling
- Task creation / upload failure: Report the API error message.
- Timeout: Return the task_id or batch_id and suggest calling get_task_status to check later. Consider retrying with a higher max_wait_seconds.
- Download / extraction failure: Return the full_zip_url for manual download.
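To make the timeout case concrete, here is a generic polling sketch that gives up once max_wait_seconds would be exceeded. It is not the server's implementation: poll_until_done and the is_done callable are hypothetical, and in practice is_done would wrap a get_task_status call whose response format is not described here.

```python
import time
from typing import Callable


def poll_until_done(
    is_done: Callable[[], bool],
    max_wait_seconds: float = 300,
    poll_interval: float = 10,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once is_done() succeeds, False on timeout."""
    deadline = clock() + max_wait_seconds
    while True:
        if is_done():
            return True
        if clock() + poll_interval > deadline:
            # Timed out: the caller should report the task_id or
            # batch_id so the task can be checked again later.
            return False
        sleep(poll_interval)
```

On a False result, the response would include the task_id or batch_id rather than discarding the task, since parsing may still finish on the server.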