glm-ocr-skill

OCR and layout-parsing skill for reading local files or remote PDF/PNG/JPG/JPEG links and producing markdown plus local image assets. Use this skill whenever the agent needs to read PDF content, OCR images/screenshots, or perform document layout parsing.

By cs-qyzhang, updated 3/23/2026

name: glm-ocr-skill
description: OCR and layout-parsing skill for reading local files or remote PDF/PNG/JPG/JPEG links and producing markdown plus local image assets. Use this skill whenever the agent needs to read PDF content, OCR images/screenshots, or perform document layout parsing.
homepage: https://github.com/cs-qyzhang/glm-ocr-skill
metadata: { "openclaw": { "emoji": "👓", "requires": { "bins": ["python3"], "env": ["GLM_API_KEY"] }, "primaryEnv": "GLM_API_KEY" } }

OCR File/Image Extractor

Use this skill when the task needs:

  • Reading PDF text content
  • OCR for screenshots/images
  • Layout-aware parsing into Markdown

Use scripts/glm_ocr_extract.py as the default execution path.

First-Time Setup

Before first use, check whether <skill-dir>/.env exists.

If .env does not exist:

  1. Treat the skill as not initialized.
  2. Copy .env.example to .env.
  3. Ask the user to edit the new .env file manually and set GLM_API_KEY in it, and show them the file's exact absolute path.

Example command:

cp .env.example .env
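
The setup check above can be sketched in Python as follows. This is a minimal illustration, not the skill's own code: the helper name `ensure_env_file` is hypothetical, and it assumes `.env.example` sits directly in the skill directory.

```python
import shutil
from pathlib import Path


def ensure_env_file(skill_dir):
    """Return (absolute path to .env, whether it already existed).

    Hypothetical helper: mirrors the manual steps, it is not part of the skill.
    """
    skill_dir = Path(skill_dir).resolve()
    env_path = skill_dir / ".env"
    if env_path.exists():
        return env_path, True
    # Not initialized yet: copy the template so the user has a file to edit.
    shutil.copyfile(skill_dir / ".env.example", env_path)
    return env_path, False
```

When the second return value is `False`, the agent would show the returned absolute path to the user and wait for them to fill in the file before running OCR.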

Run

python3 scripts/glm_ocr_extract.py <local-file-or-url> [--output-dir <dir>]
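
If the script is driven from another Python program, the command line above could be assembled roughly like this. The function names `build_command` and `run_extract` are illustrative assumptions; only the script path and the `--output-dir` option come from this document.

```python
import subprocess


def build_command(source, output_dir=None):
    """Build the argv list for the documented command line."""
    cmd = ["python3", "scripts/glm_ocr_extract.py", source]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    return cmd


def run_extract(source, output_dir=None, skill_dir="."):
    """Run the extractor from the skill directory; raises on a non-zero exit."""
    return subprocess.run(
        build_command(source, output_dir), cwd=skill_dir, check=True
    )
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a file path or URL contains spaces or shell metacharacters.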

Input

  • Local files: pdf, png, jpg, jpeg
  • Remote links: http(s) URLs to the same file types
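
A caller may want to reject unsupported inputs before invoking the script. The sketch below is one way to do that, assuming that a remote link's path ends in one of the listed extensions; whether the script itself applies such a check is not stated here, and `classify_input` is a hypothetical name.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

SUPPORTED_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg"}


def classify_input(source):
    """Return "url" or "file"; raise ValueError for unsupported file types."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        # Look only at the URL path, so query strings do not affect the suffix.
        suffix = PurePosixPath(parsed.path).suffix.lower()
        kind = "url"
    else:
        suffix = Path(source).suffix.lower()
        kind = "file"
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported input type: {source!r}")
    return kind
```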

Outputs

  • result.md: markdown with remote image links rewritten to local relative paths
  • images/: downloaded image assets referenced by result.md
  • response.json: raw OCR API response for debugging or structured post-processing, including layout block details such as bbox_2d, label, and content
  • result.raw.md: original markdown returned by OCR service
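
For structured post-processing, the layout blocks in response.json can be collected without knowing the exact nesting of the response. The sketch below assumes only that each layout block is a JSON object carrying a `bbox_2d` key, as described above; the surrounding structure is not documented here, so the code walks the whole tree. `iter_layout_blocks` and `load_layout_blocks` are hypothetical names.

```python
import json


def iter_layout_blocks(node):
    """Yield every dict that has a "bbox_2d" key, at any nesting depth."""
    if isinstance(node, dict):
        if "bbox_2d" in node:
            yield node
        for value in node.values():
            yield from iter_layout_blocks(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_layout_blocks(item)


def load_layout_blocks(path="response.json"):
    """Read the raw OCR response and return its layout blocks as a list."""
    with open(path, encoding="utf-8") as f:
        return list(iter_layout_blocks(json.load(f)))
```

Each returned block can then be filtered by its `label` or cropped using its `bbox_2d` coordinates, depending on the task.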

Env

Requires GLM_API_KEY, set either as an environment variable or in .env.
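
The key lookup can be pictured with the sketch below. It assumes that the environment variable takes precedence over .env and that .env uses plain `NAME=value` lines; neither detail is confirmed by this document, and `read_api_key` is a hypothetical helper rather than the script's actual loader.

```python
import os
from pathlib import Path


def read_api_key(env_file=".env", environ=None):
    """Return GLM_API_KEY from the environment, falling back to a .env file."""
    environ = os.environ if environ is None else environ
    key = environ.get("GLM_API_KEY")
    if key:
        return key  # assumed precedence: environment first
    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks, comments, and lines that are not assignments.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "GLM_API_KEY":
                value = value.strip().strip("\"'")
                if value:
                    return value
    raise RuntimeError("GLM_API_KEY is not set in the environment or .env")
```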

Install via CLI
npx skills add https://github.com/cs-qyzhang/glm-ocr-skill --skill glm-ocr-skill