image-ocr-local-AIPC

SKILL.md

Image OCR (Windows · GLM-OCR · llama.cpp Vulkan)

Model: ggml-org/GLM-OCR-GGUF (Q8_0, HuggingFace / hf-mirror)
Inference: llama-cli (llama.cpp Vulkan prebuilt)
SKILL_VERSION: v1.0

Directory Structure (auto-created or user-specified)

<OCR_DIR>\                        ← auto-selected drive or user-specified (e.g. C:\image-ocr or D:\image-ocr)
├── llama.cpp\                    ← llama-cli.exe and related binaries
└── models\
    └── GLM-OCR-GGUF\
        ├── GLM-OCR-Q8_0.gguf        ← main model (~950 MB)
        └── mmproj-GLM-OCR-Q8_0.gguf ← vision projection layer (~484 MB, required)
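
The layout above can be checked before running OCR. The following is a minimal sketch, not the skill's own setup script: the repo id and file names are taken from this document, while the helper names (`model_dir`, `missing_model_files`, `download_missing`) are hypothetical. The download step assumes the `huggingface_hub` package is installed and network access is available.

```python
from pathlib import Path

# Repo id and file names come from the skill description; the helpers are illustrative.
REPO_ID = "ggml-org/GLM-OCR-GGUF"
MODEL_FILES = ("GLM-OCR-Q8_0.gguf", "mmproj-GLM-OCR-Q8_0.gguf")


def model_dir(ocr_dir):
    """Return <OCR_DIR>/models/GLM-OCR-GGUF as a Path."""
    return Path(ocr_dir) / "models" / "GLM-OCR-GGUF"


def missing_model_files(ocr_dir):
    """List the required GGUF files that are not yet present on disk."""
    target = model_dir(ocr_dir)
    return [name for name in MODEL_FILES if not (target / name).is_file()]


def download_missing(ocr_dir):
    """Fetch missing files with huggingface_hub (imported lazily; needs network)."""
    from huggingface_hub import hf_hub_download

    target = model_dir(ocr_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name in missing_model_files(ocr_dir):
        # local_dir places the file directly in the target folder.
        hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=target)
```

Calling `missing_model_files(r"D:\image-ocr")` on a fresh install would list both files. Both are required, so an empty list is the signal that the model directory is ready.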

Dependencies: The model files (GLM-OCR-Q8_0.gguf, mmproj-GLM-OCR-Q8_0.gguf) are downloaded with Python's huggingface_hub CLI (hf download) or with modelscope. If Python is not installed,

Installs: 8
First Seen: Apr 23, 2026