gemini-ocr-cli

When to use

Use this skill when a task needs Gemini-based file analysis that is driven by a custom prompt from a local terminal workflow, and the host model's own multimodal handling is not reliable or controllable enough.

Use it especially when an agent must:

  • analyze a local image or PDF with a custom prompt,
  • OCR a local image into faithful Markdown,
  • OCR a local PDF into faithful Markdown,
  • preserve page structure, tables, and labels as well as possible,
  • work through a deterministic CLI instead of ad-hoc multimodal prompting.

Installs: 3
First seen: Apr 11, 2026