extracting-mistral-ocr

Installation
SKILL.md

Mistral OCR PDF extraction

Quick start (default)

Run the bundled script to OCR a local PDF and write Markdown + JSON outputs:

python {baseDir}/scripts/mistral_ocr_extract.py --input path/to/file.pdf --out out/ocr

Output directory layout:

  • combined.md (all pages concatenated)
  • pages/page-000.md (per-page markdown)
  • raw_response.json (full OCR response)
  • images/ (decoded embedded images, if requested)
  • tables/ (separate tables, if requested)

Workflow

Related skills
Installs
59
GitHub Stars
1
First Seen
Feb 27, 2026