paddleocr-text-recognition
Installation
Summary
Extract text from images, PDFs, and documents via PaddleOCR API with structured JSON output.
- Supports URLs and local file paths for images and PDFs; returns complete recognized text in JSON format
- Mandatory API-only approach: executes `python scripts/ocr_caller.py` with `--file-url` or `--file-path` parameters
- Requires initial configuration with `PADDLEOCR_OCR_API_URL` and `PADDLEOCR_ACCESS_TOKEN`; displays full extracted text without truncation or summarization
- Handles authentication, rate limiting, and empty results; stops immediately on API failure without fallback methods
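The invocation described in the bullets above can be sketched roughly as follows. Only the script path, the two flags, and the two environment variable names come from this page; the helper names (`missing_config`, `build_ocr_command`, `run_ocr`) are hypothetical, and the sketch assumes the script exits non-zero on API failure and prints only JSON to stdout, neither of which is confirmed here.

```python
import json
import os
import subprocess
import sys
from urllib.parse import urlparse

# Variable names taken from the summary above.
REQUIRED_ENV = ("PADDLEOCR_OCR_API_URL", "PADDLEOCR_ACCESS_TOKEN")


def missing_config(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


def build_ocr_command(source, python=sys.executable):
    """Pick --file-url for http(s) sources and --file-path for everything else."""
    scheme = urlparse(source).scheme.lower()
    flag = "--file-url" if scheme in ("http", "https") else "--file-path"
    return [python, "scripts/ocr_caller.py", flag, source]


def run_ocr(source):
    """Hypothetical wrapper: run the script once and stop on any failure."""
    missing = missing_config()
    if missing:
        raise RuntimeError("Missing configuration: " + ", ".join(missing))
    # ASSUMPTION: the script signals API failure with a non-zero exit code.
    # check=True then raises CalledProcessError; no fallback is attempted.
    result = subprocess.run(
        build_ocr_command(source), capture_output=True, text=True, check=True
    )
    # ASSUMPTION: stdout contains only the JSON result.
    return json.loads(result.stdout)
```

Letting the exception propagate, rather than catching it and trying another OCR method, mirrors the "stop immediately on API failure" rule.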
SKILL.md
PaddleOCR Text Recognition Skill
When to Use This Skill
Trigger keywords (routing): Bilingual trigger terms (Chinese and English) are listed in the YAML description above—use that field for discovery and routing.
Use this skill to:
- Extract text from images (screenshots, photos, scans)
- Extract text from PDFs or document images when the goal is line/box-level text, not recovering table grids, formulas, or full reading-order layout
- Extract text from URLs or local files that point to images/PDFs
Do not use for:
- Plain text files, code files, or markdown documents that can be read directly as text
- Documents with tables, formulas, charts, or complex layouts — use Document Parsing instead
- Tasks that do not involve image-to-text conversion
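As a rough illustration of the routing rules above, a pre-check on the file extension might look like the sketch below. The function name `route`, the extension sets, and the return labels are all assumptions made for illustration; none of them come from the skill itself.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# ASSUMPTION: illustrative extension sets, not an official list.
TEXT_LIKE = {".txt", ".md", ".py", ".json", ".csv"}
OCR_CANDIDATES = {".png", ".jpg", ".jpeg", ".pdf"}


def route(source):
    """Classify a URL or local path by its file extension only."""
    # For URLs, ignore the query string and fragment when reading the suffix.
    path = urlparse(source).path if "://" in source else source
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in TEXT_LIKE:
        return "read-directly"   # plain text, code, markdown: no OCR needed
    if suffix in OCR_CANDIDATES:
        return "ocr-candidate"   # image or PDF: OCR may apply
    return "unknown"
```

An extension check cannot tell whether a PDF or image contains tables, formulas, or a complex layout, so an `"ocr-candidate"` result still needs the Document Parsing decision described above.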