paddleocr-text-recognition

Summary

Extract text from images, PDFs, and documents via the PaddleOCR API, with structured JSON output.

  • Supports URLs and local file paths for images and PDFs; returns complete recognized text in JSON format
  • Mandatory API-only approach: executes python scripts/ocr_caller.py with --file-url or --file-path parameters
  • Requires initial configuration with PADDLEOCR_OCR_API_URL and PADDLEOCR_ACCESS_TOKEN; displays full extracted text without truncation or summarization
  • Handles authentication, rate limiting, and empty results; stops immediately on API failure without fallback methods
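
The invocation described above can be sketched as follows. This is a minimal illustration, not the skill's own code: the helper names `missing_config`, `build_command`, and `run_ocr` are hypothetical, and the sketch assumes that `scripts/ocr_caller.py` prints its JSON result to stdout and exits non-zero on failure, neither of which is stated in the summary.

```python
import json
import os
import subprocess

# Environment variables the summary says must be configured first.
REQUIRED_ENV = ("PADDLEOCR_OCR_API_URL", "PADDLEOCR_ACCESS_TOKEN")


def missing_config(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


def build_command(source):
    """Pick --file-url for http(s) sources and --file-path for everything else."""
    is_url = source.startswith(("http://", "https://"))
    flag = "--file-url" if is_url else "--file-path"
    return ["python", "scripts/ocr_caller.py", flag, source]


def run_ocr(source):
    """Run the caller script once; raise instead of falling back to another method."""
    missing = missing_config()
    if missing:
        raise RuntimeError("Missing configuration: " + ", ".join(missing))
    # Assumption: the script writes JSON to stdout and signals failure via exit code.
    result = subprocess.run(
        build_command(source), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)
```

For example, `build_command("./scan.pdf")` yields a `--file-path` invocation, while an `https://` source yields `--file-url`. Letting `run_ocr` raise on a failed call mirrors the "stop immediately, no fallback" behavior the summary describes.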
SKILL.md

PaddleOCR Text Recognition Skill

When to Use This Skill

Trigger keywords (routing): bilingual trigger terms (Chinese and English) are listed in the YAML description above; use that field for discovery and routing.

Use this skill for:

  • Extracting text from images (screenshots, photos, scans)
  • Extracting text from PDFs or document images when the goal is line/box-level text, not recovering table grids, formulas, or full reading-order layout
  • Extracting text from URLs or local files that point to images/PDFs

Do not use for:

  • Plain text files, code files, or markdown documents that can be read directly as text
  • Documents with tables, formulas, charts, or complex layouts — use Document Parsing instead
  • Tasks that do not involve image-to-text conversion
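
The use / do-not-use rules above can be approximated with a small routing helper. This is only a sketch under stated assumptions: the function `suggest_route`, its return labels, and both suffix sets are hypothetical and are not taken from the skill, which does not list supported file formats here.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed suffix sets for illustration only; the skill does not enumerate formats.
OCR_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff", ".webp", ".pdf"}
TEXT_SUFFIXES = {".txt", ".md", ".py", ".json", ".csv"}


def suggest_route(source, needs_layout=False):
    """Map a URL or file path to a suggested handler label.

    needs_layout=True stands for tables, formulas, charts, or complex layouts,
    which the skill says belong to Document Parsing instead.
    """
    # For URLs, look only at the path component so query strings are ignored.
    path = urlparse(source).path if "://" in source else source
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in TEXT_SUFFIXES:
        return "read-directly"  # plain text needs no image-to-text step
    if suffix in OCR_SUFFIXES:
        return "document-parsing" if needs_layout else "text-recognition"
    return "unknown"
```

A suffix check is a coarse heuristic: whether a PDF needs layout recovery depends on the user's goal, so `needs_layout` has to come from the request rather than from the file itself.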

Installs: 3.2K
GitHub Stars: 22
First Seen: Feb 9, 2026