PaddleOCR Document Parsing

Parse images and PDF files using PaddleOCR's API. Supports both synchronous and asynchronous parsing modes with structured output.

Resource Links

Resource	Link
Official Website	https://www.paddleocr.com
API Documentation	https://ai.baidu.com/ai-doc/AISTUDIO/Cmkz2m0ma
GitHub	https://github.com/PaddlePaddle/PaddleOCR

- Multi-format support: PDF and image files (JPG, PNG, BMP, TIFF)
- Two parsing modes:
  - Sync mode: fast response for small files (request timeout under 600 s)
  - Async mode: for large files, with progress polling
- Layout analysis: automatic detection of text blocks, tables, and formulas
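
The choice between the two parsing modes, and the polling loop that async mode implies, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not PaddleOCR's actual client code: the helper names (`is_supported`, `choose_mode`, `poll_until_done`), the 5 MB size cut-off, and the `{"state": ...}` status shape are all hypothetical. The real endpoints and response fields are described in the API documentation linked above.

```python
import time
from pathlib import Path
from typing import Callable

# File types named in the feature list (PDF, JPG, PNG, BMP, TIFF).
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".png", ".bmp", ".tiff"}

# Hypothetical cut-off: the source does not say what counts as a "small" file.
SYNC_MAX_BYTES = 5 * 1024 * 1024


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def choose_mode(size_bytes: int, sync_max_bytes: int = SYNC_MAX_BYTES) -> str:
    """Pick "sync" for small files and "async" for everything larger."""
    return "sync" if size_bytes <= sync_max_bytes else "async"


def poll_until_done(
    get_status: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call get_status() repeatedly until the job finishes or time runs out.

    get_status is a caller-supplied function that returns a dict with a
    "state" key (assumed values: "running", "done", "failed").
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status.get("state") in ("done", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job did not finish within {timeout_s} s")
        sleep(interval_s)
```

In practice, `get_status` would wrap whatever status request the real API exposes. Passing it in as a callable keeps the polling logic independent of the HTTP layer and easy to test with canned responses.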