universal-pdf-vision-parser
SKILL.md
Universal PDF Vision Parser Skill
Version: 0.1
This skill is a high-end multilingual document digitizer. It uses multimodal vision to 'look' at each PDF page, making it perfect for language learning notes, bilingual documents, and complex layouts that standard OCR fails to capture.
Prerequisites
- DashScope API Key: A valid key from Alibaba Cloud Bailian with
qwen-vl-maxaccess. - Environment:
pip install pymupdf dashscope
Usage
Basic Command