skills/skills.volces.com/universal-pdf-vision-parser

universal-pdf-vision-parser

SKILL.md

Universal PDF Vision Parser Skill

Version: 0.1

This skill is a high-end multilingual document digitizer. It uses multimodal vision to 'look' at each PDF page, making it perfect for language learning notes, bilingual documents, and complex layouts that standard OCR fails to capture.

Prerequisites

  1. DashScope API Key: A valid key from Alibaba Cloud Bailian with qwen-vl-max access.
  2. Environment:
pip install pymupdf dashscope

Usage

Basic Command

Installs
6
First Seen
Apr 7, 2026