doc-parser
Document Parser
Overview
Parse complex documents containing tables, figures, multi-column layouts, headers, and mixed content using IBM's docling library. This skill goes beyond simple text extraction by understanding document structure, detecting layout regions, and preserving the logical reading order across complex formatting.
Instructions
When a user asks to parse a complex document or extract structured content from a document with tables, figures, or multi-column layouts, follow these steps:
Step 1: Install docling
pip install docling
Step 2: Load and convert the document
Use docling's DocumentConverter to parse the document:
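The example for this step appears to be missing from the page, so the following is a minimal sketch rather than the skill's original code. It assumes docling's documented basic usage: `DocumentConverter().convert(source)` returns a result whose `document` can be exported with `export_to_markdown()`. The helper name `parse_document` and its optional `converter` parameter are hypothetical, added only to keep the sketch easy to test.

```python
def parse_document(source, converter=None):
    """Convert a document and return its content as Markdown.

    `source` is a local file path or URL. `converter` is a hypothetical
    injection point (not part of docling) that lets callers reuse one
    converter across many documents or substitute a stub in tests.
    """
    if converter is None:
        # Imported lazily so this module loads even before docling is installed.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    # convert() runs layout analysis and returns a conversion result;
    # result.document holds the structured representation of the input.
    result = converter.convert(source)

    # Markdown export keeps headings, tables, and reading order in a text form.
    return result.document.export_to_markdown()
```

With docling installed, `print(parse_document("report.pdf"))` would print the parsed content, where `report.pdf` is a placeholder path. Reusing a single converter across calls is likely cheaper than creating one per document, since the converter loads its models on first use.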