doc-intelligence-promotion

Document Intelligence Promotion

A single-pass extraction step followed by a multi-stage post-processing pipeline.

Note: This pipeline uses pdfplumber to extract one document at a time; it is not a batch tool. For batch text extraction across the corpus, call pdftotext via subprocess instead (see the pdf/pdftotext-poppler sub-skill).
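For the batch path mentioned in the note, a minimal sketch of calling pdftotext through subprocess might look like the following. The function names, the flat `*.pdf` directory layout, and the `.txt` output naming are assumptions made for illustration; the pdf/pdftotext-poppler sub-skill may do this differently.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path: Path, layout: bool = True) -> list[str]:
    """Build the argv for one pdftotext call that writes text to stdout."""
    cmd = ["pdftotext"]
    if layout:
        cmd.append("-layout")  # ask pdftotext to keep the physical page layout
    cmd += [str(pdf_path), "-"]  # "-" as the output file means stdout
    return cmd


def extract_corpus(pdf_dir: Path, out_dir: Path) -> list[Path]:
    """Run pdftotext over every PDF in pdf_dir; return the text files written.

    Hypothetical helper: assumes a flat directory of *.pdf files.
    """
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext (poppler-utils) was not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        # check=True raises CalledProcessError if pdftotext fails on a file
        result = subprocess.run(
            pdftotext_command(pdf_path),
            capture_output=True,
            check=True,
        )
        target = out_dir / (pdf_path.stem + ".txt")
        target.write_bytes(result.stdout)
        written.append(target)
    return written
```

Keeping the command builder separate from the subprocess call makes the argv easy to inspect without poppler installed.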

Architecture

PDF/DOCX → parser (single read) → manifest.yaml
                            deep_extract.py (post-processors):
                            ├── table_exporter.py → CSV files
                            ├── worked_example_parser.py → pytest files
                            └── chart_extractor.py → images + metadata YAML
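The fan-out in the diagram, where one parsed manifest feeds several independent post-processors, could be sketched roughly as below. This is not the actual deep_extract.py: the manifest keys (`tables`, `rows`), the registry dict, and the output file names are all hypothetical, and the manifest is assumed to be already loaded from manifest.yaml into a plain dict. Only a table exporter is shown; the worked-example and chart steps would register the same way.

```python
import csv
from pathlib import Path
from typing import Callable

# Hypothetical manifest shape; the real manifest.yaml schema is not shown here.
Manifest = dict


def export_tables(manifest: Manifest, out_dir: Path) -> list[Path]:
    """Write each table found in the manifest to its own CSV file."""
    written = []
    for index, table in enumerate(manifest.get("tables", [])):
        target = out_dir / f"table_{index}.csv"
        with target.open("w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(table["rows"])
        written.append(target)
    return written


# Registry of post-processors; each one reads the manifest, never the source PDF.
POST_PROCESSORS: dict[str, Callable[[Manifest, Path], list[Path]]] = {
    "tables": export_tables,
}


def deep_extract(manifest: Manifest, out_dir: Path) -> dict[str, list[Path]]:
    """Run every registered post-processor against a single parsed manifest."""
    out_dir.mkdir(parents=True, exist_ok=True)
    return {name: step(manifest, out_dir) for name, step in POST_PROCESSORS.items()}
```

Because each step takes only the manifest and an output directory, the source document is read once and new post-processors can be added without touching the parser.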
First Seen
Jun 1, 2026
doc-intelligence-promotion — vamseeachanta/workspace-hub