nano-pdf
Purpose
This skill provides tools for PDF processing in document workflows: text extraction, text mining, form filling, manipulation, and OCR integration.
When to Use
Use this skill for tasks involving PDF data extraction (e.g., from scanned documents), text analysis in reports, automating form submissions, merging/splitting files, or applying OCR to non-text PDFs. Apply it in data pipelines, document automation scripts, or when integrating with OCR services for unstructured data.
Key Capabilities
- Text extraction: Pulls plain text or structured data from PDFs, supporting encrypted files with passwords; uses OCR via Tesseract integration for image-based PDFs.
- Text mining: Analyzes extracted text for keywords, sentiment, or patterns; e.g., counts occurrences of phrases in a document.
- Form filling: Populates interactive PDF forms with JSON data; supports flattening forms to static PDFs.
- Manipulation: Merges, splits, rotates, or watermarks PDFs; handles up to 500-page documents efficiently.
- OCR integration: Converts scanned PDFs to searchable text using an external OCR engine; requires Tesseract or a similar engine to be installed and configured.
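The phrase-counting example mentioned under text mining can be sketched in plain Python. This is a minimal illustration that assumes the text has already been extracted to a string; `count_phrases` and the sample text are hypothetical names chosen for this sketch and are not part of the skill's own interface.

```python
import re
from collections import Counter


def count_phrases(text: str, phrases: list[str]) -> Counter:
    """Count case-insensitive, whole-word occurrences of each phrase."""
    counts = Counter()
    # Collapse runs of whitespace so a phrase broken across lines still matches.
    normalized = re.sub(r"\s+", " ", text)
    for phrase in phrases:
        # re.escape keeps any punctuation in the phrase literal.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        counts[phrase] = len(re.findall(pattern, normalized, flags=re.IGNORECASE))
    return counts


# Hypothetical extracted text; extracted PDF text often has hard line breaks.
sample = "Net revenue rose.\nNet\nrevenue is up; gross revenue fell."
print(count_phrases(sample, ["net revenue", "gross revenue", "margin"]))
```

Normalizing whitespace before matching is a design choice for extracted text, where a phrase is frequently split across lines by the page layout.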
Usage Patterns
Invoke the skill from the CLI for quick scripts, or through the API for server-side integration. For batch processing, chain commands in a shell script; in web apps, issue one API call per document in a loop. Always specify input and output paths explicitly. The usual pattern is to extract text first, then mine or manipulate it as needed. For OCR-heavy tasks, preprocess the images before running PDF operations.
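The extract-then-mine pattern with explicit input and output paths might look like the sketch below. The skill's actual CLI and API calls are not shown in this document, so the extraction and mining steps are passed in as caller-supplied functions; `plan_batch`, `run_batch`, `extract`, and `mine` are hypothetical names used only for illustration.

```python
from pathlib import Path
from typing import Callable


def plan_batch(input_dir: Path, output_dir: Path) -> list[tuple[Path, Path]]:
    """Pair every .pdf file in input_dir with an explicit .txt output path."""
    return [
        (pdf, output_dir / (pdf.stem + ".txt"))
        for pdf in sorted(input_dir.glob("*.pdf"))
    ]


def run_batch(
    pairs: list[tuple[Path, Path]],
    extract: Callable[[Path], str],
    mine: Callable[[str], dict],
) -> dict:
    """Extract text first, save it to its output path, then mine it."""
    results = {}
    for pdf_path, txt_path in pairs:
        # Step 1: extraction (assumption: supplied by the caller, e.g. a
        # wrapper around whatever extraction command the skill exposes).
        text = extract(pdf_path)
        txt_path.parent.mkdir(parents=True, exist_ok=True)
        txt_path.write_text(text, encoding="utf-8")
        # Step 2: mining runs on the already-extracted text.
        results[pdf_path.name] = mine(text)
    return results
```

Building the list of path pairs before doing any work keeps every input and output location explicit, and makes the plan easy to log or review before a long batch run.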