document-processing
Process, extract, and manipulate PDF, Excel, Word, and PowerPoint documents programmatically.
- Supports four major office formats (PDF, XLSX, DOCX, PPTX) with format-specific tools: pypdf and pdfplumber for PDFs, openpyxl and pandas for Excel, python-docx for Word, python-pptx for PowerPoint
- Core operations include text and table extraction, document merging and splitting, format conversion, and OCR for scanned PDFs
- Excel-specific guidance emphasizes writing formulas rather than static values for dynamic calculations, plus financial modeling conventions (color-coded text and fills)
- Word documents support tracked changes via XML editing for professional redlining; PowerPoint covers slide structure, speaker notes, and design principles for consistent layouts
Document Processing Guide
Work with office documents: PDF, Excel, Word, and PowerPoint.
Format Overview
| Format | Extension | Structure | Best For |
|---|---|---|---|
| PDF | .pdf | Binary/text | Reports, forms, archives |
| Excel | .xlsx | XML in ZIP | Data, calculations, models |
| Word | .docx | XML in ZIP | Text documents, contracts |
| PowerPoint | .pptx | XML in ZIP | Presentations, slides |
Key concept: XLSX, DOCX, and PPTX are all ZIP archives containing XML files. You can unzip them to access raw content.
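As a minimal sketch of that idea, the snippet below uses only Python's standard `zipfile` and `xml.etree.ElementTree` modules to list the parts of a package and parse one of them. The helper names (`list_parts`, `read_xml_part`, `make_demo_package`) are hypothetical and not taken from the guide, and the demo archive is a tiny in-memory stand-in that imitates a .docx layout, not a valid Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_parts(source):
    """Return the names of all parts (files) inside a ZIP-based package."""
    with zipfile.ZipFile(source) as package:
        return package.namelist()


def read_xml_part(source, part_name):
    """Parse one XML part from the package and return its root element."""
    with zipfile.ZipFile(source) as package:
        with package.open(part_name) as part:
            return ET.parse(part).getroot()


def make_demo_package():
    """Build a small in-memory ZIP that imitates a .docx layout.

    Illustration only: a real .docx contains more parts and namespaced XML.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr(
            "word/document.xml",
            "<document><body><p>Hello</p></body></document>",
        )
    buffer.seek(0)
    return buffer


demo = make_demo_package()
parts = list_parts(demo)

demo.seek(0)  # rewind before reopening the same in-memory file
root = read_xml_part(demo, "word/document.xml")

print(parts)
print(root.find("body/p").text)
```

Both helpers also accept a file path, so the same calls should work on a real document on disk. In real files the main content usually lives in `word/document.xml` (Word), `xl/workbook.xml` (Excel), or `ppt/presentation.xml` (PowerPoint), and element tags are namespace-qualified, so lookups such as `find` need the namespace.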