pdf


PDF Processing

Gotchas

  • Never use Unicode subscript/superscript characters (₁, ², etc.) in ReportLab PDFs. The built-in fonts don't include these glyphs, so they render as solid black boxes.
  • OCR requires the tesseract system binary, not just pip install pytesseract. On macOS: brew install tesseract.
  • Watermark PDFs must have transparent backgrounds. merge_page() composites the watermark's content onto the page, so a watermark with an opaque white background will cover the document.
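
The first two gotchas can be guarded against with a couple of small helpers. This is a minimal sketch, not part of the skill itself: the names to_reportlab_markup and tesseract_available are hypothetical, and the first helper assumes the text will be rendered through a ReportLab Paragraph, which understands inline <sub> and <super> tags.

```python
import shutil

# Unicode superscript/subscript characters mapped to their plain equivalents.
_SUPERSCRIPTS = dict(zip("⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻", "0123456789+-"))
_SUBSCRIPTS = dict(zip("₀₁₂₃₄₅₆₇₈₉₊₋", "0123456789+-"))


def to_reportlab_markup(text: str) -> str:
    """Replace Unicode sub/superscripts with Paragraph-style inline tags.

    Hypothetical helper: the output is meant for reportlab.platypus.Paragraph.
    """
    out = []
    for ch in text:
        if ch in _SUPERSCRIPTS:
            out.append(f"<super>{_SUPERSCRIPTS[ch]}</super>")
        elif ch in _SUBSCRIPTS:
            out.append(f"<sub>{_SUBSCRIPTS[ch]}</sub>")
        else:
            out.append(ch)
    return "".join(out)


def tesseract_available() -> bool:
    """Return True if a `tesseract` executable is found on PATH.

    Hypothetical helper: pytesseract only wraps the binary, so installing
    the Python package alone is not enough for OCR to work.
    """
    return shutil.which("tesseract") is not None
```

For example, to_reportlab_markup("H₂O") yields "H<sub>2</sub>O". The tags are only interpreted by Paragraph; passing the result to canvas.drawString() would print them literally.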

Instructions

Step 1: Identify the Operation

Determine what the user needs: read/extract text, merge, split, rotate, create, fill forms, OCR, watermark, encrypt/decrypt, or extract images. If the task involves filling a PDF form, read references/forms.md and follow its instructions instead of continuing here.

Step 2: Choose the Right Tool

| Task | Best Tool | Command/Code |
|------|-----------|--------------|
| Merge PDFs | pypdf | writer.add_page(page) |
| Split PDFs | pypdf | One page per file |
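
The merge and split rows could look roughly like the sketch below. It assumes pypdf is installed (pip install pypdf); the function names merge_pdfs, split_pdf, and page_filename, along with the output naming scheme, are illustrative choices rather than anything the skill prescribes.

```python
from pathlib import Path


def page_filename(stem: str, index: int) -> str:
    """Hypothetical naming scheme: 0-based page index -> '<stem>_page_<n>.pdf'."""
    return f"{stem}_page_{index + 1}.pdf"


def merge_pdfs(inputs, output):
    """Append every page of each input PDF, in order, to one output file."""
    # Imported lazily so this module loads even where pypdf is missing.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in inputs:
        for page in PdfReader(path).pages:
            writer.add_page(page)
    with open(output, "wb") as fh:
        writer.write(fh)


def split_pdf(source, out_dir):
    """Write each page of `source` to its own single-page PDF; return the paths."""
    from pypdf import PdfReader, PdfWriter

    source, out_dir = Path(source), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, page in enumerate(PdfReader(source).pages):
        writer = PdfWriter()  # a fresh writer per page gives one page per file
        writer.add_page(page)
        target = out_dir / page_filename(source.stem, index)
        with open(target, "wb") as fh:
            writer.write(fh)
        written.append(target)
    return written
```

Calling split_pdf("report.pdf", "pages") would write pages/report_page_1.pdf, pages/report_page_2.pdf, and so on; passing the returned list to merge_pdfs reassembles the pages in their original order.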
Installs: 22
First Seen: Feb 28, 2026