pdf

Summary

Comprehensive PDF processing with text extraction, merging, splitting, form filling, and OCR capabilities.

  • Supports core operations: merge/split PDFs, extract text and tables, rotate pages, add watermarks, encrypt/decrypt, and extract images
  • Includes Python libraries (pypdf, pdfplumber, reportlab) and command-line tools (qpdf, pdftotext, pdftk) with ready-to-use code examples
  • Handles scanned PDFs via OCR using pytesseract and pdf2image for searchable text extraction
  • Dedicated form-filling workflow documented in FORMS.md; advanced features and JavaScript alternatives covered in REFERENCE.md
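As a rough illustration of the merge and split operations listed above, the following is a minimal sketch that assumes pypdf is installed. The helper names (parse_page_ranges, extract_pages, merge_pdfs) and the "1-3,5" range syntax are hypothetical choices made for this example; they are not taken from the skill itself.

```python
def parse_page_ranges(spec, total_pages):
    """Turn a 1-based spec such as "1-3,5" into sorted zero-based page indices.

    The spec syntax is an assumption made for this sketch.
    """
    indices = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start
        if start < 1 or end > total_pages or start > end:
            raise ValueError(f"page range {part!r} is outside 1-{total_pages}")
        indices.update(range(start - 1, end))
    return sorted(indices)


def extract_pages(src, dst, spec):
    """Split: copy the pages named by spec from src into a new PDF at dst."""
    from pypdf import PdfReader, PdfWriter  # imported lazily; needs pypdf

    reader = PdfReader(src)
    writer = PdfWriter()
    indices = parse_page_ranges(spec, len(reader.pages))
    for index in indices:
        writer.add_page(reader.pages[index])
    with open(dst, "wb") as handle:
        writer.write(handle)
    return len(indices)


def merge_pdfs(sources, dst):
    """Merge: append every page of each source PDF, in order, into dst."""
    from pypdf import PdfReader, PdfWriter  # imported lazily; needs pypdf

    writer = PdfWriter()
    for src in sources:
        for page in PdfReader(src).pages:
            writer.add_page(page)
    with open(dst, "wb") as handle:
        writer.write(handle)
```

For example, extract_pages("document.pdf", "excerpt.pdf", "1-3,5") would write pages 1, 2, 3, and 5 of document.pdf to excerpt.pdf; the file names here are placeholders.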
SKILL.md

PDF Processing Guide

Overview

This guide covers essential PDF processing operations using Python libraries and command-line tools. For advanced features, JavaScript libraries, and detailed examples, see REFERENCE.md. If you need to fill out a PDF form, read FORMS.md and follow its instructions.

Quick Start

from pypdf import PdfReader, PdfWriter

# Read a PDF
reader = PdfReader("document.pdf")
print(f"Pages: {len(reader.pages)}")

# Extract text
text = ""
for page in reader.pages:
    text += page.extract_text()
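
The Quick Start loop can be wrapped in a reusable function. The sketch below assumes pypdf is installed; the names join_page_texts, looks_scanned, and extract_pdf_text are hypothetical helpers introduced for this example. It separates pages with blank lines and flags documents where no page yields text, which is often (though not always) a sign of a scanned PDF that would need the OCR route instead.

```python
def join_page_texts(page_texts):
    """Join per-page text with blank lines, skipping empty or missing pages."""
    cleaned = [(text or "").strip() for text in page_texts]
    return "\n\n".join(text for text in cleaned if text)


def looks_scanned(page_texts):
    """Heuristic (an assumption of this sketch): no page produced any text."""
    return not any((text or "").strip() for text in page_texts)


def extract_pdf_text(path):
    """Return (text, scanned_flag) for the PDF at path using pypdf."""
    from pypdf import PdfReader  # imported lazily; needs pypdf

    reader = PdfReader(path)
    page_texts = [page.extract_text() for page in reader.pages]
    return join_page_texts(page_texts), looks_scanned(page_texts)
```

A caller might use it as text, scanned = extract_pdf_text("document.pdf") and fall back to OCR only when scanned is true.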
First Seen
Jan 20, 2026