PDF Processing Guide
Overview
This guide covers essential PDF processing operations using Python libraries and command-line tools. For advanced features, JavaScript libraries, and detailed examples, see REFERENCE.md. If you need to fill out a PDF form, read FORMS.md and follow its instructions.
Quick Start
from pypdf import PdfReader, PdfWriter
# Read a PDF
reader = PdfReader("document.pdf")
print(f"Pages: {len(reader.pages)}")
# Extract text
text = ""
for page in reader.pages:
    text += page.extract_text()