PDF OCR

Overview

Extract readable text from scanned or image-based PDF documents using optical character recognition (OCR). This skill converts PDF pages to images, runs OCR to detect text, and outputs clean, structured text. It handles multi-page documents, multiple languages, and low-quality scans (with preprocessing).
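
The pipeline described above (pages to images, OCR, cleaned text) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: it assumes the pdf2image and pytesseract packages are available (with Poppler and the Tesseract binary installed), and the names ocr_pdf and clean_ocr_text are hypothetical.

```python
import re
from typing import List


def clean_ocr_text(raw: str) -> str:
    """Normalize OCR output: trim lines, collapse runs of spaces and blank lines."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in raw.splitlines()]
    text = "\n".join(lines)
    # Collapse three or more consecutive newlines into one paragraph break.
    return re.sub(r"\n{3,}", "\n\n", text).strip()


def ocr_pdf(pdf_path: str, lang: str = "eng", dpi: int = 300) -> List[str]:
    """Return one cleaned text string per page (hypothetical helper)."""
    # Imported lazily so clean_ocr_text stays usable without the OCR stack.
    from pdf2image import convert_from_path  # assumed dependency, needs Poppler
    import pytesseract  # assumed dependency, needs the Tesseract binary

    pages = convert_from_path(pdf_path, dpi=dpi)  # list of PIL images, one per page
    return [
        clean_ocr_text(pytesseract.image_to_string(page, lang=lang))
        for page in pages
    ]
```

For example, ocr_pdf("scan.pdf", lang="deu") would return one cleaned string per page of a German scan, provided the matching Tesseract language data is installed. The file name here is only a placeholder.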

Instructions

When a user asks to OCR a scanned PDF or extract text from an image-based PDF, follow these steps:

Step 1: Check if OCR is actually needed

First, attempt normal text extraction. If the PDF already contains selectable text, OCR is unnecessary:

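import pdfplumber

def check_text_content(pdf_path):
    """Return True if the PDF already has extractable text."""
    with pdfplumber.open(pdf_path) as pdf:
        # Assumed completion: treat the PDF as text-based if any page yields non-empty text.
        return any((page.extract_text() or "").strip() for page in pdf.pages)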
