[IMPORTANT] Use TaskCreate to break ALL work into small tasks BEFORE starting, including a task for each file read. This prevents context loss from long files. For simple tasks, the AI MUST ask the user whether to skip this step.

Quick Summary

Goal: Convert PDF files to well-formatted Markdown, auto-detecting whether each PDF contains native text or is a scanned document. Only native-text conversion is currently implemented; OCR is planned.

Workflow:

Auto-Detect: Determine whether the PDF has native text or needs OCR
Convert: Run scripts/convert.cjs with the input path and optional mode and output flags
Output: The script returns JSON with the success status, page count, and output path
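
The Convert and Output steps can be sketched as a small Node wrapper. This is an illustration under stated assumptions, not the script's documented interface: the positional input path, the `--output` flag name, and the JSON field names (`success`, `pageCount`, `outputPath`, `error`) are all hypothetical, so check scripts/convert.cjs for the real ones. Only `--mode auto` comes from this document.

```javascript
// Sketch only: the "--output" flag and the JSON field names are assumptions.
const { execFileSync } = require("node:child_process");

// Build the argument list passed to `node` for scripts/convert.cjs.
function buildArgs(inputPath, { mode = "auto", outputPath } = {}) {
  const args = ["scripts/convert.cjs", inputPath, "--mode", mode];
  if (outputPath) args.push("--output", outputPath); // hypothetical flag name
  return args;
}

// Parse the JSON printed by the script into one predictable shape.
function parseResult(stdout) {
  let data;
  try {
    data = JSON.parse(stdout);
  } catch (err) {
    return { ok: false, error: `invalid JSON: ${err.message}` };
  }
  if (!data || !data.success) {
    return { ok: false, error: (data && data.error) || "conversion failed" };
  }
  // Field names below are assumed, not confirmed by the script.
  return { ok: true, pages: data.pageCount, outputPath: data.outputPath };
}

// Run the conversion. execFileSync throws if the process exits non-zero.
function convertPdf(inputPath, options) {
  const stdout = execFileSync("node", buildArgs(inputPath, options), {
    encoding: "utf8",
  });
  return parseResult(stdout);
}
```

Calling `convertPdf("report.pdf")` would run the script in the default auto mode. Keeping argument building and result parsing as separate functions lets each be checked without a real PDF on disk.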

Key Rules:

Use --mode auto (the default) to let the tool decide between native-text extraction and OCR
OCR for scanned PDFs requires additional tesseract.js setup (OCR support is planned, not yet implemented)
Structure may not be preserved perfectly for complex multi-column layouts
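
To show what an auto mode might decide on, here is a minimal detection heuristic. It is an assumption about one common approach, not the logic confirmed for scripts/convert.cjs. The function name `detectPdfKind`, the per-page text array, and the threshold of 50 characters are all hypothetical.

```javascript
// Heuristic sketch, not the tool's confirmed logic: treat a PDF as native
// text when its pages yield enough extractable characters on average.
// pageTexts: array of strings, one per page, from any text extractor (assumed).
// minCharsPerPage: arbitrary illustrative threshold.
function detectPdfKind(pageTexts, minCharsPerPage = 50) {
  if (pageTexts.length === 0) return "scanned";
  // Count non-whitespace characters only, so blank pages contribute nothing.
  const total = pageTexts.reduce(
    (sum, text) => sum + text.replace(/\s/g, "").length,
    0
  );
  return total / pageTexts.length >= minCharsPerPage ? "native" : "scanned";
}
```

A scanned PDF typically exposes little or no extractable text, so a low average character count is a reasonable signal that OCR would be needed.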

pdf-to-markdown
