gemini-document-processing

Gemini Document Processing

Process and analyze PDF documents using Google Gemini's native vision capabilities. Extract structured information, summarize content, answer questions, and understand complex documents with text, images, diagrams, charts, and tables.

Core Capabilities

  • PDF Vision Processing: Native understanding of PDFs up to 1,000 pages, with each page counted as 258 tokens
  • Multimodal Analysis: Process text, images, diagrams, charts, and tables
  • Structured Extraction: Output to JSON with schema validation
  • Document Q&A: Answer questions based on document content
  • Summarization: Generate summaries preserving context
  • Format Conversion: Transcribe to HTML while preserving layout
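
The page limit and per-page token cost listed above make it easy to budget a document before sending it. The following is a minimal sketch that uses only those two figures; the function name `estimate_pdf_tokens` and the constants are illustrative, not part of the skill or of any Gemini SDK.

```python
# Figures taken from the capability list above.
TOKENS_PER_PAGE = 258
MAX_PAGES = 1000


def estimate_pdf_tokens(page_count: int) -> int:
    """Return the approximate token cost of a PDF with `page_count` pages.

    Hypothetical helper: raises ValueError when the page count is negative
    or exceeds the 1,000-page limit quoted above.
    """
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    if page_count > MAX_PAGES:
        raise ValueError(f"PDF exceeds the {MAX_PAGES}-page limit")
    return page_count * TOKENS_PER_PAGE


if __name__ == "__main__":
    print(estimate_pdf_tokens(120))  # 30960
```

The estimate covers only the document pages; the prompt text and the model's response add to the total.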

When to Use This Skill

Use this skill when you need to:

  • Extract structured data from PDF documents (invoices, resumes, forms)
  • Summarize long documents or reports
  • Answer questions about PDF content
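
For the use cases above, a request typically pairs the PDF bytes with a text instruction. The sketch below only builds a request body and sends nothing. The field names (`contents`, `parts`, `inline_data`, `mime_type`, `data`, `text`) reflect the Gemini REST request shape as best understood here and should be checked against the current API reference; `build_pdf_request` is a hypothetical helper name.

```python
import base64
import json


def build_pdf_request(pdf_bytes: bytes, prompt: str) -> dict:
    """Assemble a request body containing an inline PDF and a text prompt.

    Assumption: inline files are passed as base64 text with an explicit
    MIME type, alongside a plain-text part carrying the instruction.
    """
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"inline_data": {"mime_type": "application/pdf", "data": encoded}},
                    {"text": prompt},
                ]
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real file read with open(path, "rb").
    body = build_pdf_request(b"%PDF-1.4 placeholder", "Summarize this report.")
    print(json.dumps(body, indent=2))
```

Changing only the prompt string switches between extraction, summarization, and question answering; the document part stays the same.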
First Seen
Mar 1, 2026