deepseek-ocr

Installation
SKILL.md

DeepSeek-OCR

Skill by ara.so — Daily 2026 Skills collection.

DeepSeek-OCR is a vision-language model for Optical Character Recognition with "Contexts Optical Compression." It supports native and dynamic resolutions, multiple prompt modes (document-to-markdown, free OCR, figure parsing, grounding), and can be run via vLLM (high-throughput) or HuggingFace Transformers. It processes images and PDFs, outputting structured text or markdown.


Installation

Prerequisites

  • CUDA 11.8+, PyTorch 2.6.0
  • Python 3.12.9 (via conda recommended)

Setup

git clone https://github.com/deepseek-ai/DeepSeek-OCR.git
cd DeepSeek-OCR
Related skills
Installs
1.1K
GitHub Stars
4
First Seen
Mar 21, 2026