vision-language-models

Installation
SKILL.md

Vision Language Models (2026)

Integrate vision capabilities from leading multimodal models for image understanding, document analysis, and visual reasoning.

Overview

  • Image captioning and description generation
  • Visual question answering (VQA)
  • Document/chart/diagram analysis with OCR
  • Multi-image comparison and reasoning
  • Bounding box detection and region analysis
  • Video frame analysis

Model Comparison (January 2026)

Installs
4
GitHub Stars
193
First Seen
Jan 21, 2026
vision-language-models — yonatangross/skillforge-claude-plugin