Agentic Vision - The Sandwich Architecture

Version: 1.0.0 | Last Updated: 2026-01-30


What is Agentic Vision?

Agentic Vision in Gemini 3 Flash turns image understanding from a single static pass into an agentic process that combines visual reasoning with code execution.

Think → Act → Observe loop:
1. THINK: Analyze image, formulate plan
2. ACT: Generate and execute Python code (crop, measure, annotate)
3. OBSERVE: Process results, refine understanding
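The loop above can be sketched in a few lines of plain Python. This is only an illustrative skeleton, not the Gemini API: `Step`, `run_python`, `agentic_loop`, and `stub_think` are hypothetical names, the "model" is a hard-coded stub, and the image is a toy nested list rather than real pixel data.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Step:
    thought: str          # THINK: the plan (or final answer) for this turn
    code: Optional[str]   # ACT: Python to run, or None when ready to answer


def run_python(code: str, image: Any) -> Any:
    """ACT: run generated code; by convention it assigns to `result`."""
    namespace = {"image": image}
    exec(code, namespace)  # a real system would run this in a sandbox
    return namespace.get("result")


def agentic_loop(
    image: Any,
    think: Callable[[List[Any]], Step],
    max_turns: int = 5,
) -> Tuple[Optional[str], List[Any]]:
    """Repeat Think -> Act -> Observe until the model stops emitting code."""
    observations: List[Any] = []
    for _ in range(max_turns):
        step = think(observations)                          # THINK
        if step.code is None:
            return step.thought, observations
        observations.append(run_python(step.code, image))   # ACT + OBSERVE
    return None, observations  # turn budget exhausted without an answer


def stub_think(observations: List[Any]) -> Step:
    """Hypothetical stand-in for the model: measure first, then answer."""
    if not observations:
        return Step("Measure the image width first.", "result = len(image[0])")
    return Step(f"The image is {observations[-1]} pixels wide.", None)


toy_image = [[0] * 8 for _ in range(4)]  # 4 rows x 8 columns
answer, trace = agentic_loop(toy_image, stub_think)
print(answer)
```

The `max_turns` cap is a design choice in this sketch so that a model that never stops emitting code cannot loop forever.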

Key capability: Instead of guessing that the padding is p-4 (16px in Tailwind's default spacing scale), it measures the pixels and returns the actual value, for example 24px.
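As a rough illustration of measuring rather than guessing, the sketch below counts background pixels around the content of a synthetic grayscale image. It is the kind of code the ACT step might generate, not what the model actually emits: `measure_padding` and `to_tailwind_class` are hypothetical helpers, the image is a nested list rather than a real screenshot, and the class mapping assumes Tailwind's default scale of 4px per spacing unit.

```python
from typing import Dict, List


def measure_padding(pixels: List[List[int]], background: int = 255) -> Dict[str, int]:
    """Count background pixels between each edge and the content bounding box."""
    height, width = len(pixels), len(pixels[0])
    rows = [y for y, row in enumerate(pixels) if any(p != background for p in row)]
    cols = [x for x in range(width) if any(row[x] != background for row in pixels)]
    if not rows:
        raise ValueError("no content pixels found")
    return {
        "top": rows[0],
        "left": cols[0],
        "bottom": height - 1 - rows[-1],
        "right": width - 1 - cols[-1],
    }


def to_tailwind_class(px: int) -> str:
    """Map a pixel value to a padding class, assuming 4px per spacing unit."""
    if px % 4 != 0:
        return f"p-[{px}px]"  # arbitrary-value syntax for off-scale sizes
    return f"p-{px // 4}"


# Synthetic 100x80 white box with dark content inset 24px on every side.
WIDTH, HEIGHT, INSET = 100, 80, 24
image = [
    [
        0 if INSET <= x < WIDTH - INSET and INSET <= y < HEIGHT - INSET else 255
        for x in range(WIDTH)
    ]
    for y in range(HEIGHT)
]

padding = measure_padding(image)
print(padding["left"], to_tailwind_class(padding["left"]))
```

Under these assumptions a measured 24px corresponds to p-6, so a guess of p-4 would have been off by 8px.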
