AI Understanding Images
Use DSPy's dspy.Image type to pass images into signatures alongside text. A vision-capable LM can then return structured data extracted from photos, screenshots, documents, and charts.
Step 1 - Understand the image task
Before writing code, ask:
- What images will you process? (URLs, local files, base64, cloud storage?)
- What do you need to extract? (text, categories, attributes, descriptions?)
- Does the output need to be structured? (typed fields vs. free text?)
- Are you processing images in batch or one at a time?
- Does the task require reasoning about the image, or just direct extraction?
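The answer to the first question decides how an image gets loaded, so it can help to make it explicit in code. The sketch below uses only the Python standard library: classify_image_source and file_to_data_uri are hypothetical helpers, not part of DSPy, and the set of cloud schemes (s3, gs) is an assumption that you would adjust to your own storage.

```python
import base64
import mimetypes
from pathlib import Path
from urllib.parse import urlparse


def classify_image_source(source: str) -> str:
    """Return 'data_uri', 'url', 'cloud', or 'file' for an image reference."""
    if source.startswith("data:image/"):
        return "data_uri"
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return "url"
    # Assumed cloud storage schemes; extend this tuple as needed.
    if scheme in ("s3", "gs"):
        return "cloud"
    # Anything else is treated as a local file path.
    return "file"


def file_to_data_uri(path: str) -> str:
    """Encode a local image file as a base64 data URI."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Routing every input through one classifier like this keeps batch jobs predictable: URLs and data URIs can usually be passed along directly, while local files and cloud objects need a read or download step first.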