gemini-vision

Gemini Vision API Skill

This skill enables Claude to use Google's Gemini API for advanced image understanding tasks including captioning, classification, visual question answering, object detection, segmentation, and multi-image analysis.

Quick Start

Prerequisites

Get API Key: Obtain from Google AI Studio
Install SDK: pip install google-genai (Python 3.9+)

If pip is not installed, the skill instructs the user to install it first.
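The prerequisite checks above could be sketched roughly as follows. This is a minimal illustration, not the skill's own code: the helper names (`python_is_supported`, `sdk_is_installed`, `missing_prerequisites`) are hypothetical, and the only facts taken from the text are the Python 3.9+ requirement and the `google-genai` package, which is imported as `google.genai`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the prerequisites


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter meets the 3.9+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def sdk_is_installed(module_name="google.genai"):
    """Return True when the module can be located on the import path."""
    try:
        # find_spec imports parent packages but not the module itself.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (here `google`) is missing or malformed.
        return False


def missing_prerequisites():
    """Collect human-readable messages for every unmet prerequisite."""
    problems = []
    if not python_is_supported():
        problems.append("Python 3.9 or newer is required")
    if not sdk_is_installed():
        problems.append("SDK not found; run: pip install google-genai")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

An empty list from `missing_prerequisites()` would mean both checks passed; anything else can be shown to the user before any API call is attempted.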

API Key Configuration

The skill supports both Google AI Studio and Vertex AI endpoints.

Option 1: Google AI Studio (Default)

The skill checks for GEMINI_API_KEY in this order:
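The lookup order itself is cut off in this copy, so the sketch below covers only a plain environment-variable lookup. Treat it as an assumption rather than the skill's actual order: the function name `resolve_api_key` is hypothetical, and only the variable name `GEMINI_API_KEY` comes from the text.

```python
import os

ENV_VAR = "GEMINI_API_KEY"  # variable name taken from the text above


def resolve_api_key(environ=None):
    """Return the API key from the given mapping (default: os.environ).

    Assumption: only the process environment is consulted here; any
    further fallback locations the skill may check are not shown.
    """
    env = os.environ if environ is None else environ
    key = (env.get(ENV_VAR) or "").strip()
    if not key:
        raise RuntimeError(
            f"{ENV_VAR} is not set; obtain a key from Google AI Studio and export it."
        )
    return key
```

With the SDK installed, the resolved key could then be passed to the client, for example `genai.Client(api_key=resolve_api_key())` after `from google import genai`.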

Repository: aia-11-hn-mib/m…iewaibot
First Seen: Feb 20, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn)
