image-generation
Image Generation with Gemini
Use this skill when the user asks to generate or edit images with Gemini using the Python SDK. Default to gemini-3-pro-image-preview, and mention gemini-2.5-flash-image only as an optional faster/cheaper alternative.
Workflow
- Identify task type (text-to-image, edit, or multi-reference).
- Ensure GEMINI_API_KEY is available (env or stored in .env), then use the Python SDK. This will make network requests to the Gemini API.
- Choose model + output (response_modalities=["IMAGE"] if image-only) and run. Generation can take ~30 seconds; allow 30–60 seconds before retrying.
- Save returned images with part.as_image(); if none, report a clear error.
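The first steps of the workflow above might look like the following minimal sketch, assuming the google-genai package is installed. The helper names load_api_key and request_image_parts are hypothetical, and the .env parsing is a deliberately simple KEY=VALUE reader rather than a full dotenv implementation.

```python
import os
from pathlib import Path

DEFAULT_MODEL = "gemini-3-pro-image-preview"
FAST_MODEL = "gemini-2.5-flash-image"  # optional faster/cheaper alternative


def load_api_key(env_file=".env"):
    """Return GEMINI_API_KEY from the environment, else from a simple .env file."""
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        return key
    path = Path(env_file)
    if path.is_file():
        for line in path.read_text().splitlines():
            name, sep, value = line.strip().partition("=")
            if sep and name.strip() == "GEMINI_API_KEY":
                # Drop optional surrounding quotes.
                return value.strip().strip("'\"")
    raise RuntimeError("GEMINI_API_KEY not found in the environment or in .env")


def request_image_parts(prompt, model=DEFAULT_MODEL):
    """Send a text-to-image request and return the response parts.

    This makes a network request to the Gemini API and can take ~30 seconds.
    """
    # Imported here so the key check above works without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=load_api_key())
    response = client.models.generate_content(
        model=model,
        contents=[prompt],
        # Image-only output, as described in the workflow.
        config=types.GenerateContentConfig(response_modalities=["IMAGE"]),
    )
    return list(response.parts or [])
```

Passing model=FAST_MODEL switches to the faster/cheaper alternative. Each returned part that carries image data can then be saved with part.as_image().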
Use these references
- references/python.md for Python SDK usage
Response handling (Python SDK)
Use part.as_image() to access image outputs and save them. If no image parts are returned, surface a clear error and suggest checking the API key, model name, and response modalities.
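As a rough sketch of this handling, the hypothetical helper below saves every image part and raises a descriptive error when none are present. It only assumes that each part exposes inline_data and as_image(), and that the returned image has a save() method. The output directory, file stem, and .png extension are illustrative choices, not something the SDK requires.

```python
from pathlib import Path


def save_image_parts(parts, out_dir="outputs", stem="image"):
    """Save every image part to out_dir; raise a clear error if there are none."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for part in parts or []:
        # Text-only parts carry no inline_data, so skip them.
        if getattr(part, "inline_data", None) is None:
            continue
        path = out / f"{stem}_{len(saved) + 1}.png"  # assumed extension
        part.as_image().save(str(path))
        saved.append(path)
    if not saved:
        raise RuntimeError(
            "No image parts returned. Check GEMINI_API_KEY, the model name, "
            "and response_modalities."
        )
    return saved
```

A caller would pass the parts of a generate_content response and report the raised message to the user if nothing was saved.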