imagegen
Generate or edit images via OpenAI's API with a bundled CLI for deterministic, reproducible runs.
- Supports three workflows: generate new images, edit existing images (inpainting, masking, background replacement, object removal), and batch runs across multiple prompts or variants
- Defaults to `gpt-image-1.5` and requires `OPENAI_API_KEY` for live API calls; uses the bundled `scripts/image_gen.py` CLI for all operations
- Includes a structured decision tree and prompt augmentation template to classify requests (e.g., product mockup, UI mockup, logo design, photorealistic scenes) and map them to sensible defaults
- Handles multi-image edits with explicit invariants, temporary JSONL batching under `tmp/imagegen/`, and output organization under `output/imagegen/`
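As a rough illustration of the temporary JSONL batching mentioned above, the sketch below writes one JSON object per line under `tmp/imagegen/` and reads it back. The helper names, the `batch.jsonl` file name, and the `"prompt"` key are assumptions made for this example; the actual record schema expected by `scripts/image_gen.py` is not shown in this document.

```python
import json
from pathlib import Path


def write_batch_file(prompts, tmp_dir="tmp/imagegen", name="batch.jsonl"):
    """Write one JSON object per line (JSONL).

    The "prompt" key and the file name are assumptions for illustration,
    not the documented schema of the bundled CLI.
    """
    tmp_path = Path(tmp_dir)
    tmp_path.mkdir(parents=True, exist_ok=True)
    batch_file = tmp_path / name
    with batch_file.open("w", encoding="utf-8") as fh:
        for prompt in prompts:
            fh.write(json.dumps({"prompt": prompt}) + "\n")
    return batch_file


def read_batch_file(batch_file):
    """Parse a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(batch_file).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Keeping each job on its own line means a batch can be inspected or partially re-run with ordinary text tools, which fits the goal of reproducible runs.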
Image Generation Skill
Generates or edits images for the current project (for example website assets, game assets, UI mockups, product mockups, wireframes, logo design, photorealistic images, or infographics).
Top-level modes and rules
This skill has exactly two top-level modes:
- Default built-in tool mode (preferred): built-in `image_gen` tool for normal image generation and editing. Does not require `OPENAI_API_KEY`.
- Fallback CLI mode (explicit-only): `scripts/image_gen.py` CLI. Use only when the user explicitly asks for the CLI path. Requires `OPENAI_API_KEY`.
Within the explicit CLI fallback only, the CLI exposes three subcommands:
- `generate`
- `edit`
- `generate-batch`
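A minimal sketch of how a caller might assemble a command line for the three subcommands listed above. Only the script path and the subcommand names come from this document; `build_cli_argv` is a hypothetical helper, and any further arguments are passed through untouched because the CLI's flags are not described here.

```python
import sys

# Script path and subcommand names as given in this document.
CLI_SCRIPT = "scripts/image_gen.py"
SUBCOMMANDS = ("generate", "edit", "generate-batch")


def build_cli_argv(subcommand, extra_args=()):
    """Return an argv list for the fallback CLI (hypothetical helper).

    extra_args is forwarded verbatim; this sketch does not assume
    any particular flags exist on the real script.
    """
    if subcommand not in SUBCOMMANDS:
        raise ValueError(
            f"unknown subcommand {subcommand!r}; expected one of {SUBCOMMANDS}"
        )
    return [sys.executable, CLI_SCRIPT, subcommand, *extra_args]
```

The resulting list could be handed to `subprocess.run`, which avoids shell quoting issues when prompts contain spaces or quotes.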
Rules:
- Use the built-in `image_gen` tool by default for all normal image generation and editing requests.
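The mode rules above can be sketched as a small decision function: the built-in tool is the default, and the CLI path is taken only on explicit request and only when `OPENAI_API_KEY` is available. The function name `choose_mode` and the returned labels are assumptions for illustration, not part of the skill itself.

```python
import os


def choose_mode(user_requested_cli, env=None):
    """Pick a top-level mode following the rules described above.

    Returns "built-in" unless the user explicitly asked for the CLI path.
    The labels and this helper are hypothetical, for illustration only.
    """
    env = os.environ if env is None else env
    if not user_requested_cli:
        # Default path: the built-in tool needs no API key.
        return "built-in"
    if not env.get("OPENAI_API_KEY"):
        # The fallback CLI requires the key for live API calls.
        raise RuntimeError("CLI mode requires OPENAI_API_KEY to be set")
    return "cli"
```

Passing `env` explicitly keeps the check easy to exercise without touching the real process environment.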