Image with ComfyUI
Call a local ComfyUI server to generate or edit images and to generate videos from images. Supported modes:
- T2I (Text → Image): Z-Image or SD3.5 Medium model
- I2I (Image → Image / Edit): Qwen Image Edit model
- I2V (Image → Video): Wan2.2 model
When to Use
- User asks to generate images from text (Chinese: 绘图/生图/画图/生成图片)
- User asks to edit an image (Chinese: 修图/改图/编辑图片/换装/换背景)
- User asks to generate a video from an image + text (Chinese: 图生视频/动画化/生成视频)
- User provides a description and wants visual output
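The trigger list above can be sketched as a small keyword router. This is a minimal illustration, not the skill's actual implementation: the names `MODE_MODELS`, `pick_mode`, and `has_image` are hypothetical, the keyword tuples are copied from the Chinese triggers listed above, and the check order (video first, then edit, then draw) and the rule that edit and video requests need an attached image are assumptions.

```python
from typing import Optional

# Mode -> candidate model names, copied from the mode list in this document.
MODE_MODELS = {
    "T2I": ("Z-Image", "SD3.5 Medium"),
    "I2I": ("Qwen Image Edit",),
    "I2V": ("Wan2.2",),
}

# Trigger keywords taken from the "When to Use" list.
VIDEO_KEYWORDS = ("图生视频", "动画化", "生成视频")
EDIT_KEYWORDS = ("修图", "改图", "编辑图片", "换装", "换背景")
DRAW_KEYWORDS = ("绘图", "生图", "画图", "生成图片")


def pick_mode(text: str, has_image: bool) -> Optional[str]:
    """Return "T2I", "I2I", "I2V", or None when no trigger applies.

    Assumption: I2I and I2V both require an attached image, so a video or
    edit keyword without an image yields None instead of falling back to T2I.
    """
    if any(keyword in text for keyword in VIDEO_KEYWORDS):
        return "I2V" if has_image else None
    if any(keyword in text for keyword in EDIT_KEYWORDS):
        return "I2I" if has_image else None
    if any(keyword in text for keyword in DRAW_KEYWORDS):
        return "T2I"
    return None
```

Plain substring matching is only a first pass. A free-form description with no keyword (the last bullet above) returns None here, so deciding whether the user wants visual output is left to the caller.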
Image-First Conversational Pattern (Image-First Mode)
Detection rules:
- User sends only an image (no text, no other message in the same turn)