
image-to-video-ai


From Frozen Frame to Living Scene in Seconds

Still images carry stories that movement can unlock. The image-to-video-ai skill inside ClawHub takes any uploaded photo or graphic and interprets your natural-language instructions to produce a coherent animated video — no timeline editors, no keyframe rigging, no render queues to manage manually.

The conversational editing model means you direct the transformation the same way you'd describe it to a colleague. Say 'make the clouds drift left and add a slow zoom toward the subject,' and the AI backend processes both the visual composition of your image and the semantic meaning of your request to produce motion that feels intentional, not mechanical.

Powering this workflow is the OpenClaw agent, which coordinates between your input image, the motion generation model, and the output rendering pipeline. The agent handles frame interpolation decisions, duration mapping, and format packaging so the final file arrives ready to publish. Whether you need a looping social clip or a longer animated sequence, the skill adapts to your described intent and delivers in your preferred format — mp4, mov, avi, webm, or mkv.

Environment Variables

| Variable | Required | Default |
| --- | --- | --- |
| `NEMO_TOKEN` | No | Auto-generated (100 free credits, expires in 7 days; revocable via Settings → API Tokens) |
| `NEMO_API_URL` | No | `https://mega-api-prod.nemovideo.ai` |
| `NEMO_WEB_URL` | No | `https://nemovideo.com` |
| `NEMO_CLIENT_ID` | No | Auto-generated UUID, persisted to `~/.config/nemovideo/client_id` (UUID only, no secrets) |
| `SKILL_SOURCE` | No | Auto-detected from install path; falls back to `unknown` |
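All of these variables are optional, so you only need to export the ones you want to override. A minimal sketch (the token value is a placeholder; the URLs are the documented defaults):

```shell
# Point the skill at the default API and web endpoints explicitly
export NEMO_API_URL="https://mega-api-prod.nemovideo.ai"
export NEMO_WEB_URL="https://nemovideo.com"

# Supply your own token to skip the auto-generated 7-day trial credential
export NEMO_TOKEN="your-token-here"
```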

If `NEMO_TOKEN` is not set, request one from the API; the request must include an `X-Client-Id` header.
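The token request above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the `/v1/tokens` endpoint path and the `token` response field are assumptions, while the `X-Client-Id` header and the `~/.config/nemovideo/client_id` persistence path come from the table above.

```python
import json
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://mega-api-prod.nemovideo.ai"  # NEMO_API_URL default


def load_client_id(path: Path = Path.home() / ".config/nemovideo/client_id") -> str:
    """Return the persisted client UUID, generating and saving one on first use."""
    if path.exists():
        return path.read_text().strip()
    client_id = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(client_id)
    return client_id


def request_token(client_id: str) -> str:
    """POST to the token endpoint with the required X-Client-Id header.

    The endpoint path and response shape here are hypothetical.
    """
    req = urllib.request.Request(
        f"{API_URL}/v1/tokens",
        data=b"{}",
        headers={"X-Client-Id": client_id, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["token"]
```

Persisting the client ID to disk mirrors the documented `NEMO_CLIENT_ID` behavior: the same UUID is reused on every run, so the auto-generated trial token stays tied to one client.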

Installs: 7
First seen: Apr 12, 2026