
image-to-video-ai


From Frozen Frame to Living Scene in Seconds

Still images carry stories that movement can unlock. The image-to-video-ai skill inside ClawHub takes any uploaded photo or graphic and interprets your natural-language instructions to produce a coherent animated video — no timeline editors, no keyframe rigging, no render queues to manage manually.

The conversational editing model means you direct the transformation the same way you'd describe it to a colleague. Say 'make the clouds drift left and add a slow zoom toward the subject,' and the AI backend processes both the visual composition of your image and the semantic meaning of your request to produce motion that feels intentional, not mechanical.

Powering this workflow is the OpenClaw agent, which coordinates between your input image, the motion generation model, and the output rendering pipeline. The agent handles frame interpolation decisions, duration mapping, and format packaging so the final file arrives ready to publish. Whether you need a looping social clip or a longer animated sequence, the skill adapts to your described intent and delivers in your preferred format — mp4, mov, avi, webm, or mkv.

Environment Variables

| Variable | Required | Default |
| --- | --- | --- |
| `NEMO_TOKEN` | No | Auto-generated (100 free credits, expires in 7 days; revocable via Settings → API Tokens) |
| `NEMO_API_URL` | No | `https://mega-api-prod.nemovideo.ai` |
| `NEMO_WEB_URL` | No | `https://nemovideo.com` |
| `NEMO_CLIENT_ID` | No | Auto-generated UUID, persisted to `~/.config/nemovideo/client_id` (UUID only, no secrets) |
| `SKILL_SOURCE` | No | Auto-detected from install path; falls back to `unknown` |
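All of these variables are optional, so you only need to export the ones you want to override. A minimal sketch (the token value is a placeholder; the URLs are the documented defaults):

```shell
# Point the skill at the default API and web endpoints explicitly
export NEMO_API_URL="https://mega-api-prod.nemovideo.ai"
export NEMO_WEB_URL="https://nemovideo.com"

# Supply your own token to skip the auto-generated 7-day trial credential
export NEMO_TOKEN="your-token-here"
```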

If `NEMO_TOKEN` is not set, request one from the API; the request must include an `X-Client-Id` header.
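The token request above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the `/v1/tokens` endpoint path and the `token` response field are assumptions, while the `X-Client-Id` header and the `~/.config/nemovideo/client_id` persistence path come from the table above.

```python
import json
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://mega-api-prod.nemovideo.ai"  # NEMO_API_URL default


def load_client_id(path: Path = Path.home() / ".config/nemovideo/client_id") -> str:
    """Return the persisted client UUID, generating and saving one on first use."""
    if path.exists():
        return path.read_text().strip()
    client_id = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(client_id)
    return client_id


def request_token(client_id: str) -> str:
    """POST to the token endpoint with the required X-Client-Id header.

    The endpoint path and response shape here are hypothetical.
    """
    req = urllib.request.Request(
        f"{API_URL}/v1/tokens",
        data=b"{}",
        headers={"X-Client-Id": client_id, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["token"]
```

Persisting the client ID to disk mirrors the documented `NEMO_CLIENT_ID` behavior: the same UUID is reused on every run, so the auto-generated trial token stays tied to one client.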

Installs: 7
First seen: Apr 12, 2026