ai-video-generation
Generate videos with 40+ AI models including Veo, Seedance, Wan, and Grok via inference.sh CLI.
- Supports text-to-video, image-to-video, avatar animation, lipsync, video upscaling, and foley sound generation across multiple model families
- Access 10+ text-to-video models (Veo 3.1, Seedance 1.5 Pro, Wan, Grok Video) and 5+ image-to-video variants optimized for speed, quality, or cost
- Includes avatar and lipsync tools (OmniHuman, Fabric, PixVerse) for talking-head and character animation workflows
- Utility models for video upscaling, sound effect generation, and multi-clip merging with transitions
Install the belt CLI skill:
npx skills add belt-sh/cli
AI Video Generation
Generate videos with 40+ AI models via inference.sh CLI.

Quick Start
Requires inference.sh CLI (
belt). Install instructions
belt login
# Generate a video with Veo
belt app run google/veo-3-1-fast --input '{"prompt": "drone shot flying over a forest"}'
More from inferen-sh/skills
ai-marketing-videos
Create AI marketing videos for ads, promos, product launches, and brand content. Models: Veo, Seedance, Wan, FLUX for visuals, Kokoro for voiceover. Types: product demos, testimonials, explainers, social ads, brand videos. Use for: Facebook ads, YouTube ads, product launches, brand awareness. Triggers: marketing video, ad video, promo video, commercial, brand video, product video, explainer video, ad creative, video ad, facebook ad video, youtube ad, instagram ad, tiktok ad, promotional video, launch video
0text-to-speech
Convert text to natural speech with Inworld TTS, ElevenLabs, DIA TTS, Kokoro, Chatterbox, and more via inference.sh CLI. Models: Inworld TTS-2 (100+ languages, emotion steering), Inworld TTS 1.5 (ultra-low latency), ElevenLabs (premium, 22+ voices, 32 languages), DIA TTS (conversational), Kokoro TTS, Chatterbox, Higgs Audio, VibeVoice (podcasts). Capabilities: text-to-speech, voice cloning, multi-speaker dialogue, podcast generation, expressive speech, emotion/delivery steering, character voices. Use for: voiceovers, audiobooks, podcasts, accessibility, video narration, IVR, voice assistants, gaming characters, avatar audio. Triggers: text to speech, tts, voice generation, ai voice, speech synthesis, voice over, generate speech, ai narrator, voice cloning, text to audio, elevenlabs, eleven labs, voice ai, ai voiceover, speech generator, natural voice, inworld, inworld tts, character voice, game voice, npc voice
0ai-podcast-creation
0competitor-teardown
0javascript-sdk
0data-visualization
0