ai-avatar-video
Generate talking head and avatar videos from images and audio using OmniHuman, Fabric, and PixVerse models.
- Four model options: OmniHuman 1.5 (multi-character), OmniHuman 1.0 (single character), Fabric 1.0 (image lipsync), and PixVerse Lipsync (highly realistic)
- Audio-driven workflow: pair portrait images with speech files to generate realistic avatar videos with synchronized lip movement
- Composable with text-to-speech and video transcription for end-to-end pipelines: generate speech, create avatar video, or dub existing videos in new languages
- Requires inference.sh CLI (infsh) and supports bash command execution through the skill interface
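The audio-driven workflow above pairs one portrait image with one speech file. The sketch below shows one way that pairing could be prepared as a JSON payload before it is handed to a model. It is only an illustration: the function name `build_avatar_input`, the field names `image_url` and `audio_url`, and the accepted file extensions are all assumptions, not the documented input schema of OmniHuman, Fabric, or PixVerse, and the code does not call the inference.sh CLI.

```python
import json
from pathlib import PurePosixPath

# Assumed extension lists for illustration; the real models may accept others.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a"}


def build_avatar_input(image_url: str, audio_url: str) -> str:
    """Pair one portrait image with one speech file as a JSON string.

    The keys "image_url" and "audio_url" are hypothetical field names.
    """
    image_ext = PurePosixPath(image_url).suffix.lower()
    audio_ext = PurePosixPath(audio_url).suffix.lower()
    if image_ext not in IMAGE_EXTS:
        raise ValueError(f"unsupported image type: {image_ext!r}")
    if audio_ext not in AUDIO_EXTS:
        raise ValueError(f"unsupported audio type: {audio_ext!r}")
    return json.dumps({"image_url": image_url, "audio_url": audio_url})


if __name__ == "__main__":
    print(build_avatar_input(
        "https://example.com/portrait.jpg",
        "https://example.com/speech.mp3",
    ))
```

Validating the file types up front keeps an obviously mismatched pair (for example, a text file passed as the portrait) from reaching the model at all.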
AI Avatar & Talking Head Videos
Create AI avatars and talking head videos via inference.sh CLI.

Quick Start
Requires inference.sh CLI (belt). Install instructions
belt login