video-translation
Translate video speech into another language with AI-generated dubbing that preserves original timing and emotion.
- Downloads videos and subtitles, translates subtitle text, then generates dubbed audio using TTS with voice cloning matched to the original speaker's tone
- Automatically aligns dubbed audio duration to original subtitle timestamps and preserves background audio outside speech segments
- Requires the youtube-downloader and tts skills as dependencies, plus ffmpeg and a Noiz API key for the TTS backend
- Works with any video platform that provides subtitles; source language auto-detection is supported, but long videos may require significant processing time
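The alignment of dubbed audio to subtitle timestamps mentioned above can be pictured with a small sketch. This is not the skill's actual implementation: the names `Cue`, `parse_srt`, and `tempo_factor` are hypothetical, and the policy shown (speed up a clip that overruns its subtitle slot, never slow it down, cap the speed-up at 1.5x) is an assumption chosen for illustration.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def _seconds(h, m, s, ms):
    """Convert SRT timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text):
    """Parse SRT text into a list of Cue objects (hypothetical helper)."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                fields = match.groups()
                # Everything after the timing line is the subtitle text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_seconds(*fields[:4]), _seconds(*fields[4:]), text))
                break
    return cues


def tempo_factor(clip_seconds, cue, max_speedup=1.5):
    """Playback-speed multiplier that fits a dubbed clip into its subtitle slot.

    Assumed policy: clips shorter than the slot keep their natural speed
    (factor 1.0), and longer clips are sped up by at most max_speedup.
    """
    slot = cue.end - cue.start
    if slot <= 0:
        raise ValueError("subtitle cue has a non-positive duration")
    return min(max(clip_seconds / slot, 1.0), max_speedup)


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:03,500\nHello there\n"
    cue = parse_srt(sample)[0]
    print(tempo_factor(3.0, cue))
```

A factor above 1.0 could then be passed to an audio time-stretching step, for example ffmpeg's `atempo` filter; whether the skill does it this way is not stated in the source.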
Video Translation
Translate a video's speech into another language, using TTS to generate the dubbed audio and replacing the original audio track.
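As a rough illustration of the final "replace the original audio track" step, the sketch below builds an ffmpeg command that keeps the video stream from one file and takes the audio from another. The function name `build_mux_command` and the file names are hypothetical, and this simple version swaps the whole track, so it does not model preserving background audio outside speech segments.

```python
import shlex


def build_mux_command(video_path, dubbed_audio_path, output_path):
    """Return an ffmpeg argument list that swaps in a dubbed audio track.

    Hypothetical helper: the video stream is copied without re-encoding
    and the dubbed audio is encoded as AAC.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", video_path,         # input 0: original video
        "-i", dubbed_audio_path,  # input 1: dubbed audio
        "-map", "0:v:0",          # take the first video stream from input 0
        "-map", "1:a:0",          # take the first audio stream from input 1
        "-c:v", "copy",           # copy video without re-encoding
        "-c:a", "aac",            # encode the dubbed audio as AAC
        "-shortest",              # stop at the end of the shorter stream
        output_path,
    ]


if __name__ == "__main__":
    cmd = build_mux_command("input.mp4", "dubbed.wav", "translated.mp4")
    # Print the command for inspection instead of running it.
    print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.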
Triggers
- translate this video
- dub this video to English
- translate the video from language X to language Y (把视频从 X 语译成 Y 语)
- video translation (视频翻译)
Use Cases
- The user wants to watch a foreign-language YouTube video but prefers to hear it in their native language.
- The user provides a video link and explicitly requests changing the audio language.
Workflow
When the user asks to translate a video: