tts

Summary

Text-to-speech with dual backends, voice cloning, and timeline-accurate audio synthesis for dubbing and video narration.

  • Supports two backends: Kokoro (local, offline) for simple speech synthesis, and Noiz (cloud) for voice cloning, emotion control, and precise segment timing
  • Simple mode converts text, files, or URLs to audio with optional voice cloning from reference audio; timeline mode aligns speech to SRT subtitles with per-segment voice and emotion control
  • Voice maps set the voice, language, speed, and emotion for individual segments or for ranges of segments
  • Guest mode provides 15 built-in voices without API authentication; full features require a Noiz API key
  • Supports multiple output formats (WAV, MP3, Opus, OGG) and integrates with Feishu, Telegram, and Discord
SKILL.md

tts

Convert any text into speech audio. Supports two backends (Kokoro local, Noiz cloud), two modes (simple or timeline-accurate), and per-segment voice control.
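As a rough illustration of what timeline mode and per-segment voice control involve, the sketch below parses SRT text into timed segments and resolves the settings for each one from a voice map. It is a minimal sketch, not the skill's implementation: the names Segment, parse_srt, and resolve_voice, and the map layout with "N" or "N-M" keys, are assumptions made for this example and may differ from the format the skill actually accepts.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """One subtitle cue (hypothetical type, for illustration only)."""
    index: int
    start: float  # seconds from the start of the timeline
    end: float    # seconds from the start of the timeline
    text: str


_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(srt: str) -> list[Segment]:
    """Split SRT text into cues; blocks with fewer than three lines are skipped."""
    segments = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue
        start, end = lines[1].split("-->")
        text = " ".join(line.strip() for line in lines[2:])
        segments.append(Segment(int(lines[0]), _seconds(start), _seconds(end), text))
    return segments


def resolve_voice(index: int, voice_map: dict, default: dict) -> dict:
    """Merge the defaults with every map entry whose key covers the index.

    Keys are "N" or "N-M" (inclusive); this layout is an assumption of the
    sketch. Entries are applied in insertion order, so later ones win.
    """
    settings = dict(default)
    for key, override in voice_map.items():
        low, _, high = key.partition("-")
        if int(low) <= index <= int(high or low):
            settings.update(override)
    return settings
```

Under these assumptions, each cue's slot length is end minus start, which is the duration the synthesized audio for that segment would need to fit, and resolve_voice(segment.index, voice_map, default) gives the settings to synthesize it with.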

Triggers

  • text to speech / tts / speak / say
  • voice clone / dubbing
  • epub to audio / srt to audio / convert to audio
  • 语音 (voice) / 说 (say) / 讲 (speak) / 说话 (talk)

Simple Mode — text to audio

speak is the default subcommand, so it can be omitted.

Installs: 3.6K
Repository: noizai/skills
GitHub Stars: 497
First Seen: Feb 28, 2026