Voice Clone Skill

A self-initializing, zero-configuration voice cloning skill. It manages a background TTS daemon that keeps the heavy model weights in memory, so inference stays fast across requests. It supports multiple TTS engines and places no limit on input text length.

Quick reference

| Item              | Value                                                                                      |
|-------------------|--------------------------------------------------------------------------------------------|
| Entry script      | bash scripts/run_tts.sh --text "..." --ref_audio "..." [--speed 1.0] [--output_dir "..."] |
| Output            | Single line: absolute path to generated .ogg file                                          |
| Attachment format | MEDIA:<output_path>                                                                        |
| Default engine    | F5-TTS (env TTS_BACKEND=f5)                                                                |
| Host/Port config  | .env (TTS_SERVER_HOST, TTS_SERVER_PORT)                                                    |
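
The quick reference above can be turned into a small calling wrapper. The following is a minimal sketch, not part of the skill itself: the helper names (build_tts_command, parse_output_path, media_attachment, synthesize) are hypothetical, and it assumes the script is run from the skill's root directory and prints only the output path on stdout, as the table describes.

```python
import subprocess
from pathlib import PurePosixPath
from typing import List, Optional


def build_tts_command(text: str, ref_audio: str,
                      speed: Optional[float] = None,
                      output_dir: Optional[str] = None) -> List[str]:
    """Build the argv list for the entry script (hypothetical helper)."""
    cmd = ["bash", "scripts/run_tts.sh", "--text", text, "--ref_audio", ref_audio]
    # --speed and --output_dir are optional flags in the quick reference.
    if speed is not None:
        cmd += ["--speed", str(speed)]
    if output_dir is not None:
        cmd += ["--output_dir", output_dir]
    return cmd


def parse_output_path(stdout: str) -> str:
    """Validate the single-line output: an absolute path to a .ogg file."""
    lines = [ln.strip() for ln in stdout.splitlines() if ln.strip()]
    if len(lines) != 1:
        raise ValueError(f"expected exactly one output line, got {len(lines)}")
    path = lines[0]
    if not PurePosixPath(path).is_absolute() or not path.endswith(".ogg"):
        raise ValueError(f"unexpected output: {path!r}")
    return path


def media_attachment(output_path: str) -> str:
    """Format the attachment line as MEDIA:<output_path>."""
    return f"MEDIA:{output_path}"


def synthesize(text: str, ref_audio: str, **kwargs) -> str:
    """Run the script and return the attachment line (assumes cwd is the skill root)."""
    cmd = build_tts_command(text, ref_audio, **kwargs)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return media_attachment(parse_output_path(result.stdout))
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the text contains quotes or newlines. The strict one-line check is a design choice of this sketch; if the script ever logs to stdout as well, the parser would need to be relaxed.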

When to use this skill

  • The user sends a voice memo or audio file and you need to reply with audio.
  • The user says something like "read this aloud", "speak to me", "use my voice", or "voice mode".
  • The conversation context implies a spoken reply is expected.
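
The first two conditions above can be approximated with a simple check. This is only an illustrative sketch: the function name wants_spoken_reply and the has_audio_attachment flag are hypothetical, and substring matching is an assumed heuristic. The third condition (context implying a spoken reply) needs conversational judgment and is not covered here.

```python
# Trigger phrases taken from the list above; matching is case-insensitive.
TRIGGER_PHRASES = ("read this aloud", "speak to me", "use my voice", "voice mode")


def wants_spoken_reply(message_text: str, has_audio_attachment: bool = False) -> bool:
    """Return True when the message suggests an audio reply (hypothetical heuristic)."""
    # A voice memo or audio file from the user is treated as a request for audio.
    if has_audio_attachment:
        return True
    lowered = message_text.lower()
    return any(phrase in lowered for phrase in TRIGGER_PHRASES)
```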
First Seen
Apr 4, 2026