speak-tts

Summary

Real-time text-to-speech with voice cloning on Apple Silicon, entirely on-device.

  • Supports multiple input sources (text files, markdown, stdin, web articles, PDFs) and output modes (streaming, file save, playback, or both)
  • Voice cloning from 10–30 second WAV samples at 24000 Hz mono; includes emotion tags like [laugh], [sigh], and [gasp] for audible effects
  • Batch processing with auto-chunking for long documents, concatenation utilities, and resume capability for interrupted generations
  • Requires Apple Silicon Mac, macOS 12.0+, and command-line tools (sox, ffmpeg, poppler); runs entirely locally via MLX with no API keys
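The voice-sample requirements above (10–30 seconds, 24000 Hz, mono WAV) can be checked, and a recording converted to match them, with the sox and ffmpeg tools the skill already requires. The following is a minimal sketch and not part of speak-tts itself: the function names are hypothetical, and it assumes `soxi` (installed with sox) and `ffmpeg` are on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: validate or prepare a voice-cloning sample.
# All helper names here are hypothetical, not speak-tts commands.

# sample_ok RATE CHANNELS DURATION_SECONDS
# Succeeds only for 24000 Hz, mono, 10-30 s, matching the stated requirements.
sample_ok() {
  local rate=$1 channels=$2 duration=$3
  [ "$rate" -eq 24000 ] || return 1
  [ "$channels" -eq 1 ] || return 1
  # Duration may be fractional, so compare it with awk rather than [ ].
  awk -v d="$duration" 'BEGIN { exit !(d >= 10 && d <= 30) }'
}

# check_sample FILE: read the file's properties with soxi and validate them.
check_sample() {
  sample_ok "$(soxi -r "$1")" "$(soxi -c "$1")" "$(soxi -D "$1")"
}

# to_sample INPUT OUTPUT: resample to 24000 Hz, downmix to mono, keep at most 30 s.
to_sample() {
  ffmpeg -i "$1" -ar 24000 -ac 1 -t 30 "$2"
}
```

With placeholder file names, `check_sample voice.wav || to_sample recording.m4a voice.wav` would convert a recording only when the existing sample does not already meet the requirements. Note that `-t 30` trims long recordings but cannot lengthen one shorter than 10 seconds.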
SKILL.md

speak - Talk to your Claude!

Give your agent the ability to speak to you in real time. Local text-to-speech, voice cloning, and audio generation on Apple Silicon.

Prerequisites

| Requirement | Check | Install |
|---|---|---|
| Apple Silicon Mac | uname -m → arm64 | Intel not supported |
| macOS 12.0+ | sw_vers | - |
| sox | which sox | brew install sox |
| ffmpeg | which ffmpeg | brew install ffmpeg |
| poppler (PDF) | which pdftotext | brew install poppler |
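The checks listed above can be run in one pass. This is a minimal sketch under stated assumptions, not a script shipped with the skill: the function names are hypothetical, `command -v` is used in place of `which` as a portable equivalent, and the macOS test compares only the major version number reported by `sw_vers -productVersion`.

```shell
#!/usr/bin/env bash
# Sketch: report which prerequisites are satisfied.
# Function names are hypothetical, not part of speak-tts.

# macos_ok VERSION: succeeds when the major version is 12 or higher.
macos_ok() {
  local major=${1%%.*}
  [ "$major" -ge 12 ] 2>/dev/null
}

# report NAME STATUS: print one aligned line per requirement.
report() {
  printf '%-18s %s\n' "$1" "$2"
}

check_prereqs() {
  local missing=0

  if [ "$(uname -m)" = "arm64" ]; then
    report "Apple Silicon" "ok"
  else
    report "Apple Silicon" "missing (Intel not supported)"; missing=1
  fi

  # sw_vers exists only on macOS, so guard the call.
  if command -v sw_vers >/dev/null 2>&1 && macos_ok "$(sw_vers -productVersion)"; then
    report "macOS 12.0+" "ok"
  else
    report "macOS 12.0+" "missing"; missing=1
  fi

  # Each pair is the tool to look for, then the Homebrew formula providing it.
  local pair tool formula
  for pair in sox:sox ffmpeg:ffmpeg pdftotext:poppler; do
    tool=${pair%%:*}; formula=${pair##*:}
    if command -v "$tool" >/dev/null 2>&1; then
      report "$tool" "ok"
    else
      report "$tool" "missing (brew install $formula)"; missing=1
    fi
  done

  return "$missing"
}
```

Calling `check_prereqs` prints one line per requirement and returns a non-zero status if anything is missing, so it can gate a setup step, for example `check_prereqs || exit 1`.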

Input Sources

Installs: 718
Repository: emzod/speak
GitHub Stars: 6
First Seen: Jan 27, 2026