audify

Clean a readable resource into narration-safe prose, synthesize it with Gemini 3.1 Flash TTS, and write a timestamped output folder that contains an MP3, the cleaned transcript, and a manifest.

Verified against the Gemini API speech generation docs updated April 15, 2026, and the Google Cloud blog post published April 16, 2026.
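As a rough illustration of the output folder described above, the following sketch writes the three artifacts into a new timestamped directory. It is not the code in scripts/audify.py. The function name write_output_folder, the file names audio.mp3, transcript.txt, and manifest.json, the manifest fields, and the timestamp format are all assumptions made for this example. The MP3 bytes are taken as an argument, so no TTS call is shown.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_output_folder(base_dir, transcript, mp3_bytes, source, voice="Kore"):
    """Write audio, transcript, and manifest into a new timestamped folder.

    Hypothetical helper: names and layout are illustrative only.
    """
    created = datetime.now(timezone.utc)
    # Folder name such as 20260416T101500Z (assumed format).
    out_dir = Path(base_dir) / created.strftime("%Y%m%dT%H%M%SZ")
    # Fail rather than overwrite an earlier run with the same timestamp.
    out_dir.mkdir(parents=True, exist_ok=False)

    (out_dir / "audio.mp3").write_bytes(mp3_bytes)
    (out_dir / "transcript.txt").write_text(transcript, encoding="utf-8")

    # The manifest records what was narrated and which files were produced.
    manifest = {
        "source": source,
        "voice": voice,
        "created_utc": created.isoformat(),
        "files": {"audio": "audio.mp3", "transcript": "transcript.txt"},
    }
    (out_dir / "manifest.json").write_text(
        json.dumps(manifest, indent=2), encoding="utf-8"
    )
    return out_dir
```

Keeping the manifest as plain JSON next to the audio makes each run self-describing, so a folder can be moved or archived without losing track of its source and voice.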

Decision Tree

What kind of input are you handling?

A URL, markdown file, HTML file, DOCX, plain text file, or raw text that should be listened to
- Run python3 scripts/audify.py ...
A resource that is mostly code, logs, tables, minified JSON, or other content that is not meant to be narrated
- Bail instead of forcing TTS. Explain why it is not a good narration target.
A request where voice or nuance is materially ambiguous
- Ask one short question before synthesis.
- Use: "Which voice and delivery should I use? If you do not care, I will use Kore with a clear neutral narrator style."
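The second branch of the tree (bail on content that is not meant to be narrated) can be approximated with a simple line-level heuristic. The sketch below is one possible check, not the logic used by scripts/audify.py. The function names narration_score and should_narrate, the four-word minimum, the 0.8 letter ratio, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import re


def narration_score(text):
    """Return the fraction of non-empty lines that look like prose."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    prose = 0
    for ln in lines:
        words = re.findall(r"[A-Za-z']+", ln)
        # Share of characters that are letters or spaces; code, logs,
        # and minified JSON are dense in digits and punctuation.
        letters = sum(c.isalpha() or c.isspace() for c in ln)
        if len(words) >= 4 and letters / len(ln) >= 0.8:
            prose += 1
    return prose / len(lines)


def should_narrate(text, threshold=0.5):
    """True when at least `threshold` of the lines read like prose."""
    return narration_score(text) >= threshold
```

A caller could run should_narrate on the cleaned text and, when it returns False, skip synthesis and explain why the resource is a poor narration target. The thresholds would need tuning against real inputs.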