speech-use
Speech Use
Use this skill to perform Text-to-Speech (TTS), Speech-to-Text (STT), and Voice Cloning operations.
This skill uses portable Python scripts managed by uv.
Prerequisites
- Environment Variables:
  - GOOGLE_API_KEY (for TTS via Gemini)
  - GOOGLE_CLOUD_PROJECT (required for STT and Voice Cloning)
  - GOOGLE_APPLICATION_CREDENTIALS (recommended for STT/Voice Cloning)
- APIs Enabled:
  - Text-to-Speech API (texttospeech.googleapis.com)
  - Speech-to-Text API (speech.googleapis.com)
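Before running any script, it can help to confirm that the environment variables listed above are set. The following is a minimal sketch, not part of the skill itself: the helper name `missing_vars` and the operation keys (`tts`, `stt`, `voice-cloning`) are hypothetical labels chosen for illustration, and treating GOOGLE_API_KEY as required for TTS is an assumption based on the list above.

```python
import os

# Variable names are taken from the Prerequisites list. The operation keys
# and the required/recommended split for TTS are assumptions of this sketch.
REQUIRED_VARS = {
    "tts": ["GOOGLE_API_KEY"],
    "stt": ["GOOGLE_CLOUD_PROJECT"],
    "voice-cloning": ["GOOGLE_CLOUD_PROJECT"],
}
RECOMMENDED_VARS = {
    "stt": ["GOOGLE_APPLICATION_CREDENTIALS"],
    "voice-cloning": ["GOOGLE_APPLICATION_CREDENTIALS"],
}


def missing_vars(operation, env=None):
    """Return (missing_required, missing_recommended) for an operation.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real settings.
    """
    env = os.environ if env is None else env
    required = [name for name in REQUIRED_VARS[operation] if not env.get(name)]
    recommended = [
        name for name in RECOMMENDED_VARS.get(operation, []) if not env.get(name)
    ]
    return required, recommended


if __name__ == "__main__":
    for op in REQUIRED_VARS:
        required, recommended = missing_vars(op)
        if required:
            print(f"{op}: missing required variables: {', '.join(required)}")
        elif recommended:
            print(f"{op}: ready, but consider setting: {', '.join(recommended)}")
        else:
            print(f"{op}: ready")
```

An empty or unset variable is treated as missing, so a stray `export GOOGLE_API_KEY=` does not pass the check. The check only inspects the environment; it does not verify that the two APIs are enabled on the project.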
Usage