speech-engine
ElevenLabs Speech Engine
Add a real-time voice interface to a custom agent. ElevenLabs handles microphone audio, speech-to-text, turn-taking, text-to-speech, and browser playback; your server exposes a Speech Engine WebSocket endpoint and streams response text back.
Setup: See Installation Guide. For JavaScript, use @elevenlabs/* packages only. For deeper SDK details, read JavaScript SDK Reference or Python SDK Reference.
When to Use
Use Speech Engine when the user wants to:
- Add voice to an existing chat app or custom server pipeline
- Add voice to OpenClaw, Hermes, or a similar agent runtime while keeping agent logic on the developer-owned server
- Build a developer-hosted WebSocket server for ElevenLabs voice conversations
- Stream response text back as spoken audio after your server validates user intent
- Handle user interruptions while a response is still streaming
- Build a browser client with @elevenlabs/react or @elevenlabs/client using a server-issued conversation token
Use the agents skill instead when the user is creating or configuring a hosted ElevenLabs Conversational AI agent with platform-managed prompts, tools, workflows, phone numbers, or widgets.
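To make the interruption case from the list above concrete, here is a minimal sketch of the server-side pattern: forward response text chunk by chunk, and stop as soon as an interruption is signalled. It deliberately does not use the ElevenLabs SDK or the Speech Engine wire protocol. `fake_agent_reply`, `stream_reply`, the `send` callback, and the `interrupted` event are all hypothetical names chosen for illustration; in a real server, `send` would write to the WebSocket connection and `interrupted` would be set by whatever interruption signal the SDK delivers.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List, Tuple


async def fake_agent_reply(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for your agent: yields response text in chunks."""
    for word in f"You said: {prompt}".split():
        await asyncio.sleep(0.01)  # simulate model latency between chunks
        yield word + " "


async def stream_reply(
    prompt: str,
    send: Callable[[str], Awaitable[None]],
    interrupted: asyncio.Event,
) -> bool:
    """Forward chunks through `send` until the reply ends or the user interrupts.

    Returns True if the whole reply was sent, False if it was cut short.
    """
    async for chunk in fake_agent_reply(prompt):
        if interrupted.is_set():
            # The user started talking: stop sending text for this turn.
            return False
        await send(chunk)
    return True


async def demo() -> Tuple[bool, List[str]]:
    """Simulate a user interrupting after two chunks have been sent."""
    sent: List[str] = []
    interrupted = asyncio.Event()

    async def send(chunk: str) -> None:
        # Assumption: a real server would write the chunk to the socket here.
        sent.append(chunk)
        if len(sent) == 2:
            interrupted.set()  # pretend an interruption arrives now

    completed = await stream_reply("hello there friend", send, interrupted)
    return completed, sent


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Checking the flag before every send keeps the cut-off latency to at most one chunk. If your agent call is itself slow between chunks, cancelling the task that runs `stream_reply` may be a better fit than polling an event.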