speech-engine

ElevenLabs Speech Engine

Add a real-time voice interface to a custom agent. ElevenLabs handles microphone audio, speech-to-text, turn-taking, text-to-speech, and browser playback; your server exposes a Speech Engine WebSocket endpoint and streams response text back.
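The division of labor described above can be sketched without any ElevenLabs-specific code. This is a minimal illustration, assuming the server receives a finished user transcript and has some way to send text chunks back over the WebSocket. The names `generate_reply`, `handle_user_turn`, and `send_text` are hypothetical and are not part of any ElevenLabs SDK; the real message format is in the SDK references.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def generate_reply(transcript: str) -> AsyncIterator[str]:
    """Hypothetical agent logic: yields the reply text in small chunks."""
    for word in f"You said: {transcript}".split():
        await asyncio.sleep(0)  # stand-in for model or tool latency
        yield word + " "


async def handle_user_turn(
    transcript: str,
    send_text: Callable[[str], Awaitable[None]],
) -> None:
    """Forward each chunk as soon as it is ready instead of buffering the reply.

    `send_text` is an assumed callback that would wrap the WebSocket send.
    """
    async for chunk in generate_reply(transcript):
        await send_text(chunk)


async def demo() -> list[str]:
    """Run one turn against an in-memory sink instead of a real socket."""
    sent: list[str] = []

    async def send_text(chunk: str) -> None:
        sent.append(chunk)

    await handle_user_turn("hello there", send_text)
    return sent


if __name__ == "__main__":
    print("".join(asyncio.run(demo())))
```

Sending chunks as they are produced, rather than waiting for the full reply, is what would let speech begin before the agent has finished generating.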

Setup: See the Installation Guide. For JavaScript, use only @elevenlabs/* packages. For deeper SDK details, read the JavaScript SDK Reference or the Python SDK Reference.

When to Use

Use Speech Engine when the user wants to:

  • Add voice to an existing chat app or custom server pipeline
  • Add voice to OpenClaw, Hermes, or a similar agent runtime while keeping agent logic on a developer-owned server
  • Build a developer-hosted WebSocket server for ElevenLabs voice conversations
  • Stream response text back as spoken audio after your server validates user intent
  • Handle user interruptions while a response is still streaming
  • Build a browser client with @elevenlabs/react or @elevenlabs/client using a server-issued conversation token
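Interruption handling, mentioned in the list above, usually means stopping an in-flight reply as soon as the user starts speaking again. The sketch below shows one common way to do that with `asyncio` task cancellation. It is an assumption about how a server might be structured, not the Speech Engine protocol: `stream_reply`, `send_text`, and the point at which the interruption arrives are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


async def stream_reply(
    chunks: list[str],
    send_text: Callable[[str], Awaitable[None]],
) -> None:
    """Send reply chunks one at a time, yielding control between chunks."""
    for chunk in chunks:
        await send_text(chunk)
        await asyncio.sleep(0)  # yield point where a cancellation can land


async def demo_interrupt() -> list[str]:
    """Start a reply, then cancel it once the second chunk has gone out."""
    sent: list[str] = []
    second_sent = asyncio.Event()

    async def send_text(chunk: str) -> None:
        sent.append(chunk)
        if len(sent) == 2:
            second_sent.set()  # simulate the user interrupting at this point

    reply = asyncio.create_task(
        stream_reply(["one ", "two ", "three ", "four "], send_text)
    )
    await second_sent.wait()
    reply.cancel()  # hypothetical reaction to an interruption signal
    try:
        await reply
    except asyncio.CancelledError:
        pass  # expected: the reply was cut short on purpose
    return sent


if __name__ == "__main__":
    print(asyncio.run(demo_interrupt()))
```

Running each reply as its own task keeps the cancellation local to that reply, so the connection handler stays alive to process the next user turn.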

Use the agents skill instead when the user is creating or configuring a hosted ElevenLabs Conversational AI agent with platform-managed prompts, tools, workflows, phone numbers, or widgets.
