# deepgram-js-speech-to-text

Using Deepgram Speech-to-Text (JavaScript / TypeScript SDK)

Basic transcription for prerecorded audio (REST) or live audio (WebSocket) via `/v1/listen`.
## When to use this product
- REST (`client.listen.v1.media.transcribeUrl` / `transcribeFile`) — one-shot transcription of a finished URL or file. Good for batch jobs, caption generation, offline processing.
- WebSocket (`client.listen.v1.createConnection()` / `connect()`) — continuous streaming transcription. Good for live captions, microphone audio, telephony streams, browser or Node realtime apps.
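The REST path above can be sketched as a small helper. This is a sketch, not a verbatim SDK call: the `(source, options)` argument split, the `nova-3` model name, and `smart_format` are illustrative assumptions — only the method path `client.listen.v1.media.transcribeUrl` comes from this doc, so check your installed SDK version's signatures.

```javascript
// Sketch only: assumes transcribeUrl takes (source, options). The model name
// "nova-3" and smart_format are illustrative assumptions, not confirmed here.
async function transcribeFromUrl(client, url) {
  const response = await client.listen.v1.media.transcribeUrl(
    { url },                                  // audio source by URL
    { model: "nova-3", smart_format: true }   // hypothetical option shape
  );
  return response; // transcription result from /v1/listen
}
```

Because the helper takes the client as a parameter, it works the same whether the client was built for batch scripts or server handlers, and it is easy to stub in tests.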
## Use a different skill when

- You also want summaries, topics, intents, sentiment, language detection, or redaction guidance on the same `/v1/listen` call → `deepgram-js-audio-intelligence`.
- You need Flux turn-taking and end-of-turn events on `/v2/listen` → `deepgram-js-conversational-stt`.
- You need a full interactive assistant with STT + LLM + TTS over one socket → `deepgram-js-voice-agent`.
## Authentication

```js
require("dotenv").config();
```
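After `dotenv` loads the `.env` file, the key still has to be read and validated before constructing a client. A minimal sketch, assuming the conventional `DEEPGRAM_API_KEY` variable name (the doc above does not state it):

```javascript
// Minimal sketch: read the API key from the environment after dotenv has run.
// DEEPGRAM_API_KEY is the conventional variable name and is an assumption here.
function getDeepgramApiKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY;
  if (!key) {
    // Fail fast with a clear message instead of a later 401 from the API.
    throw new Error("DEEPGRAM_API_KEY is not set; add it to your .env file");
  }
  return key;
}
```

Validating up front gives a descriptive startup error rather than an authentication failure on the first `/v1/listen` request.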