faster-whisper

Faster Whisper

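Local speech-to-text using faster-whisper, a CTranslate2 reimplementation of OpenAI's Whisper. The upstream project reports transcription up to 4x faster than the reference openai/whisper implementation at the same accuracy, with lower memory use. With GPU acceleration, expect roughly 20x realtime transcription (a 10-minute audio file in about 30 seconds), depending on model size and hardware.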
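
As a rough illustration of the description above, the sketch below pairs the throughput arithmetic with a minimal transcription call. It assumes the `faster-whisper` package is installed. The function names `estimate_transcription_seconds` and `transcribe_file`, along with the default model size, device, and compute type, are illustrative choices and not part of this skill.

```python
def estimate_transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Rough wall-clock estimate: audio duration divided by the realtime factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


def transcribe_file(path: str, model_size: str = "small",
                    device: str = "cpu", compute_type: str = "int8"):
    """Transcribe one file and return (detected_language, full_text).

    The defaults are assumptions for a CPU-only machine; on a GPU one would
    typically pass device="cuda" and compute_type="float16".
    """
    # Imported lazily so the estimate helper works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device, compute_type=compute_type)
    segments, info = model.transcribe(path, beam_size=5)
    # `segments` is a generator: decoding happens as it is consumed.
    text = " ".join(segment.text.strip() for segment in segments)
    return info.language, text


if __name__ == "__main__":
    # 10 minutes of audio at ~20x realtime
    print(estimate_transcription_seconds(10 * 60, 20))
```

The first call downloads the model weights, so offline use only works once that download has completed.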

When to Use

Use this skill when you need to:

  • Transcribe audio/video files — meetings, interviews, podcasts, lectures, YouTube videos
  • Convert speech to text locally — no API costs, works offline (after model download)
  • Batch process multiple audio files — efficient for large collections
  • Generate subtitles/captions — word-level timestamps available
  • Transcribe multilingual audio: supports 99+ languages with auto-detection
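
The subtitle and batch use cases in the list above can be sketched as follows. This is a minimal example under stated assumptions: `Cue`, `srt_timestamp`, `to_srt`, and `batch_to_srt` are hypothetical helper names, the output is plain SubRip (SRT) text, and the model passed in is expected to be a loaded `faster_whisper.WhisperModel` whose `transcribe` call yields segments with `start`, `end`, and `text` attributes.

```python
from pathlib import Path
from typing import Iterable, NamedTuple


class Cue(NamedTuple):
    """Stand-in for a transcription segment: times in seconds plus text."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: Iterable) -> str:
    """Render any objects with .start, .end, .text as numbered SRT blocks."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}\n"
            f"{cue.text.strip()}\n"
        )
    return "\n".join(blocks)


def batch_to_srt(audio_paths: Iterable[str], model) -> list:
    """Write one .srt next to each input file, reusing a single loaded model."""
    written = []
    for audio_path in audio_paths:
        # Language is auto-detected when no `language` argument is given.
        segments, info = model.transcribe(audio_path)
        out_path = Path(audio_path).with_suffix(".srt")
        out_path.write_text(to_srt(segments), encoding="utf-8")
        written.append((str(out_path), info.language))
    return written
```

Loading the model once and reusing it across files avoids paying the load cost per file. For word-level timing, `transcribe` also accepts `word_timestamps=True`, which populates a `words` list on each segment.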

Trigger phrases: "transcribe this audio", "convert speech to text", "what did they say", "make a transcript", "audio to text", "subtitle this video"

When NOT to use:

  • Real-time/streaming transcription (use streaming-optimized tools instead)
  • Cloud-only environments without local compute
  • Very short clips (under 10 seconds) where the latency of a cloud API call is acceptable

First Seen
Mar 24, 2026