faster-whisper
Summary
Local speech-to-text 4–6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription.
- Supports SRT, VTT, ASS, LRC, TTML, CSV, JSON, and HTML output formats; multi-format output in one pass with `--format srt,text`
- Speaker diarization identifies who spoke when; `--speaker-names` maps speakers to real names; `--export-speakers` saves each speaker's audio separately
- Batch processing with glob patterns, directories, and URLs (YouTube, direct links); automatic ETA, skip-existing resume, and per-file language override via `--language-map`
- Advanced features: transcript search with fuzzy matching, automatic chapter detection from silence gaps, filler word removal, stereo channel selection, paragraph detection, and subtitle burn-in to video
- Distilled models (distil-large-v3.5 default) deliver ~6x speedup with <1% accuracy loss; supports 99+ languages with auto-detection and translation to English
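As a sketch of what the SRT output above involves, the snippet below renders `(start, end, text)` segments as SRT subtitle blocks using only the standard library. The sample segments are invented for illustration; in a real run they would come from the transcriber.

```python
# Sketch: render transcription segments as SRT subtitle blocks.
# The sample segments below are made-up data, not real transcriber output.

def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (the SRT timestamp format)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """segments: iterable of (start, end, text) tuples, in order."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)

sample = [(0.0, 2.5, "Hello there."), (2.5, 5.0, "Welcome to the show.")]
print(to_srt(sample))
```

Other text formats (VTT, LRC, CSV) differ mainly in timestamp notation and block layout, which is why one pass over the segments can emit several formats at once.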
SKILL.md
Faster Whisper
Local speech-to-text using faster-whisper — a CTranslate2 reimplementation of OpenAI's Whisper that runs 4-6x faster with identical accuracy. With GPU acceleration, expect ~20x realtime transcription (a 10-minute audio file in ~30 seconds).
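For orientation, the underlying library's Python API looks roughly like this; the model name, device settings, and audio path are illustrative choices, so consult the faster-whisper documentation for current options (it will not run without the package installed and a model downloaded):

```python
from faster_whisper import WhisperModel

# "distil-large-v3" and the device/compute settings are assumptions;
# pick what matches your hardware (e.g. device="cpu", compute_type="int8").
model = WhisperModel("distil-large-v3", device="cuda", compute_type="float16")

# transcribe() returns a lazy segment iterator plus detected-language info;
# transcription happens as you consume the iterator.
segments, info = model.transcribe("audio.mp3", beam_size=5)

print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for seg in segments:
    print(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text}")
```

The speed claims above come from CTranslate2's optimized inference (quantization, fused kernels), not from a different model, which is why accuracy matches the original Whisper weights.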
When to Use
Use this skill when you need to: