whisper - Local Speech-to-Text & Subtitles

Local Processing: Emphasize that transcription runs entirely on the local machine and that no audio is uploaded.
Model Selection: Allow users to choose among model sizes (tiny, base, small, medium, large) to trade speed against accuracy.
File Integrity: Ensure input audio files exist and are readable before transcription starts.
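
The model selection and file checks above could be sketched as a small pre-flight step. This is a minimal illustration, not the module's actual code: the helper name validate_request and the constant MODEL_SIZES are hypothetical, and only the five size names come from the list above.

```python
import os

# Model sizes named in the guidelines above, ordered from fastest to most accurate.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def validate_request(model_size: str, audio_path: str) -> str:
    """Hypothetical helper: validate inputs and return the normalized model size."""
    size = model_size.strip().lower()
    if size not in MODEL_SIZES:
        raise ValueError(
            f"unknown model size {model_size!r}; expected one of {MODEL_SIZES}"
        )
    # Fail early with a clear message instead of letting the transcriber crash later.
    if not os.path.isfile(audio_path):
        raise FileNotFoundError(f"audio file not found: {audio_path}")
    if not os.access(audio_path, os.R_OK):
        raise PermissionError(f"audio file is not readable: {audio_path}")
    return size
```

Running the checks before any model is loaded keeps error messages specific to the user's input rather than surfacing as a failure deep inside transcription.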

The whisper module provides high-performance local speech recognition built on whisper.cpp. It handles everything from model management to merging subtitles into video.
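
As an illustration of the subtitle side, the sketch below turns timed transcript segments into SubRip (SRT) text. It assumes SRT as the subtitle format and (start, end, text) tuples in seconds as the segment shape; neither is stated by this document, and the function names are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a non-negative time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    if seconds < 0:
        raise ValueError("timestamps must be non-negative")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(cues)
```

The resulting text can be written to a .srt file and passed to whichever tool performs the actual merge into the video.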