whisper - Local Speech-to-Text & Subtitles
The whisper module provides high-performance local speech recognition built on whisper.cpp. It covers the full workflow, from model management and transcription to merging subtitles into video.
When to Activate
- When the user wants to transcribe an audio file into text.
- When generating `.srt` subtitle files from audio/video.
- When merging generated subtitles into a video file.
- When performing real-time speech-to-text using LiveKit or Streaming.
Core Principles & Rules
- Local Processing: Emphasize that transcription happens locally without uploading data.
- Model Selection: Allow users to choose from different model sizes (tiny, base, small, medium, large) for speed vs. accuracy.
- File Integrity: Ensure input audio files are accessible.
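
The model-selection and file-integrity rules above could be enforced with a small pre-flight check before any transcription starts. The following is a minimal sketch, not part of the whisper module itself: `choose_model` and `check_audio_file` are hypothetical helper names, and the `base` default is an assumption chosen only for illustration.

```python
import os
from pathlib import Path

# Model sizes named in this document, ordered from fastest to most accurate.
MODEL_SIZES = ["tiny", "base", "small", "medium", "large"]


def choose_model(requested: str = "base") -> str:
    """Validate a user-requested model size (hypothetical helper).

    The "base" default is an assumption, not a documented default.
    """
    name = requested.strip().lower()
    if name not in MODEL_SIZES:
        raise ValueError(
            f"Unknown model size {requested!r}; expected one of {MODEL_SIZES}"
        )
    return name


def check_audio_file(path) -> Path:
    """Confirm the input file exists and is readable (hypothetical helper)."""
    audio = Path(path)
    if not audio.is_file():
        raise FileNotFoundError(f"Input audio not found: {audio}")
    if not os.access(audio, os.R_OK):
        raise PermissionError(f"Input audio is not readable: {audio}")
    return audio
```

Failing early with a clear error keeps the user from waiting on a model load only to discover that the input path was wrong.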
Additional Scenarios
- SRT Generation: Use `dictate --srt` to create industry-standard subtitle files.
- Video Integration: Use `merge` to embed subtitles into a video stream.
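
For reference, an SRT file is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range followed by the cue text and a blank line. The sketch below illustrates that layout only; it is not how `dictate --srt` is implemented, and the `(start, end, text)` segment tuples and both function names are assumptions made for the example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start_seconds, end_seconds, text) tuples as SRT text.

    The segment shape is assumed for illustration; cues are numbered from 1
    and separated by a blank line, as the SRT format requires.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "world")]))
```

Note that SRT uses a comma, not a period, before the milliseconds; some players reject files that get this wrong.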