ez-stt
ez-stt - Local Speech-to-Text
Unified local speech-to-text using ONNX Runtime with int8 quantization. Choose your backend:
- Parakeet (default): Best accuracy for English, correctly captures names and filler words
- Whisper: Fastest inference, supports 99 languages
Requires ffmpeg to be installed.
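A quick way to confirm the ffmpeg requirement before transcribing is sketched below. It assumes ffmpeg is expected on your PATH under the name `ffmpeg`. The helper name `ffmpeg_available` is hypothetical and not part of ez-stt.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an executable named `ffmpeg` is found on PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found; install it before running scripts/stt.py")
```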
Usage
# Default: Parakeet v2 (best English accuracy)
scripts/stt.py audio.ogg
# Explicit backend selection
scripts/stt.py audio.ogg -b whisper
scripts/stt.py audio.ogg -b parakeet -m v3
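If you want to call the script from Python rather than a shell, the commands above could be wrapped roughly as follows. This is a sketch under assumptions: the helper names `build_stt_command` and `transcribe` are hypothetical, the script is assumed to be executable from the current directory exactly as in the examples above, and it is assumed (not confirmed by this document) that the transcript is written to standard output. Only the `-b` and `-m` flags shown above are used.

```python
import subprocess
from typing import List, Optional

# Script path exactly as it appears in the usage examples.
STT_SCRIPT = "scripts/stt.py"


def build_stt_command(
    audio: str,
    backend: Optional[str] = None,
    model: Optional[str] = None,
) -> List[str]:
    """Build the argument list for one transcription run."""
    cmd = [STT_SCRIPT, audio]
    if backend is not None:
        cmd += ["-b", backend]  # e.g. "whisper" or "parakeet"
    if model is not None:
        cmd += ["-m", model]  # e.g. "v3"
    return cmd


def transcribe(
    audio: str,
    backend: Optional[str] = None,
    model: Optional[str] = None,
) -> str:
    """Run the script and return whatever it printed to stdout.

    Assumption: the transcript is written to stdout.
    """
    result = subprocess.run(
        build_stt_command(audio, backend, model),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

For example, `transcribe("audio.ogg", backend="whisper")` would run the same command as the second usage line above. Building the argument list in a separate function keeps the command easy to inspect or log before anything is executed.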