ez-stt

ez-stt - Local Speech-to-Text

Unified local speech-to-text using ONNX Runtime with int8 quantization. Choose your backend:

  • Parakeet (default): Best accuracy for English, correctly captures names and filler words
  • Whisper: Fastest inference, supports 99 languages

Requires ffmpeg to be installed.
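
A quick pre-flight check can confirm the ffmpeg requirement before transcribing. The sketch below uses only the Python standard library; the helper name `ffmpeg_available` is hypothetical and not part of ez-stt, and it assumes ffmpeg is expected to be discoverable on PATH.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on PATH.

    Hypothetical helper for illustration; ez-stt does not ship it.
    """
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found; install it before running scripts/stt.py")
```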

Usage

# Default: Parakeet v2 (best English accuracy)
scripts/stt.py audio.ogg

# Explicit backend selection
scripts/stt.py audio.ogg -b whisper
scripts/stt.py audio.ogg -b parakeet -m v3
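
The same invocations can be driven from Python by building the argument list and passing it to `subprocess`. This is a minimal sketch, not part of ez-stt: `build_stt_command` and `transcribe` are hypothetical helpers, only the `-b` and `-m` flags shown above are used, and it assumes the script is executable and prints the transcript to standard output, which the source does not confirm.

```python
import subprocess
from typing import List, Optional


def build_stt_command(
    audio_path: str,
    backend: Optional[str] = None,
    model: Optional[str] = None,
    script: str = "scripts/stt.py",
) -> List[str]:
    """Build the argument list for the stt script (hypothetical helper).

    Flags are added only when a value is given, so the script's own
    defaults apply otherwise.
    """
    cmd = [script, audio_path]
    if backend is not None:
        cmd += ["-b", backend]
    if model is not None:
        cmd += ["-m", model]
    return cmd


def transcribe(
    audio_path: str,
    backend: Optional[str] = None,
    model: Optional[str] = None,
) -> str:
    """Run the script and return its stdout.

    Assumption: the transcript is written to standard output.
    """
    result = subprocess.run(
        build_stt_command(audio_path, backend, model),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

For example, `build_stt_command("audio.ogg", backend="parakeet", model="v3")` yields the same arguments as the last command above.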

Installs: 18
Repository: araa47/ez-voice
GitHub Stars: 1
First Seen: Mar 24, 2026