asr
Summary
Local offline audio transcription with multi-language support and optional AI polishing.
- Transcribes audio files to text using coli asr with no API keys required; supports Chinese, English, Japanese, Korean, and Cantonese via the sensevoice model, or English only via whisper-tiny
- Models download automatically on first use (~60MB) to ~/.coli/models/; requires the coli CLI and ffmpeg (WAV files work without it)
- Optional AI polishing step corrects punctuation, removes filler words, and improves readability while preserving original meaning
- Exports transcripts as markdown files with metadata (source, date, model, duration, detected language)
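The markdown export described in the last bullet could look roughly like the sketch below. The function name, heading, and field labels are hypothetical, chosen only to illustrate the metadata listed above (source, date, model, duration, detected language); the skill's actual output layout may differ.

```python
from datetime import date


def render_transcript_markdown(text, source, model, duration_seconds, language, when=None):
    """Return a markdown document: a metadata list followed by the transcript.

    The layout and labels are illustrative assumptions, not the skill's real format.
    """
    when = when or date.today()
    minutes, seconds = divmod(int(duration_seconds), 60)
    lines = [
        "# Transcript",
        "",
        f"- Source: {source}",
        f"- Date: {when.isoformat()}",
        f"- Model: {model}",
        f"- Duration: {minutes:02d}:{seconds:02d}",
        f"- Detected language: {language}",
        "",
        text.strip(),
        "",  # produces a trailing newline after the transcript body
    ]
    return "\n".join(lines)
```

The returned string can then be written to a .md file next to the source audio, or wherever the user asks for it.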
SKILL.md
When to Use
- User wants to transcribe an audio file to text
- User provides an audio file path and asks for transcription
- User says "转录" (transcribe), "识别" (recognize), "transcribe", or "语音转文字" (speech to text)
When NOT to Use
- User wants to synthesize speech from text (use /tts)
- User wants to create a podcast or explainer (use /podcast or /explainer)
Purpose
Transcribe audio files to text using coli asr, which runs fully offline via local
speech recognition models. No API key required. Supports Chinese, English, Japanese,
Korean, and Cantonese (sensevoice model) or English-only (whisper model).
Run coli asr --help for current CLI options and supported flags.
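As a rough illustration of the workflow, the sketch below checks for the required tools and then shells out to coli asr. It assumes the audio path is passed as a positional argument and that the transcript is printed to stdout; neither is confirmed by this page, so verify both against coli asr --help. The helper names missing_tools and transcribe are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def missing_tools(audio_path, which=shutil.which):
    """List required executables that are not found on PATH for this file."""
    required = ["coli"]
    # Per the summary, ffmpeg is only needed when the input is not a WAV file.
    if Path(audio_path).suffix.lower() != ".wav":
        required.append("ffmpeg")
    return [tool for tool in required if which(tool) is None]


def transcribe(audio_path):
    """Run coli asr on one file and return its stdout.

    Assumption: positional file argument, transcript on stdout.
    Check `coli asr --help` for the real interface before relying on this.
    """
    missing = missing_tools(audio_path)
    if missing:
        raise RuntimeError("Missing required tools: " + ", ".join(missing))
    result = subprocess.run(
        ["coli", "asr", str(audio_path)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Checking for the tools up front gives the user a clear message about what to install instead of a lower-level failure from the subprocess call.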
Related skills