audio-transcriber

Summary

Transcribe audio files to structured Markdown with intelligent meeting minutes and executive summaries.

  • Supports MP3, WAV, M4A, OGG, FLAC, and WEBM input, with automatic format detection and conversion via ffmpeg
  • Auto-detects and uses Faster-Whisper (4-5x faster than OpenAI Whisper) or OpenAI Whisper with zero configuration; offers one-click installation of missing dependencies
  • Extracts rich metadata (speakers, timestamps, language, duration, file size) and generates structured meeting minutes with topics, decisions, and action items
  • Optionally integrates with Claude or GitHub Copilot CLI for intelligent summarization and custom prompt-based processing
  • Handles batch processing of multiple files and warns on large files (>25 MB) with estimated processing time
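The engine auto-detection and the pre-flight checks listed above could be sketched roughly as follows. This is an illustrative sketch, not the skill's actual code: the function names `pick_engine` and `check_audio_file` are hypothetical, format detection is simplified to a file-extension check, the 25 MB threshold is assumed to mean 25 × 1024 × 1024 bytes, and preferring Faster-Whisper when both engines are installed is an assumption.

```python
import importlib.util
from pathlib import Path

# Formats named in the summary; matching on the extension is a simplification.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm"}
# Assumption: "25 MB" is interpreted as 25 MiB.
LARGE_FILE_BYTES = 25 * 1024 * 1024


def pick_engine():
    """Return the import name of the first installed engine, or None.

    Faster-Whisper is tried first (assumed preference), then OpenAI Whisper.
    """
    for module_name in ("faster_whisper", "whisper"):
        if importlib.util.find_spec(module_name) is not None:
            return module_name
    return None


def check_audio_file(path, size_bytes=None):
    """Return (is_supported, warnings) for one input file.

    size_bytes may be passed explicitly; otherwise it is read from disk
    when the file exists.
    """
    p = Path(path)
    is_supported = p.suffix.lower() in SUPPORTED_EXTENSIONS
    warnings = []
    if size_bytes is None and p.exists():
        size_bytes = p.stat().st_size
    if size_bytes is not None and size_bytes > LARGE_FILE_BYTES:
        size_mb = size_bytes / (1024 * 1024)
        warnings.append(f"{p.name} is {size_mb:.1f} MB; transcription may be slow")
    return is_supported, warnings


if __name__ == "__main__":
    print("engine:", pick_engine())
    print(check_audio_file("meeting.m4a", size_bytes=30 * 1024 * 1024))
```

Batch processing then amounts to calling `check_audio_file` on each path before handing the supported files to whichever engine `pick_engine` reports.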
SKILL.md

Purpose

This skill automates audio-to-text transcription with professional Markdown output. It extracts rich technical metadata (speakers, timestamps, language, file size, duration) and generates structured meeting minutes and executive summaries. It uses Faster-Whisper or Whisper with zero configuration and works across projects without hardcoded paths or API keys.
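As a rough illustration of how the extracted metadata might be rendered at the top of a Markdown transcript, consider the sketch below. The dictionary keys, the `render_metadata` helper, and the output layout are all hypothetical; the skill's real output format is not specified here.

```python
def format_duration(seconds):
    """Format a duration in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_metadata(meta):
    """Render a metadata dict as a Markdown section (hypothetical layout)."""
    size_mb = meta["file_size_bytes"] / (1024 * 1024)
    lines = [
        "## Metadata",
        "",
        f"- **Language:** {meta['language']}",
        f"- **Duration:** {format_duration(meta['duration_seconds'])}",
        f"- **File size:** {size_mb:.1f} MB",
        f"- **Speakers:** {', '.join(meta['speakers'])}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = {
        "language": "en",
        "duration_seconds": 3725,
        "file_size_bytes": 12 * 1024 * 1024,
        "speakers": ["Speaker 1", "Speaker 2"],
    }
    print(render_metadata(example))
```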

Inspired by tools like Plaud, this skill transforms raw audio recordings into actionable documentation, making it ideal for meetings, interviews, lectures, and content analysis.

When to Use

Invoke this skill when:

  • User needs to transcribe audio/video files to text
  • User wants meeting minutes automatically generated from recordings
  • User requires speaker identification (diarization) in conversations
  • User needs subtitles/captions (SRT, VTT formats)
  • User wants executive summaries of long audio content
  • User asks variations of "transcribe this audio", "convert audio to text", "generate meeting notes from recording"
  • User has audio files in common formats (MP3, WAV, M4A, OGG, FLAC, WEBM)
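For the subtitle use case, timestamped segments can be turned into SRT text along the following lines. This is a minimal sketch: the `(start, end, text)` tuple shape and the helper names are assumptions, not the skill's interface. SRT timestamps use the form `HH:MM:SS,mmm`; WebVTT differs mainly in using a period before the milliseconds and a leading `WEBVTT` header line.

```python
def srt_timestamp(seconds):
    """Convert seconds (float) to an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from an iterable of (start, end, text) tuples.

    Cues are numbered from 1 and separated by a blank line.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}"
        )
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    demo = [(0.0, 2.5, "Welcome, everyone."), (2.5, 6.0, "Let's review the agenda.")]
    print(to_srt(demo))
```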

Workflow

Installs: 811
GitHub Stars: 37.3K
First Seen: Feb 5, 2026