transcribe-audio


Skill: Transcribe Audio (parent brief)

Transcribes video audio using WhisperX and produces a clean JSON transcript with word-level timing.

SKILL.md is the parent's dispatch brief. The sub-agent's working prompt lives in agent_prompt.md — inline its contents when launching the Task agent. Don't pass SKILL.md.

Parallelism

Launch at most 2 agents in parallel. WhisperX is already multithreaded internally (~4 CPU threads via CTranslate2), so 2 concurrent processes is the throughput-vs-RAM sweet spot on a 16 GB Mac.
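As a rough sketch of the cap described above, a parent written in Python could bound concurrency with a two-worker pool. The names `transcribe_all` and `launch_agent` are hypothetical; `launch_agent` stands in for whatever actually dispatches one transcription agent and waits for it to finish.

```python
from concurrent.futures import ThreadPoolExecutor

# Cap taken from the brief: no more than 2 transcriptions at once.
MAX_PARALLEL = 2


def transcribe_all(video_paths, launch_agent):
    """Call launch_agent(path) for every video, at most MAX_PARALLEL at a time.

    launch_agent is a hypothetical callable that blocks until one
    transcription finishes. Results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        return list(pool.map(launch_agent, video_paths))
```

Threads are enough here because each worker only waits on an external process; the heavy lifting happens inside WhisperX itself.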

Inputs to gather and pass inline

The parent reads library.yaml and settings.yaml and passes these values inline in each agent's prompt:

  • video_path — absolute path to the video file
  • transcript_output_dir — where to write the transcript JSON (e.g. libraries/<library>/transcripts)
  • language_code — ISO 639-1 code (e.g. en, es) — parent maps from library.yaml's language name
  • whisper_model — model size from settings.yaml (e.g. small, medium, turbo)
  • transcript_refinement — boolean from library.yaml. If true, also pass:
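To illustrate the inputs listed above, here is a minimal sketch of how a parent might map the language name to an ISO 639-1 code and render the values inline after the agent prompt text. Everything in it is an assumption made for illustration: the lookup table, the helper names `language_code_for` and `build_agent_prompt`, the dictionary keys standing in for parsed library.yaml and settings.yaml, and the example paths. The extra values that accompany `transcript_refinement` are not covered.

```python
# Hypothetical lookup table; extend it with the languages your libraries use.
LANGUAGE_CODES = {"english": "en", "spanish": "es", "french": "fr", "german": "de"}


def language_code_for(name):
    """Map a human-readable language name to its ISO 639-1 code."""
    key = name.strip().lower()
    if key not in LANGUAGE_CODES:
        raise ValueError(f"No ISO 639-1 code known for language {name!r}")
    return LANGUAGE_CODES[key]


def build_agent_prompt(agent_prompt_text, inputs):
    """Append the gathered inputs, one per line, to the agent prompt text."""
    lines = [f"- {key}: {value}" for key, value in inputs.items()]
    return agent_prompt_text.rstrip() + "\n\nInputs:\n" + "\n".join(lines) + "\n"


# Plain dicts stand in for the parsed YAML files; the key names are assumed.
library = {"name": "my-library", "language": "English", "transcript_refinement": False}
settings = {"whisper_model": "small"}

inputs = {
    "video_path": "/abs/path/to/video.mp4",
    "transcript_output_dir": f"libraries/{library['name']}/transcripts",
    "language_code": language_code_for(library["language"]),
    "whisper_model": settings["whisper_model"],
    "transcript_refinement": library["transcript_refinement"],
}

prompt = build_agent_prompt("(contents of agent_prompt.md)", inputs)
```

Failing loudly on an unknown language name keeps a bad `language_code` from reaching WhisperX unnoticed.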
More from barefootford/buttercut

First Seen
Jan 28, 2026