video-to-text


Video to Text: convert video/audio to a text transcript

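Dependencies

ffmpeg: extracts the audio track from video (already installed on the system)
whisperX: speech recognition, alignment, and speaker diarization (pip install whisperx)
HF_TOKEN: speaker diarization requires a HuggingFace token (set in the HF_TOKEN environment variable)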
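
Before starting a long transcription job, it can help to confirm that the three dependencies above are actually available. The following is a minimal sketch, not part of the skill itself: the function name missing_dependencies is hypothetical, and it assumes that HF_TOKEN is only needed when diarization is requested, as the list above suggests.

```python
import importlib.util
import os
import shutil


def missing_dependencies(diarize: bool = False) -> list[str]:
    """Return the names of dependencies that appear to be missing.

    Hypothetical helper for illustration; the skill does not ship it.
    """
    missing = []
    # ffmpeg must be on PATH to extract audio from video files.
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")
    # whisperx is installed with `pip install whisperx`.
    if importlib.util.find_spec("whisperx") is None:
        missing.append("whisperx")
    # Assumption: the HuggingFace token is only needed for diarization.
    if diarize and not os.environ.get("HF_TOKEN"):
        missing.append("HF_TOKEN")
    return missing


if __name__ == "__main__":
    problems = missing_dependencies(diarize=True)
    print("missing:", ", ".join(problems) if problems else "nothing")
```

An empty list means all checks passed; otherwise the list names what to install or export before running the script.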

Quick run

For simple transcription jobs, run the script directly:

nohup python3 {skillDir}/scripts/transcribe.py /path/to/video.mp4 \
  --output-dir /path/to/output \
  --output-name transcript \
  --diarize \
  > /tmp/transcribe.log 2>&1 &
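
The command above runs the script in the background and redirects both output streams to a log file. If the job has to be started from Python rather than a shell, roughly the same effect can be sketched as below. This is an assumption-laden illustration: build_command and launch are hypothetical helper names, only the flags shown in the command above are used, sys.executable stands in for python3, and start_new_session is POSIX-only.

```python
import subprocess
import sys


def build_command(script, video, output_dir, output_name="transcript", diarize=True):
    """Assemble the transcribe.py command line using only the flags shown above."""
    cmd = [
        sys.executable, str(script), str(video),
        "--output-dir", str(output_dir),
        "--output-name", output_name,
    ]
    if diarize:
        cmd.append("--diarize")
    return cmd


def launch(cmd, log_path="/tmp/transcribe.log"):
    """Start cmd detached from the terminal, logging stdout and stderr to log_path."""
    with open(log_path, "w") as log:
        # start_new_session=True puts the child in its own session, which is
        # roughly what `nohup ... &` achieves in the shell (POSIX only).
        return subprocess.Popen(
            cmd,
            stdout=log,
            stderr=subprocess.STDOUT,  # same as 2>&1
            stdin=subprocess.DEVNULL,
            start_new_session=True,
        )


# Example (paths are placeholders, as in the shell command):
# proc = launch(build_command("scripts/transcribe.py", "/path/to/video.mp4", "/path/to/output"))
# proc.poll() returns None while the job is still running.
```

Either way, progress can be followed by reading /tmp/transcribe.log while the job runs.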

Repository

yfge/video-skills-suite

First Seen

Mar 3, 2026
