video-cut
Installation
SKILL.md
video-cut — local-first raw-footage video editor
Drop raw clips in a folder → describe the cut → get edit/final.mp4. Fully local. The
editorial counterpart to our generative video skills (launch-video, Remotion,
cloned-voice-pitch-pipeline). Compounds on browser-use/video-use:
same two-layer reading architecture, but local ASR instead of cloud ElevenLabs Scribe.
Core principle — two-layer reading (never dump frames)
The agent reads video through two cheap layers, not by watching frames:
- Transcript layer —
transcribe_local.pyrunsfaster_whisperwith word-level timestamps (local, MPS/CPU). Packed intotakes_packed.md(~tens of KB) — the primary reading artifact. This is the routing projection. - Visual layer (on-demand) —
timeline_view.py <video> <start> <end>renders a filmstrip + waveform + word-label PNG only at decision points (ambiguous pauses, retake comparisons, cut-point checks). Never a scan — the body-grep expansion.