music-to-video
music-to-video — one music-grounded, beat-synced video workflow
Use this skill to turn a music track into a beat-synced HyperFrames video. You analyze the track once, lay out the frames, fill in a per-frame plan, and build each frame as a composition. The input is a music track plus optional user images or videos — there is no narration and no website capture. Typography and templates are the floor (a complete video needs zero assets); any media the user supplies is cut in on the same beat grid.
You are the orchestrator. Work in videos/<project>/. Run the steps in order and pass each Gate before moving on. Two steps need the user: Step 3 (plan approval) and Step 6 (render approval). Do every step yourself except Step 4, where you dispatch one sub-agent per frame. Keep design and motion rules out of this file — they live in references/ and the frame-worker sub-agent.
SKILL_DIR = this skill directory. PROJECT_DIR = videos/<project-name>/.
Workflow: Step 0 setup → hyperframes.json + assets/bgm.mp3; Step 1 analyze → audiomap.json; Step 2 skeleton → STORYBOARD.md (frames, groups TBD); Step 3 plan → complete STORYBOARD.md + frame.md; Step 4 build → compositions/frames/NN-*.html; Step 5 assemble → index.html; Step 6 render → renders/video.mp4.
Two ideas that shape everything
- One analyzer, and you trust it.
analyze-beatgrid.pyis the only beat analyzer — never re-measure beats with another tool or by ear. Its energy / density / rolls / onsets / silences are always reliable. Itsbpmandbeats_secare reliable only when the music is genuinely rhythmic; on calm music the grid is a metronome the tracker imposed, so pace by phrases and energy instead and never hard-cut to it. Deciding which case you're in is each frame'spacing(Step 2). - One frame = one file; groups live inside. Step 2 cuts the track into frames, and each frame becomes one composition file
compositions/frames/NN-<frame_id>.html, built by one frame-worker. A frame can subdivide into groups (each a template or a motion-primitives combo). Extra density goes inside a group, so frame count tracks distinct treatments, not beats — a fast track does not blow up the number of sub-agents.