minimax-multimodal-toolkit


MiniMax Multi-Modal Toolkit

Generate voice, music, video, and image content through the MiniMax APIs. This skill is the unified entry point for MiniMax multimodal use cases (audio, music, video, and image). It includes voice cloning and voice design for custom voices, image generation with character reference, and FFmpeg-based media tools for audio/video format conversion, concatenation, trimming, and extraction.

Output Directory

All generated files MUST be saved to minimax-output/ under the AGENT'S current working directory (NOT the skill directory). Every script call MUST include an explicit --output / -o argument pointing to this location. Never omit the output argument or rely on script defaults.

Rules:

  1. Before running any script, ensure minimax-output/ exists in the agent's working directory (create it if needed: mkdir -p minimax-output).
  2. Always pass an output path that is either absolute or relative to the agent's working directory, for example: --output minimax-output/video.mp4
  3. Never cd into the skill directory to run scripts. Run them from the agent's working directory using the full script path.
  4. Intermediate and temporary files (audio segments, video segments, extracted frames) are placed automatically in minimax-output/tmp/. Clean them up when they are no longer needed: rm -rf minimax-output/tmp
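
The rules above can be sketched as a short shell sequence. This is a minimal illustration, not the toolkit's documented workflow: SKILL_DIR and the script name generate_video.sh are hypothetical placeholders, since the actual script paths are not listed in this section. Only mkdir -p minimax-output, the --output argument, and rm -rf minimax-output/tmp come from the rules themselves.

```shell
# Hypothetical placeholders: the real skill location and script names
# are not given in this section, so substitute your own.
SKILL_DIR="${SKILL_DIR:-/path/to/minimax-multimodal-toolkit}"
SCRIPT="$SKILL_DIR/scripts/generate_video.sh"   # hypothetical script name

# Rule 1: make sure minimax-output/ exists in the agent's working directory.
OUT_DIR="$PWD/minimax-output"
mkdir -p "$OUT_DIR"

# Rules 2 and 3: stay in the working directory, call the script by its
# full path, and always pass an explicit output argument.
if [ -x "$SCRIPT" ]; then
  "$SCRIPT" --output "$OUT_DIR/video.mp4"
else
  echo "skipping: $SCRIPT not found (placeholder path)"
fi

# Rule 4: remove intermediate files once they are no longer needed.
rm -rf "$OUT_DIR/tmp"
```

Building the output path from $PWD keeps the result inside the agent's working directory even when the script itself lives elsewhere, which is the point of rule 3.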

Prerequisites

Installs: 3
Repository: x-cmd/skill
GitHub Stars: 21
First Seen: Apr 10, 2026