MindSpeed-MM Ascend NPU Base Environment Setup
This skill guides users through setting up the base environment for MindSpeed-MM multimodal training on Huawei Ascend NPU.
Important: This guide covers only the base environment. Multimodal models (qwen3vl, wan2.2, hunyuanvideo, etc.) differ widely in their dependency version requirements, and those requirements may conflict with one another, so model-specific dependencies must be installed on top of the base environment. After completing this guide, refer to the corresponding model's SKILL for additional configuration.
Component Relationship
Megatron-LM (NVIDIA) <- Distributed training core (TP/PP), uses core_v0.12.1 branch
|
MindSpeed (Huawei) <- Ascend adaptation layer, monkey-patches Megatron kernels
|
MindSpeed-MM (Huawei) <- Multimodal application layer: VLM/generation/omni-modal training
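To illustrate what "monkey-patching" means in the layering above, here is a minimal, self-contained sketch. It does not use MindSpeed or Megatron-LM; `upstream`, `fused_op`, and `apply_patches` are hypothetical stand-ins chosen only to show the mechanism of replacing a function on another module at runtime.

```python
import types

# Hypothetical stand-in for an upstream library module (not Megatron-LM itself).
upstream = types.SimpleNamespace()


def fused_op_gpu(x):
    """Original implementation shipped by the upstream library."""
    return f"gpu:{x}"


upstream.fused_op = fused_op_gpu


def fused_op_npu(x):
    """Replacement implementation supplied by an adaptation layer."""
    return f"npu:{x}"


def apply_patches(target):
    """Swap the upstream attribute for the replacement at runtime."""
    target.fused_op = fused_op_npu


def train_step(x):
    # The attribute is looked up on every call, so a patch applied
    # later is picked up without changing this function.
    return upstream.fused_op(x)


before = train_step(1)   # uses the original implementation
apply_patches(upstream)
after = train_step(1)    # uses the patched implementation
print(before, after)
```

One consequence of this pattern: code that binds a name early (for example `from module import fn`) keeps its reference to the original function, so patches of this kind generally need to be applied before the code that uses them is imported.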
MindSpeed-MM shares the underlying dependency stack (CANN, torch_npu, MindSpeed, Megatron-LM) with MindSpeed-LLM, but targets multimodal scenarios at the application level (vision-language models, video generation, speech synthesis, etc.).
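As a quick sanity check of the shared stack, the sketch below reports which Python packages can be located in the current environment. The module names `torch`, `torch_npu`, `mindspeed`, and `megatron` are assumptions about how these components are exposed to Python; adjust them if your installation differs. CANN is a system-level toolkit rather than a Python package, so this check does not cover it.

```python
import importlib.util

# Assumed import names for the Python-level components of the stack.
ASSUMED_MODULES = ["torch", "torch_npu", "mindspeed", "megatron"]


def check_stack(module_names):
    """Return {name: True/False} for whether each top-level module can be found.

    find_spec only locates the module; it does not import it, so this
    is safe to run even when no NPU device is available.
    """
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    for name, found in check_stack(ASSUMED_MODULES).items():
        status = "found" if found else "MISSING"
        print(f"{name:<10} {status}")
```

A module reported as found is only known to be locatable on the import path; version compatibility between the components still has to be verified separately.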