npu-smi
npu-smi Command Reference
Quick reference for Huawei Ascend NPU device management commands.
Validated on real host: Ascend 910B3 server (
npu-smisoftware version25.5.1) on 2026-03-27. Command availability and output fields can differ by platform / firmware.
Quick Start
npu-smi info -l # List all devices
npu-smi info -t health -i 0 # Check device health
npu-smi info -t temp -i 0 -c 0 # Check temperature
npu-smi info -t power -i 0 -c 0 # Check power
npu-smi info -t memory -i 0 -c 0 # Check memory
Device Queries
Basic Information
More from ascend-ai-coding/awesome-ascend-skills
atc-model-converter
Complete toolkit for Huawei Ascend NPU model conversion and end-to-end inference adaptation. Workflow 1 auto-discovers input shapes and parameters from user source code. Workflow 2 exports PyTorch models to ONNX. Workflow 3 converts ONNX to .om via ATC with multi-CANN version support. Workflow 4 adapts the user's full inference pipeline (preprocessing + model + postprocessing) to run end-to-end on NPU. Workflow 5 verifies precision between ONNX and OM outputs. Workflow 6 generates a reproducible README. Supports any standard PyTorch/ONNX model. Use when converting, testing, or deploying models on Ascend AI processors.
55hccl-test
HCCL (Huawei Collective Communication Library) performance testing for Ascend NPU clusters. Use for testing distributed communication bandwidth, verifying HCCL functionality, and benchmarking collective operations like AllReduce, AllGather. Covers MPI installation, multi-node pre-flight checks (SSH/CANN version/NPU health), and production testing workflows.
54ascendc
AscendC transformer/GMM/MoE 算子与 Matmul/Cube Kernel 的统一开发规范。用于在 ops-transformer 下新增或修改 op_host、tiling/infershape、op_kernel(含 MatmulImpl/Cube 调用)、以及对应的 CANN aclnn 示例和单测。
51ascend-docker
Create Docker containers for Huawei Ascend NPU development with proper device mappings and volume mounts. Use when setting up Ascend development environments in Docker, running CANN applications in containers, or creating isolated NPU development workspaces. Supports privileged mode (default), basic mode, and full mode with profiling/logging. Auto-detects available NPU devices.
51msmodelslim
Huawei Ascend NPU model compression tool (msModelSlim). Use for LLM quantization (W4A8, W8A8, W8A8S, W8A16), MoE model compression, multimodal model compression (Qwen-VL, InternVL, HunyuanVideo, FLUX, SD3), calibration data preparation, precision auto-tuning, sensitive layer analysis, custom model integration, and deployment in MindIE/vLLM-Ascend. Supports Qwen, LLaMA, DeepSeek, GLM, Kimi, InternLM and more.
44vllm-ascend
vLLM Ascend plugin for LLM inference serving on Huawei Ascend NPU. Use for offline batch inference, API server deployment, quantization inference (with msmodelslim quantized models), tensor/pipeline parallelism for distributed serving, and OpenAI-compatible API endpoints. Supports Qwen, DeepSeek, GLM, LLaMA models with Ascend-optimized kernels.
41