nemo-mbridge-perf-moe-hardware-configs
MoE Hardware Configuration Reference
Stable docs: @docs/training/moe-optimization.md
Card: @skills/nemo-mbridge-perf-moe-hardware-configs/card.yaml
Quick Platform Playbook
| Platform | Typical MoE strategy | What usually matters most |
|---|---|---|
| H100 | DeepEP + stronger PP + moderate TP | communication overlap and PP efficiency |
| B200 | DeepEP + MXFP8 + careful PP layout | container quality and tuned comm settings |
| GB200 | HybridEP + partial CUDA graphs + CPU cleanup | host overhead, topology-aware dispatch, memory headroom |
| GB300 | HybridEP + newer FP8 and kernel stack | same GB200 playbook, usually with a higher ceiling |
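
The playbook rows above can be kept as a small lookup table so that answers start from the canonical row. This is a minimal sketch, not part of any NeMo or Megatron Bridge API: the names `PlatformPlaybook`, `PLAYBOOK`, and `lookup_playbook` are hypothetical, and the strings are copied from the table rather than from real configuration keys.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformPlaybook:
    """One row of the playbook table (hypothetical container, for illustration)."""

    strategy: tuple[str, ...]  # components of the "Typical MoE strategy" column
    priorities: str            # the "What usually matters most" column


# Rows copied verbatim from the table; these are descriptive labels,
# not actual framework flags or config keys.
PLAYBOOK: dict[str, PlatformPlaybook] = {
    "H100": PlatformPlaybook(
        ("DeepEP", "stronger PP", "moderate TP"),
        "communication overlap and PP efficiency",
    ),
    "B200": PlatformPlaybook(
        ("DeepEP", "MXFP8", "careful PP layout"),
        "container quality and tuned comm settings",
    ),
    "GB200": PlatformPlaybook(
        ("HybridEP", "partial CUDA graphs", "CPU cleanup"),
        "host overhead, topology-aware dispatch, memory headroom",
    ),
    "GB300": PlatformPlaybook(
        ("HybridEP", "newer FP8 and kernel stack"),
        "same GB200 playbook, usually with a higher ceiling",
    ),
}


def lookup_playbook(platform: str) -> PlatformPlaybook:
    """Return the canonical row for a platform name, ignoring case and padding."""
    key = platform.strip().upper()
    if key not in PLAYBOOK:
        raise KeyError(f"no playbook row for platform {platform!r}")
    return PLAYBOOK[key]
```

For example, `lookup_playbook("gb200").strategy` returns the three GB200 strategy components, and an unlisted platform raises `KeyError` instead of guessing a row.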
First Answer Checklist
For hardware playbook questions, answer from these canonical rows before adding throughput caveats: