nemo-mbridge-perf-expert-parallel-overlap

MoE Expert-Parallel Overlap Skill

References

  • Stable docs: @docs/training/communication-overlap.md
  • Structured metadata: @skills/nemo-mbridge-perf-expert-parallel-overlap/card.yaml

What It Is

Expert-parallel (EP) overlap hides the cost of the token dispatch/combine all-to-all communication by running it concurrently with expert FFN compute. Optionally, delayed expert weight-gradient computation (delay_wgrad_compute) adds further overlap by deferring wgrad so that it runs alongside the next layer's forward.
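
To make the benefit concrete, here is a minimal cost-model sketch. It is not Bridge code. The function names and the millisecond figures are hypothetical, and it assumes perfect overlap, which real runs only approximate. Under that assumption, an overlapped layer costs the longer of communication and expert compute instead of their sum.

```python
def moe_layer_time_ms(dispatch_ms, expert_ms, combine_ms, overlap):
    """Toy estimate of one MoE layer's time (hypothetical model, not a Bridge API)."""
    comm_ms = dispatch_ms + combine_ms
    if overlap:
        # Idealized assumption: all-to-all runs fully concurrently with
        # expert FFN compute, so the layer costs whichever is longer.
        return max(comm_ms, expert_ms)
    # Without overlap, communication and compute are serialized.
    return comm_ms + expert_ms


def exposed_comm_ms(dispatch_ms, expert_ms, combine_ms):
    """Communication time that compute cannot hide under the same idealized model."""
    return max(0.0, dispatch_ms + combine_ms - expert_ms)


if __name__ == "__main__":
    # Made-up timings, chosen only to illustrate the arithmetic.
    print(moe_layer_time_ms(2.0, 5.0, 2.0, overlap=False))  # 9.0
    print(moe_layer_time_ms(2.0, 5.0, 2.0, overlap=True))   # 5.0
    print(exposed_comm_ms(2.0, 3.0, 2.0))                   # 1.0
```

When communication takes longer than expert compute, the difference stays exposed. Deferring wgrad is one way to supply extra compute to cover it.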

Bridge supports two dispatcher paths:
