sglang-minimax-m2-series-optimization
SGLang MiniMax M2 Series Optimization
Overview
The skill covers the full MiniMax optimization ladder: mainline history, the remaining still-open upstream PR track, and current-main validation lanes. Use it to recover, extend, or audit MiniMax-specific optimizations, or to reuse the patterns on a structurally similar MoE model.
As of 2026-04-21, refreshed against SGLang origin/main commit c122d343a, the MiniMax story is split across three sources of truth:
- mainline history already present in `main`
- still-open upstream PRs that are important for MiniMax-M2.5, but not fully landed in `main` yet
- current registered docs/tests/workflows, especially the MiniMax-M2.7 AMD accuracy and performance lanes
This skill tracks all three and labels each source clearly. Do not assume an optimization from a PR page is already in your local tree, and do not assume MiniMax-M2.7 or M2.7-highspeed is covered by MiniMax-M2.5 validation just because they share the same model file.
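As a concrete way to enforce the first rule, you can ask git whether a commit from a PR page is actually reachable from your checkout's HEAD before trusting it. The sketch below is illustrative, not part of the skill: the helper name `commit_is_in_local_tree` and the `repo` default are made up here, and it assumes you substitute the merge commit of whatever PR you are auditing (the hash shown is just the refresh point named in this overview).

```python
# Minimal sketch: check whether an upstream commit has landed locally.
import subprocess

def commit_is_in_local_tree(commit: str, repo: str = ".") -> bool:
    """Return True if `commit` is an ancestor of the current HEAD."""
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", commit, "HEAD"],
        capture_output=True,
    )
    # `git merge-base --is-ancestor` exits 0 when the commit is reachable
    # from HEAD, 1 when it is not, and >1 on errors (e.g. unknown hash).
    return result.returncode == 0

if __name__ == "__main__":
    # origin/main refresh commit cited in this skill's overview.
    print(commit_is_in_local_tree("c122d343a"))
```

If this returns False for a PR's merge commit, treat that optimization as still-open and follow the PR-track evidence in the references rather than the mainline history.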
The historical evidence for every stage lives in:
- `references/pr-history.md`: mainline and still-open PR evidence, benchmark notes, key code patterns
- `references/playbook.md`: symptom mapping, commands, validation order