SGLang DeepSeek V3.2 Optimization

Overview

This skill covers the DeepSeek V3.2 support and optimization ladder that is active on SGLang main. V3.2 shares the DeepSeek V3/R1 model backbone, but it is a separate optimization problem because it enables DeepSeek Sparse Attention, which the documentation calls DSA and the SGLang code calls NSA.

Current-main snapshot:

  • SGLang origin/main: 929e00eea on 2026-04-21
  • sgl-cookbook origin/main: 8ec4d03 on 2026-04-21
  • V3.2 runtime entry: DeepseekV32ForCausalLM in python/sglang/srt/models/deepseek_v2.py
  • NSA backend: python/sglang/srt/layers/attention/nsa_backend.py
  • NSA indexer: python/sglang/srt/layers/attention/nsa/nsa_indexer.py
  • V3.2 tool parser: python/sglang/srt/function_call/deepseekv32_detector.py
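
Because the snapshot is pinned to a specific commit, the paths above may move on later revisions of main. The following is a minimal sketch, not part of SGLang, for checking a local checkout against the list. The helper names `missing_snapshot_paths` and `defines_class` are hypothetical, and the repository root is whatever directory you cloned SGLang into.

```python
import re
from pathlib import Path

# Paths copied from the snapshot list above, relative to the repository root.
V32_SNAPSHOT_PATHS = {
    "V3.2 runtime entry": "python/sglang/srt/models/deepseek_v2.py",
    "NSA backend": "python/sglang/srt/layers/attention/nsa_backend.py",
    "NSA indexer": "python/sglang/srt/layers/attention/nsa/nsa_indexer.py",
    "V3.2 tool parser": "python/sglang/srt/function_call/deepseekv32_detector.py",
}


def missing_snapshot_paths(repo_root):
    """Return {label: relative_path} for snapshot files absent from the checkout.

    Hypothetical helper: it only tests that each file exists on disk.
    """
    root = Path(repo_root)
    return {
        label: rel
        for label, rel in V32_SNAPSHOT_PATHS.items()
        if not (root / rel).is_file()
    }


def defines_class(repo_root, rel_path, class_name):
    """Return True if the file has a top-level `class <class_name>` statement.

    Hypothetical helper: a plain text scan, not an import of the module.
    """
    path = Path(repo_root) / rel_path
    if not path.is_file():
        return False
    pattern = re.compile(rf"^class\s+{re.escape(class_name)}\b", re.MULTILINE)
    return bool(pattern.search(path.read_text(encoding="utf-8", errors="replace")))
```

Calling `missing_snapshot_paths("path/to/sglang")` returns an empty dict when every listed file is present, and `defines_class("path/to/sglang", "python/sglang/srt/models/deepseek_v2.py", "DeepseekV32ForCausalLM")` checks that the runtime entry class is still defined in that file. A non-empty result suggests the checkout has drifted from the 929e00eea snapshot.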

The historical evidence lives in:

First seen: Apr 23, 2026