LLM Serving Auto Benchmark

Overview

Use this skill to compare LLM serving frameworks such as SGLang, vLLM, and TensorRT-LLM for the same model and workload.

Use a config-driven workflow:

  • keep launch-only capacity choices in each framework's base_server_flags
  • put the search knobs in search_space
  • run the same dataset scenarios for every framework
  • generate a bounded candidate list from search_space, with the baseline candidate included first
  • keep failed candidates in the result file
  • pick the best SLA-passing candidate after normalizing the results
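The workflow above can be sketched in a few lines of Python. This is an illustrative sketch, not the skill's actual implementation: the function names, the flag dictionaries, and the result-record fields (`status`, `p99_ttft_ms`, `throughput`) are all assumptions made for the example.

```python
from itertools import product

def generate_candidates(base_flags, search_space, limit=32):
    """Expand search_space into a bounded candidate list.

    The baseline (base_flags with no overrides) always comes first;
    the rest are the cross-product of the search knobs, truncated to
    `limit` entries. Names are illustrative, not the skill's API.
    """
    candidates = [dict(base_flags)]  # baseline candidate first
    keys = sorted(search_space)
    for combo in product(*(search_space[k] for k in keys)):
        cand = dict(base_flags)
        cand.update(zip(keys, combo))
        if cand not in candidates:
            candidates.append(cand)
        if len(candidates) >= limit:
            break
    return candidates

def pick_best(results, sla_ms, metric="p99_ttft_ms", goal="throughput"):
    """Failed candidates stay in `results`, but only successful,
    SLA-passing runs are eligible; pick the highest-throughput one."""
    passing = [r for r in results
               if r.get("status") == "ok" and r[metric] <= sla_ms]
    return max(passing, key=lambda r: r[goal], default=None)

space = {"max_num_seqs": [128, 256], "chunked_prefill": [True, False]}
cands = generate_candidates({"tp": 2}, space, limit=5)
```

Keeping the baseline first means every run produces a usable reference point even if the search is cut short, and keeping failed candidates in the results makes `pick_best` a pure filter rather than a second pass over logs.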

For model-specific starting points, prefer the shipped configs in configs/cookbook-llm/; they define a framework-neutral LLM serving cookbook.
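A config in this shape, expressed here as a Python dict, might look like the following. Every field name and value is hypothetical and does not reflect the actual schema of the shipped configs/cookbook-llm/ files; the point is only how launch-only flags, search knobs, scenarios, and SLAs separate.

```python
# Illustrative benchmark config (field names hypothetical, not the
# shipped configs/cookbook-llm/ schema).
config = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "frameworks": {
        # launch-only capacity choices stay per-framework
        "sglang": {"base_server_flags": ["--tp", "2"]},
        "vllm": {"base_server_flags": ["--tensor-parallel-size", "2"]},
    },
    # search knobs shared across frameworks
    "search_space": {"max_num_seqs": [128, 256, 512]},
    # identical dataset scenarios for every framework
    "scenarios": [{"dataset": "sharegpt", "request_rate": 8}],
    # SLA used to filter candidates after normalization
    "sla": {"p99_ttft_ms": 500},
}
```

Keeping `base_server_flags` per framework while sharing `search_space` and `scenarios` is what makes the comparison apples-to-apples: only the launch syntax differs, never the workload or the search.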

Related skills

More from bbuf/sglang-auto-driven-skills

Installs: 19 · GitHub Stars: 272 · First Seen: Apr 23, 2026