llm-serving-auto-benchmark

LLM Serving Auto Benchmark

Use this skill to compare LLM serving frameworks such as SGLang, vLLM, and TensorRT-LLM for the same model and workload.

Use a config-driven workflow:

- keep launch-only capacity choices in each framework's base_server_flags
- put the search knobs in search_space
- run the same dataset scenarios for every framework
- generate a bounded candidate list from search_space, with the baseline candidate included first
- keep failed candidates in the result file
- pick the best SLA-passing candidate after normalizing the results
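
The candidate-generation and selection steps above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names, the `status` field, the metric names (`p99_ttft_ms`, `output_tokens_per_s`), and the knob names in the usage example are all hypothetical, and the real config schema may differ.

```python
import itertools
from typing import Any, Optional

Candidate = dict[str, Any]


def generate_candidates(
    baseline: Candidate,
    search_space: dict[str, list[Any]],
    max_candidates: int,
) -> list[Candidate]:
    """Return at most max_candidates flag sets, with the baseline first.

    Assumption: search_space maps a knob name to the list of values to try,
    and every candidate is the baseline with some knobs overridden.
    """
    if max_candidates < 1:
        raise ValueError("max_candidates must be at least 1")

    candidates = [dict(baseline)]  # the baseline is always candidate 0
    keys = list(search_space)
    for values in itertools.product(*(search_space[k] for k in keys)):
        if len(candidates) >= max_candidates:
            break  # keep the list bounded
        candidate = {**baseline, **dict(zip(keys, values))}
        if candidate not in candidates:  # skip duplicates of the baseline
            candidates.append(candidate)
    return candidates


def pick_best(results: list[dict[str, Any]], sla: dict[str, float]) -> Optional[dict[str, Any]]:
    """Pick the highest-throughput record that ran successfully and meets the SLA.

    Failed records stay in `results` (they are only filtered here, never
    deleted), so the result file still shows what was tried.
    """
    passing = [
        r for r in results
        if r["status"] == "ok" and r["p99_ttft_ms"] <= sla["p99_ttft_ms"]
    ]
    if not passing:
        return None
    return max(passing, key=lambda r: r["output_tokens_per_s"])
```

A usage example with made-up knobs and made-up measurements:

```python
baseline = {"max_batch_size": 64, "chunk_size": 4096}  # hypothetical knobs
search_space = {"max_batch_size": [32, 64, 128], "chunk_size": [2048, 4096]}
candidates = generate_candidates(baseline, search_space, max_candidates=4)

# Illustrative numbers only, not real benchmark results.
results = [
    {"candidate": candidates[0], "status": "ok", "p99_ttft_ms": 400, "output_tokens_per_s": 1000},
    {"candidate": candidates[1], "status": "failed", "error": "out of memory"},
    {"candidate": candidates[2], "status": "ok", "p99_ttft_ms": 900, "output_tokens_per_s": 1500},
    {"candidate": candidates[3], "status": "ok", "p99_ttft_ms": 450, "output_tokens_per_s": 1200},
]
best = pick_best(results, sla={"p99_ttft_ms": 500})
```

Here the fastest candidate is rejected because it misses the latency SLA, and the failed run remains in `results` for later inspection.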

For model-specific starting points, prefer the shipped configs in configs/cookbook-llm/. They define a framework-neutral LLM serving cookbook.

Related skills