vllm-bench-serve — Online Benchmark Orchestrator

1. Scope & Boundaries

This skill does:

  • Execute single or batch vllm bench serve online benchmarks against running inference services
  • Aggregate and compare results across multiple test cases
  • Auto-optimize: search for optimal concurrency/throughput given SLO constraints
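The auto-optimize item above can be pictured as a search over concurrency levels. The sketch below is illustrative only: the skill's actual search strategy is not described in this section, and `find_max_concurrency`, `fake_measure`, and the latency numbers are hypothetical names and values introduced for the example. It assumes that latency does not decrease as concurrency rises, which is what makes a binary search valid. In practice the measurement callback would wrap a real vllm bench serve run and read the latency metric from its results.

```python
from typing import Callable, Optional


def find_max_concurrency(
    measure_p99_ms: Callable[[int], float],
    slo_p99_ms: float,
    low: int = 1,
    high: int = 256,
) -> Optional[int]:
    """Return the largest concurrency in [low, high] that meets the SLO.

    Returns None if even the lowest concurrency violates the SLO.
    Assumes measured latency is non-decreasing in concurrency.
    """
    best: Optional[int] = None
    while low <= high:
        mid = (low + high) // 2
        if measure_p99_ms(mid) <= slo_p99_ms:
            best = mid       # SLO met: remember it and try higher load
            low = mid + 1
        else:
            high = mid - 1   # SLO violated: back off
    return best


def fake_measure(concurrency: int) -> float:
    # Hypothetical stand-in for one benchmark run: a synthetic latency
    # model of 50 ms base plus 4 ms per concurrent request.
    return 50.0 + 4.0 * concurrency


if __name__ == "__main__":
    # With a 500 ms SLO, the synthetic model allows at most 112 requests.
    print(find_max_concurrency(fake_measure, slo_p99_ms=500.0))
```

Each probe costs one full benchmark run, so a binary search over 1 to 256 needs at most nine runs. Real measurements are noisy, so a production search would likely repeat runs or add a safety margin near the SLO boundary.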

This skill does NOT do (defer to other skills or decline):

  • Start/deploy vLLM services → use vllm-ascend-server
  • Offline batch inference throughput tests → use vllm-ascend
  • Profiling / tracing (torch profiler, perfetto, NPU profiling) → out of scope
  • Health checks only (just checking /v1/models) → use a simple curl; no skill needed
  • Download/clean/convert datasets without running benchmarks → out of scope
  • Analyze existing benchmark results without running new tests → out of scope
First Seen
Apr 18, 2026
vllm-bench-serve — ascend-ai-coding/awesome-ascend-skills