vllm-bench-serve

vllm-bench-serve — Online Benchmark Orchestrator

1. Scope & Boundaries

This skill does:

- Run single or batch vllm bench serve online benchmarks against inference services that are already running
- Aggregate and compare results across multiple test cases
- Auto-optimize: search for the highest concurrency or throughput that still satisfies the given SLO constraints
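
The auto-optimize step can be pictured as a search over concurrency levels. The sketch below is only an illustration, not the skill's actual implementation: `find_max_concurrency`, `measure_p99_latency_ms`, and `fake_measure` are hypothetical names, and the search assumes latency does not decrease as concurrency grows, which real services only approximate. In practice the measurement callable would wrap one benchmark run per concurrency level.

```python
from typing import Callable, Optional


def find_max_concurrency(
    measure_p99_latency_ms: Callable[[int], float],
    slo_ms: float,
    low: int = 1,
    high: int = 256,
) -> Optional[int]:
    """Binary-search the largest concurrency in [low, high] that meets the SLO.

    Assumption: latency is non-decreasing in concurrency. Returns None when
    even the lowest concurrency violates the SLO.
    """
    best: Optional[int] = None
    while low <= high:
        mid = (low + high) // 2
        if measure_p99_latency_ms(mid) <= slo_ms:
            best = mid        # mid meets the SLO, so try a higher level
            low = mid + 1
        else:
            high = mid - 1    # mid violates the SLO, so try a lower level
    return best


def fake_measure(concurrency: int) -> float:
    """Hypothetical stand-in for a real benchmark run: a linear latency model."""
    return 50.0 + 10.0 * concurrency


if __name__ == "__main__":
    # With the linear model above, 50 + 10 * c <= 500 holds up to c = 45.
    print(find_max_concurrency(fake_measure, slo_ms=500.0))  # prints 45
```

Binary search keeps the number of benchmark runs logarithmic in the search range, which matters because each measurement is a full benchmark run. If measurements are noisy, repeating each run or adding a safety margin to the SLO would be a reasonable extension.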

This skill does NOT do (defer to other skills or decline):

- Start/deploy vLLM services → use vllm-ascend-server
- Offline batch inference throughput tests → use vllm-ascend
- Profiling / tracing (torch profiler, perfetto, NPU profiling) → out of scope
- Health checks only (just checking /v1/models) → simple curl, no skill needed
- Download/clean/convert datasets without running benchmarks → out of scope
- Analyze existing benchmark results without running new tests → out of scope
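
For the health-check case, a plain request to /v1/models is enough. As a rough equivalent of the curl call, the sketch below uses only the Python standard library. It assumes an OpenAI-compatible server whose /v1/models response carries a `data` list of objects with an `id` field; the base URL `http://localhost:8000` and the function names are illustrative assumptions, not part of the skill.

```python
import json
import urllib.request


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_served_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Query <base_url>/v1/models and return the ids of the served models.

    Raises urllib.error.URLError if the service is unreachable.
    """
    url = f"{base_url.rstrip('/')}/v1/models"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


if __name__ == "__main__":
    # Assumed address; adjust to wherever the service actually listens.
    print(list_served_models("http://localhost:8000"))
```

A non-empty list means the service is up and serving at least one model; anything beyond that, such as latency or throughput, is where an actual benchmark run comes in.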

Repository

ascend-ai-codin…d-skills

First Seen

Apr 18, 2026

vllm-bench-serve — ascend-ai-coding/awesome-ascend-skills