vllm-bench-random-synthetic


vLLM Benchmark with Random Synthetic Data

Run a quick performance benchmark on a vLLM server using synthetic random data. This skill measures core serving metrics including request throughput, token throughput, TTFT (Time to First Token), TPOT (Time per Output Token), and inter-token latency.
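To make the latency metrics above concrete, here is a minimal sketch of how TTFT, TPOT, and inter-token latency could be derived from per-token arrival timestamps for a single request. The function name `latency_metrics` and its inputs are hypothetical and exist only for illustration. The benchmark tool itself may compute these values differently, for example per streamed chunk rather than per token.

```python
from statistics import mean


def latency_metrics(request_start, token_times):
    """Illustrative latency metrics for one request (all times in seconds).

    request_start: time the request was sent (hypothetical input).
    token_times: arrival time of each output token, in order (hypothetical input).
    """
    if not token_times:
        raise ValueError("need at least one output token timestamp")

    # TTFT: delay between sending the request and receiving the first token.
    ttft = token_times[0] - request_start
    # End-to-end latency: delay until the last token arrives.
    e2e = token_times[-1] - request_start
    # Inter-token latencies: gaps between consecutive token arrivals.
    itls = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    # TPOT: decode time averaged over the tokens after the first one.
    n = len(token_times)
    tpot = (e2e - ttft) / (n - 1) if n > 1 else 0.0

    return {
        "ttft": ttft,
        "tpot": tpot,
        "itl_mean": mean(itls) if itls else 0.0,
        "e2e": e2e,
    }


metrics = latency_metrics(0.0, [0.5, 0.75, 1.0, 1.25])
print(metrics)
```

In this sketch every timestamp corresponds to exactly one token, so the mean inter-token latency and TPOT coincide. They can diverge when a streamed chunk carries more than one token.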

When to use

  • User wants to quickly benchmark vLLM serving performance
  • User wants to measure throughput and latency metrics without downloading datasets
  • User wants to test a vLLM deployment with a synthetic workload
  • User wants baseline performance numbers for a specific model

Prerequisites

  • vLLM must be installed (pip install vllm)
  • A vLLM server must be running, or must be started as part of the benchmark
  • For GPU-backed models, an NVIDIA GPU with appropriate drivers must be available
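The prerequisites above can be checked with a short script before benchmarking. This is a rough sketch, not part of the skill: the function name `check_prerequisites` is hypothetical, and it assumes that finding `nvidia-smi` on the PATH is a reasonable proxy for a working NVIDIA driver. It does not check whether a vLLM server is actually running.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Return a dict of basic environment checks (illustrative only)."""
    return {
        # True if the vllm package can be located in this Python environment.
        "vllm_installed": importlib.util.find_spec("vllm") is not None,
        # True if the nvidia-smi executable is on PATH (assumed driver proxy).
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


status = check_prerequisites()
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```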

Quick Start
