Accuracy + Performance Test

Start vLLM serve with the target model, then run accuracy benchmarks (only when FlagEval is available) and performance benchmarks (vllm bench serve) across multiple profiles.

Skill Components

perf-test/
├── SKILL.md                            # This file — execution flow
├── scripts/
│   ├── run_benchmark.py                # Run single benchmark profile (JSON output)
│   └── run_all_benchmarks.py           # Run all 5 profiles, collect + summarize (JSON)
└── references/
    └── benchmark-profiles.md           # Profile definitions, metrics, vllm bench usage
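
As a rough illustration of what running a single profile might involve, the sketch below builds a vllm bench serve command line from a profile description. It is not the contents of run_benchmark.py. The Profile class, its fields, the example values, and the default base URL are hypothetical, and the flags shown are assumptions that should be checked against vllm bench serve --help for the installed vLLM version.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical benchmark profile; real definitions live in benchmark-profiles.md."""
    name: str
    input_len: int
    output_len: int
    num_prompts: int


def build_bench_command(
    model: str,
    profile: Profile,
    base_url: str = "http://127.0.0.1:8000",  # assumed local server address
) -> list[str]:
    """Return an argv list for one benchmark run (flags are assumptions)."""
    return [
        "vllm", "bench", "serve",
        "--model", model,
        "--base-url", base_url,
        "--dataset-name", "random",
        "--random-input-len", str(profile.input_len),
        "--random-output-len", str(profile.output_len),
        "--num-prompts", str(profile.num_prompts),
    ]


if __name__ == "__main__":
    # Example values are illustrative only, not one of the skill's actual profiles.
    example = Profile(name="example", input_len=128, output_len=128, num_prompts=10)
    print(" ".join(build_bench_command("my-model", example)))
```

The resulting list can be passed to subprocess.run once the server is up. Building the command as a list rather than a shell string avoids quoting problems with model names or paths.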

Reused from env-verify:

env-verify/scripts/test_serve_mode.py: can be used to verify that the server is healthy before benchmarks start
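
For readers without access to that script, a health check before benchmarking can be sketched as a simple polling loop. This is a minimal sketch and not the logic of test_serve_mode.py. It assumes the server exposes an HTTP health endpoint that returns status 200 when ready (for example /health on the vLLM OpenAI-compatible server); the URL, attempt count, and delay are hypothetical values.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_health(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a non-2xx response: not ready yet.
        return False


def wait_until_healthy(
    url: str,
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], bool] = probe_health,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the health URL until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next attempt
    return False


if __name__ == "__main__":
    # Assumed address; adjust to wherever the server is listening.
    ready = wait_until_healthy("http://127.0.0.1:8000/health", attempts=3, delay=1.0)
    print("server ready" if ready else "server not ready")
```

The probe and sleep functions are passed in as parameters so the loop can be exercised without a running server.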
