rag-perf
Installation
SKILL.md
RAG-Perf — config-driven perf benchmark CLI
Purpose
Drive a deployed NVIDIA RAG Blueprint server with a YAML config, run a server-side profiling pass (per-stage timing, citation quality, bottleneck inference) and an optional aiperf load test (TTFT / E2E / token & request throughput / error rate), and write a unified report. The CLI is intentionally minimal: rag-perf -c <config> plus --help / --version. Behaviour is fully config-driven; field variations belong in YAML.
Scope
- Accuracy / RAGAS scoring of answer quality → use the rag-eval skill.
- Deploying, repairing, or configuring services (compose, helm, NIM env vars) → use the rag-blueprint skill.
- Production monitoring / alerting — rag-perf is a one-shot benchmark tool.
- Runtime requirement: a deployed RAG server reachable on the network.