qa-agent-testing
QA Agent Testing (Jan 2026)
Design and run reliable evaluation suites for LLM agents/personas, including tool-using and multi-agent systems.
Default QA Workflow
- Define the Persona Under Test (PUT): scope, out-of-scope, and safety boundaries.
- Define 10 representative tasks (Must Ace).
- Define 5 refusal edge cases (Must Decline + redirect).
- Define an output contract (format, tone, structure, citations).
- Run the suite with determinism controls and tool tracing.
- Score with the 6-dimension rubric; track variance across reruns.
- Log baselines and regressions; gate merges/deploys on thresholds.
Use the copy-paste templates in assets/ for day-0 setup.
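The scoring and gating steps of the workflow above can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: the dimension names (`dim_1` to `dim_6`), the `gate` function, the `GateResult` type, and the threshold parameters are all hypothetical, and the rubric's real dimension names and scale should come from the templates in assets/. Variance across reruns is approximated here with the population standard deviation of per-run totals, which is one possible choice among several.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

# Hypothetical placeholder names: the workflow specifies a 6-dimension rubric
# but this section does not list the dimensions themselves.
DIMENSIONS = ("dim_1", "dim_2", "dim_3", "dim_4", "dim_5", "dim_6")


@dataclass
class GateResult:
    mean_score: float  # average rubric total across reruns
    spread: float      # population standard deviation of the totals
    passed: bool       # True when both thresholds are met


def run_total(scores: dict[str, float]) -> float:
    """Sum one run's per-dimension scores, requiring all six dimensions."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing rubric dimensions: {missing}")
    return sum(scores[d] for d in DIMENSIONS)


def gate(runs: list[dict[str, float]], min_mean: float, max_spread: float) -> GateResult:
    """Aggregate reruns of the same suite and apply pass/fail thresholds."""
    if not runs:
        raise ValueError("at least one run is required")
    totals = [run_total(r) for r in runs]
    avg = mean(totals)
    spread = pstdev(totals)  # 0.0 when there is only a single run
    return GateResult(avg, spread, avg >= min_mean and spread <= max_spread)


# Example: three identical reruns, assuming a made-up 0-3 scale per dimension.
stable_runs = [{d: 3.0 for d in DIMENSIONS} for _ in range(3)]
print(gate(stable_runs, min_mean=15.0, max_spread=1.0).passed)
```

In a CI job, a failing `passed` value would typically translate into a non-zero exit code so that the merge or deploy is blocked; the baseline to compare against would be whatever was logged from the previous accepted run.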