testing-agents-with-subagents
Testing Agents With Subagents
Overview
This skill applies TDD methodology to agent development — RED (observe failures), GREEN (fix agent definition), REFACTOR (edge cases and robustness) — with subagent dispatch as the execution mechanism.
Test what the agent DOES, not what the prompt SAYS. Evidence-based verification only: capture exact outputs from subagent dispatch, verify every prompt change through testing. Always test via the Task tool, always test via the Task tool rather than reading prompts.
Minimum test counts vary by agent type: Reviewer agents need 6 cases (2 real issues, 2 clean, 1 edge, 1 ambiguous), Implementation agents 5 cases (2 typical, 1 complex, 1 minimal, 1 error), Analysis agents 4 cases (2 standard, 1 edge, 1 malformed), Routing/orchestration 4 cases (2 correct route, 1 ambiguous, 1 invalid). No agent is simple enough to skip testing — get human confirmation before exempting any agent.
Each test runs in a fresh subagent to avoid context pollution. After any fix, re-run ALL test cases to catch regressions. One fix at a time — you cannot determine what changed the outcome with multiple simultaneous fixes.
Instructions
Phase 0: PREPARE — Understand the Agent
Goal: Read the agent definition and understand what it claims to do before writing tests.
More from notque/claude-code-toolkit
generate-claudemd
Generate project-specific CLAUDE.md from repo analysis.
12fish-shell-config
Fish shell configuration and PATH management.
12pptx-generator
PPTX presentation generation with visual QA: slides, pitch decks.
12codebase-overview
Systematic codebase exploration and architecture mapping.
10image-to-video
FFmpeg-based video creation from image and audio.
9data-analysis
Decision-first data analysis with statistical rigor gates.
9