prompt-evaluation-claude-code
Prompt Evaluation — Claude Code
This skill is a router and workflow. It teaches how to run a
prompt-evaluation loop using Claude Code's own subagent capability
— no external API calls, no Python, no promptfoo. Read the
references on demand.
The core idea
Claude Code's Agent/Task tool spawns a subagent with a fresh
context window. The subagent sees only the prompt you pass it; it
inherits nothing from the main conversation. When the subagent
finishes, it returns a single message to the caller and its working
context is discarded.
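The isolation described above can be pictured with a small toy model. This is only a mental model, written in Python for illustration: `run_subagent` and the list-based "context" are hypothetical names invented here, not Claude Code's API, and the workflow itself needs no code.

```python
def run_subagent(prompt):
    """Hypothetical stand-in for one subagent call (not a real API)."""
    # Fresh context: the subagent starts with the prompt and nothing else.
    context = [prompt]
    visible_at_start = len(context)
    # Intermediate work accumulates only inside the subagent's own context.
    context.append("scratch: intermediate reasoning and tool output")
    # A single final message goes back to the caller; the local `context`
    # is dropped when the function returns.
    return f"done (started with {visible_at_start} message in context)"


# The main conversation has history that the subagent never sees.
main_conversation = ["earlier discussion", "draft prompt v1"]
reply = run_subagent("Evaluate this prompt: ...")
main_conversation.append(reply)
```

In this sketch the caller ends up with exactly one new message, and nothing from the subagent's scratch work leaks back into `main_conversation`.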
For prompt evaluation, this gives you three properties for free that are otherwise hard to engineer: