Agent Self-Evaluation

After completing a complex task, the agent pauses to rate its own output against a structured 5-axis rubric. This is NOT a pass/fail gate — it's a deliberate reflection step that catches omissions, flags overconfidence, and surface areas for improvement before the user has to.

When to Activate

After writing code that spans 3+ files or 50+ lines
After completing a multi-step workflow (implement → test → review)
After a debugging session that involved 3+ attempts
After producing a design document, architecture decision, or written analysis
When the user asks "how good was that?" or "rate yourself"
At the end of any session Stop hook (if configured — see references/hook-integration.md)

agent-self-evaluation

Agent Self-Evaluation

When to Activate

Core Concepts

The 5 Evaluation Axes