judge-verification
Judge Verification
Overview
Judge verification is an independent LLM evaluation layer that verifies whether a task was genuinely completed. The judge reviews the original task goal, the sequence of actions taken, the final state of relevant artifacts, and the claimed completion evidence, then produces a PASS/FAIL verdict with a confidence score and actionable reasoning.
This skill is distinct from verification-before-completion: that skill runs checklist gates
within the same agent context. Judge-verification uses a fresh, independent perspective
with no access to the executing agent's prior reasoning, catching hallucinated success claims.
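The judge's inputs and outputs described above can be sketched as plain types. This is a minimal illustration, not the skill's actual schema: the names `JudgeInput`, `JudgeResult`, `buildJudgePrompt`, and `parseJudgeResult` are hypothetical, as are the field names and the assumption that confidence is a number between 0 and 1.

```typescript
// Hypothetical shapes; field names are illustrative, not the skill's real schema.
type Verdict = "PASS" | "FAIL";

interface JudgeInput {
  taskGoal: string;                       // the original task goal
  actions: string[];                      // the sequence of actions taken
  finalArtifacts: Record<string, string>; // final state of relevant artifacts
  claimedEvidence: string;                // the executor's completion claim
}

interface JudgeResult {
  verdict: Verdict;
  confidence: number; // assumed range: 0 to 1
  reasoning: string;
}

// Build the judge's prompt from the input only. The executing agent's prior
// reasoning is deliberately not a parameter, so it cannot leak into the judge.
function buildJudgePrompt(input: JudgeInput): string {
  const actions = input.actions
    .map((action, index) => `${index + 1}. ${action}`)
    .join("\n");
  const artifacts = Object.entries(input.finalArtifacts)
    .map(([path, content]) => `--- ${path} ---\n${content}`)
    .join("\n");
  return [
    `TASK GOAL:\n${input.taskGoal}`,
    `ACTIONS TAKEN:\n${actions}`,
    `FINAL ARTIFACTS:\n${artifacts}`,
    `CLAIMED EVIDENCE:\n${input.claimedEvidence}`,
    'Respond with JSON: {"verdict": "PASS" | "FAIL", "confidence": number, "reasoning": string}',
  ].join("\n\n");
}

// Validate the judge's raw reply; a malformed reply throws instead of
// being silently treated as a verdict.
function parseJudgeResult(raw: string): JudgeResult {
  const data = JSON.parse(raw) as Partial<JudgeResult>;
  if (data.verdict !== "PASS" && data.verdict !== "FAIL") {
    throw new Error("verdict must be PASS or FAIL");
  }
  if (typeof data.confidence !== "number" || data.confidence < 0 || data.confidence > 1) {
    throw new Error("confidence must be a number between 0 and 1");
  }
  if (typeof data.reasoning !== "string" || data.reasoning.length === 0) {
    throw new Error("reasoning must be a non-empty string");
  }
  return { verdict: data.verdict, confidence: data.confidence, reasoning: data.reasoning };
}
```

Keeping the executor's reasoning out of the function signature, rather than filtering it later, is one way to make the "fresh, independent perspective" hard to violate by accident.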
When to Use
Skill({ skill: 'judge-verification' });