Judge

Overview

Evaluate agent task outputs against a three-dimension rubric and produce structured verdict records. The judge operates as a quality gate at the task-completion boundary, scoring each output on Semantic accuracy, Pragmatic usefulness, and Syntactic consistency.

Rubric: Three KLS Dimensions

The rubric reuses three dimensions from the KLS (Krogstie-Lindland-Sindre) quality framework defined in disciplined-quality-evaluation:

Dimension	Question	Criteria
Semantic	Does it accurately represent the domain?	Factual correctness, domain terminology, no contradictions
Pragmatic	Does it enable the intended decisions/actions?	Actionable, useful, addresses the task goal
Syntactic	Is it internally consistent and well-structured?	Format compliance, structural completeness, no broken references
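
The rubric above could be captured in a verdict record along the following lines. This is a minimal sketch, not the judge's actual schema: the names `Dimension`, `DimensionScore`, `Verdict`, and `build_verdict`, the 1-5 score scale, and the pass threshold of 3 are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    """The three rubric dimensions, in the order the table lists them."""
    SEMANTIC = "semantic"
    PRAGMATIC = "pragmatic"
    SYNTACTIC = "syntactic"


# Assumed scale and gate threshold; the source does not specify either.
MIN_SCORE, MAX_SCORE = 1, 5
PASS_THRESHOLD = 3


@dataclass(frozen=True)
class DimensionScore:
    dimension: Dimension
    score: int       # assumed integer in [MIN_SCORE, MAX_SCORE]
    rationale: str   # short justification tied to the rubric criteria


@dataclass(frozen=True)
class Verdict:
    task_id: str
    scores: tuple[DimensionScore, ...]
    passed: bool

    def to_dict(self) -> dict:
        """Flatten the verdict into plain types for logging or storage."""
        return {
            "task_id": self.task_id,
            "passed": self.passed,
            "scores": {
                s.dimension.value: {"score": s.score, "rationale": s.rationale}
                for s in self.scores
            },
        }


def build_verdict(task_id: str, raw: dict[Dimension, tuple[int, str]]) -> Verdict:
    """Validate one score per dimension and apply the (assumed) gate rule."""
    missing = [d.value for d in Dimension if d not in raw]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    scores = []
    for dim in Dimension:
        score, rationale = raw[dim]
        if not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(f"{dim.value} score {score} is out of range")
        scores.append(DimensionScore(dim, score, rationale))
    # Assumed gate: every dimension must meet the threshold to pass.
    passed = all(s.score >= PASS_THRESHOLD for s in scores)
    return Verdict(task_id, tuple(scores), passed)
```

Requiring every dimension to clear the threshold, rather than averaging, means a weak score on one dimension cannot be masked by strong scores on the others. Whether the real judge gates this way is not stated in the source.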