eval-harness

Eval Harness Skill

A formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles.

Philosophy

Eval-Driven Development treats evals as the "unit tests of AI development":

  • Define expected behavior BEFORE implementation
  • Run evals continuously during development
  • Track regressions with each change
  • Use pass@k metrics for reliability measurement
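The last two points can be made concrete with a small sketch. It is an illustration only, not the skill's own implementation: the function names `pass_at_k` and `find_regressions` and the result-dictionary shape are hypothetical, and pass@k is computed here with the commonly used unbiased estimator 1 - C(n-c, k) / C(n, k), which the skill itself may or may not use.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k attempts passes.

    n: total attempts recorded for one eval
    c: number of those attempts that passed
    k: number of attempts drawn (1 <= k <= n)

    Assumption: uses the unbiased estimator 1 - C(n-c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k failures exist, so any draw of k includes a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def find_regressions(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Return the names of evals that passed in baseline but do not pass now.

    Hypothetical result shape: eval name -> passed. An eval missing from
    the current run is treated as failing, so it is reported as a regression.
    """
    return sorted(
        name
        for name, passed in baseline.items()
        if passed and not current.get(name, False)
    )
```

For example, `pass_at_k(10, 5, 1)` returns `0.5`, and comparing a baseline run against a current run in which one previously passing eval now fails returns a list containing that eval's name.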

Eval Types

First seen: Mar 23, 2026
eval-harness — vibeeval/vibecosystem