eval-creator


Eval Creator

Turns promoted learnings into permanent eval cases. Runs regression checks to verify promoted rules hold. This is the outer loop's regress-test step.

The blog says: "If a failure taught you something important, it should become a permanent test case. Otherwise the knowledge is still fragile."
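As a minimal sketch of that idea, a promoted learning can be captured as a small record with a check, and a regression run is a loop over those records. The names here (`EvalCase`, `run_regression`, `stub_agent`, the `run-tests-before-commit` rule) are hypothetical illustrations, not the skill's actual format or API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class EvalCase:
    """Hypothetical shape for one permanent eval case."""
    rule_id: str                    # identifier of the promoted rule (assumed naming)
    prompt: str                     # input that originally exposed the failure
    check: Callable[[str], bool]    # True when the output obeys the rule


def run_regression(cases: Iterable[EvalCase], agent: Callable[[str], str]) -> List[str]:
    """Run every case through `agent` and return the rule_ids that failed."""
    return [case.rule_id for case in cases if not case.check(agent(case.prompt))]


# A learning such as "run the tests before committing" becomes a permanent case.
cases = [
    EvalCase(
        rule_id="run-tests-before-commit",
        prompt="Commit the fix for the parser bug",
        check=lambda output: "pytest" in output,
    ),
]


def stub_agent(prompt: str) -> str:
    # Stand-in for the real agent under test; returns a fixed plan.
    return "pytest && git commit -m 'fix parser bug'"


failures = run_regression(cases, stub_agent)
```

An empty `failures` list means every promoted rule still holds; any `rule_id` in the list points at the rule that regressed.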

When to Use

  • After harness-updater promotes a pattern: create an eval for it
  • On cadence: run all evals to check for regressions
  • Before major releases: verify the harness is holding
  • When a promoted rule seems to have stopped working: diagnose with a targeted eval run
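The four triggers above fall into two kinds of run: a full sweep over every eval, or a targeted run for a single rule. The sketch below shows one way to express that mapping; `Trigger` and `select_evals` are hypothetical names chosen for illustration, not part of the skill itself.

```python
from enum import Enum
from typing import List, Optional, Sequence


class Trigger(Enum):
    PROMOTION = "promotion"                        # harness-updater promoted a pattern
    CADENCE = "cadence"                            # scheduled regression sweep
    RELEASE = "release"                            # pre-release verification
    SUSPECTED_REGRESSION = "suspected_regression"  # a rule seems to have stopped working


def select_evals(
    trigger: Trigger,
    all_rule_ids: Sequence[str],
    rule_id: Optional[str] = None,
) -> List[str]:
    """Return the rule_ids whose evals should run for this trigger."""
    if trigger in (Trigger.CADENCE, Trigger.RELEASE):
        # Full sweep: every existing eval is run.
        return list(all_rule_ids)
    if rule_id is None:
        raise ValueError("a targeted run needs the rule_id of the rule in question")
    # Promotion and suspected regression both target a single rule.
    return [rule_id]
```

For example, `select_evals(Trigger.CADENCE, ["a", "b"])` selects every eval, while `select_evals(Trigger.SUSPECTED_REGRESSION, ["a", "b"], rule_id="b")` selects only the eval for rule `b`.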

Eval Directory Structure

eval-creator — pskoett/pskoett-skills