eval-creator
Eval Creator
Turns promoted learnings into permanent eval cases, then runs regression checks to verify that the promoted rules still hold. This is the outer loop's regress-test step.
The blog says: "If a failure taught you something important, it should become a permanent test case. Otherwise the knowledge is still fragile."
When to Use
- After harness-updater promotes a pattern: create an eval for it
- On a regular cadence: run all evals to check for regressions
- Before major releases: verify that the harness is holding
- When a promoted rule seems to have stopped working: diagnose with a targeted eval run
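The cases above map onto two operations: a full run over every eval, and a targeted run for a single rule. The following is a minimal sketch of that idea, not this skill's actual implementation. `EvalCase`, `run_evals`, `stub_agent`, and the rule ids are all hypothetical names invented for illustration, and the real eval format used by eval-creator may differ.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class EvalCase:
    """One permanent test derived from a promoted rule (hypothetical shape)."""
    rule_id: str                  # identifier of the promoted rule under test
    prompt: str                   # input that previously triggered the failure
    check: Callable[[str], bool]  # True when the output obeys the rule


def run_evals(
    cases: Iterable[EvalCase],
    agent: Callable[[str], str],
    rule_id: Optional[str] = None,
) -> List[str]:
    """Run every case, or only the cases for rule_id, and return the
    rule ids whose checks failed (that is, the regressions)."""
    failures = []
    for case in cases:
        if rule_id is not None and case.rule_id != rule_id:
            continue  # targeted run: skip unrelated rules
        if not case.check(agent(case.prompt)):
            failures.append(case.rule_id)
    return failures


def stub_agent(prompt: str) -> str:
    """Stand-in for the real agent, so the sketch runs on its own."""
    if "commit" in prompt:
        return "I ran the tests before committing."
    return "Done."


# Two illustrative cases; the rules themselves are made up.
cases = [
    EvalCase("run-tests-before-commit", "Please commit this change.",
             lambda out: "tests" in out),
    EvalCase("cite-sources", "Summarize the report.",
             lambda out: "source" in out.lower()),
]

# Cadence or pre-release check: run everything.
all_failures = run_evals(cases, stub_agent)

# Diagnosis: run only the evals for the rule that seems to have stopped working.
targeted = run_evals(cases, stub_agent, rule_id="run-tests-before-commit")
```

Keeping the rule id on each case is what makes the targeted run cheap: a failure points straight at the promoted rule that regressed.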