AI Evals

Help the user create systematic evaluations for AI products using insights from AI practitioners.

How to Help

When the user asks for help with AI evals:

  1. Understand what they're evaluating - Ask what AI feature or model they're testing and what "good" looks like
  2. Help design the eval approach - Suggest rubrics, test cases, and measurement methods
  3. Guide implementation - Help them think through edge cases, scoring criteria, and iteration cycles
  4. Connect to product requirements - Ensure evals align with actual user needs, not just technical metrics
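As a rough illustration of steps 2 and 3, the sketch below pairs each test case with a rubric of pass/fail criteria and reports the fraction of criteria each output satisfies. Everything named here (EvalCase, run_evals, toy_support_bot, the refund scenario and its 30-day window) is a hypothetical stand-in, not part of any particular eval framework; a real setup would call the actual AI feature and would likely need richer graders than string checks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A rubric criterion: a human-readable label plus a predicate over the model output.
Criterion = Tuple[str, Callable[[str], bool]]


@dataclass
class EvalCase:
    name: str                # short identifier for the test case
    prompt: str              # input sent to the system under test
    rubric: List[Criterion]  # what "good" looks like for this input


def score_case(output: str, case: EvalCase) -> Dict[str, bool]:
    """Apply every rubric criterion to one output."""
    return {label: check(output) for label, check in case.rubric}


def run_evals(model: Callable[[str], str], cases: List[EvalCase]) -> Dict[str, float]:
    """Return, per case, the fraction of rubric criteria the output satisfied."""
    results = {}
    for case in cases:
        scores = score_case(model(case.prompt), case)
        results[case.name] = sum(scores.values()) / len(scores)
    return results


# Hypothetical stand-in for the real AI feature; replace with an actual call.
def toy_support_bot(prompt: str) -> str:
    if not prompt.strip():
        return "Could you tell me more about what you need?"
    return "You can request a refund within 30 days. Sorry for the trouble!"


# Illustrative cases: one typical request and one edge case (blank input).
CASES = [
    EvalCase(
        name="refund_question",
        prompt="How do I get a refund?",
        rubric=[
            ("mentions refund", lambda out: "refund" in out.lower()),
            ("states time window", lambda out: "30 days" in out),
            ("stays concise", lambda out: len(out.split()) <= 40),
        ],
    ),
    EvalCase(
        name="empty_input_edge_case",
        prompt="   ",
        rubric=[
            ("asks a clarifying question", lambda out: out.strip().endswith("?")),
            ("does not invent a policy", lambda out: "refund" not in out.lower()),
        ],
    ),
]

if __name__ == "__main__":
    for name, score in run_evals(toy_support_bot, CASES).items():
        print(f"{name}: {score:.0%}")
```

Keeping each criterion as a labeled predicate makes it easy to see which part of the rubric failed, which is usually what drives the next iteration cycle.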

Core Principles

Evals are the new PRD

Brendan Foody: "If the model is the product, then the eval is the product requirement document." Evals define what success looks like in AI products. They are core specifications, not optional quality checks.
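One way to read this principle in practice is to write requirements down as minimum scores on named eval metrics and check measured results against them. The sketch below assumes that framing; the Requirement type, the metric names, and the thresholds are all hypothetical placeholders, not values taken from any real product.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Requirement:
    metric: str      # name of an eval metric (hypothetical)
    minimum: float   # lowest acceptable score, in [0, 1]
    rationale: str   # the user need this requirement protects


def unmet_requirements(scores: Dict[str, float], spec: List[Requirement]) -> List[str]:
    """List every requirement the scores fail; a missing metric counts as a failure."""
    failures = []
    for req in spec:
        score = scores.get(req.metric)
        if score is None or score < req.minimum:
            failures.append(
                f"{req.metric}: got {score}, need >= {req.minimum} ({req.rationale})"
            )
    return failures


# Hypothetical spec for a support assistant; the numbers are placeholders.
SPEC = [
    Requirement("answer_accuracy", 0.90, "users act on the answer"),
    Requirement("refusal_on_unsafe", 0.99, "unsafe requests must be declined"),
]

if __name__ == "__main__":
    measured = {"answer_accuracy": 0.93, "refusal_on_unsafe": 0.97}
    for line in unmet_requirements(measured, SPEC):
        print(line)
```

Attaching a rationale to each threshold keeps the link to the user need visible, so a failing metric reads as an unmet requirement and not just a low number.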

First Seen
Mar 24, 2026