AI Evaluation (Evals)
Build systematic evaluation frameworks for AI/LLM products to measure quality, catch regressions, and improve model performance.
When to Use
- Building a product with LLM/AI components
- Need to measure AI output quality systematically
- Comparing models or prompts (A/B testing)
- Detecting regressions before deployment
- Benchmarking against competitors
- Improving AI accuracy over time
- Explaining AI decisions to stakeholders
Core Concept
AI Evaluation (Evals) ≠ Traditional Testing
Traditional software: Deterministic (same input → same output), so tests can assert exact results
AI/LLM systems: Non-deterministic (same input → varying outputs), so evals score outputs against criteria and track aggregate metrics across a test set
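The contrast above can be sketched as a minimal eval harness: instead of asserting one exact output, we score each output with a scoring function and report an aggregate pass rate. This is a simplified sketch; the `model` function here is a hypothetical stand-in for a real LLM call, and `exact_match` is just one possible scorer (real evals often use fuzzier criteria or model-graded scoring).

```python
def model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; replace with your API client."""
    canned = {"capital of France?": "Paris", "2+2?": "4"}
    return canned.get(prompt, "I don't know")

def exact_match(output: str, expected: str) -> bool:
    """One possible scorer: case-insensitive exact match."""
    return output.strip().lower() == expected.strip().lower()

def run_eval(cases, scorer) -> float:
    """Score the model on every (prompt, expected) pair; return the pass rate."""
    results = [scorer(model(prompt), expected) for prompt, expected in cases]
    return sum(results) / len(results)

cases = [("capital of France?", "Paris"), ("2+2?", "4")]
print(f"pass rate: {run_eval(cases, exact_match):.0%}")
```

Because the metric is an aggregate over a test set rather than a single assertion, the same harness can compare prompts or models (run it twice and diff the pass rates) and catch regressions before deployment.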