llm-evaluation
LLM Evaluation
Master comprehensive evaluation strategies for LLM applications, from automated metrics to human evaluation and A/B testing.
When to Use This Skill
- Measuring LLM application performance systematically
- Comparing different models or prompts
- Detecting performance regressions before deployment
- Validating improvements from prompt changes
- Building confidence in production systems
- Establishing baselines and tracking progress over time
- Debugging unexpected model behavior
Core Evaluation Types
1. Automated Metrics
Fast, repeatable, scalable evaluation using computed scores.
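As a rough illustration of what a computed score can look like, the sketch below implements two common reference-based metrics, exact match and token-level F1, in plain Python. The function names (`normalize`, `exact_match`, `token_f1`, `evaluate`) and the whitespace-based normalization are assumptions made for this example, not part of any particular evaluation library.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately simple normalization)."""
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized token sequences are identical, else 0.0."""
    return 1.0 if normalize(prediction) == normalize(reference) else 0.0


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall against the reference."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return 1.0 if pred_tokens == ref_tokens else 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sequences.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(pairs: list[tuple[str, str]]) -> dict[str, float]:
    """Average both metrics over (prediction, reference) pairs."""
    if not pairs:
        return {"exact_match": 0.0, "token_f1": 0.0}
    n = len(pairs)
    return {
        "exact_match": sum(exact_match(p, r) for p, r in pairs) / n,
        "token_f1": sum(token_f1(p, r) for p, r in pairs) / n,
    }
```

Calling `evaluate` on a fixed set of (prediction, reference) pairs gives the same numbers on every run, which is what makes scores like these suitable for tracking baselines and catching regressions. A real pipeline would likely need stronger normalization (punctuation, articles) than this sketch applies.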