ai-observability-promptfoo


Promptfoo Patterns

Quick Guide: Use promptfoo for systematic LLM evaluation. Define prompts, providers, and test cases in promptfooconfig.yaml. Validate outputs with assertion types (contains, is-json, llm-rubric, similar, cost, latency). Run evaluations with promptfoo eval (exits with code 100 on test failures) and open the results UI with promptfoo view. Use model-graded assertions (llm-rubric, factuality) for subjective quality, and promptfoo redteam run for security scanning. Share results with the --share flag or promptfoo share. All provider API keys come from environment variables -- never hardcode them.
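A minimal promptfooconfig.yaml sketch tying these pieces together. The prompt text, model ID, and test inputs below are illustrative placeholders, not part of this skill:

```yaml
# promptfooconfig.yaml -- minimal sketch (placeholder prompt and values)
prompts:
  - "Summarize the following text in one sentence: {{text}}"

providers:
  - openai:gpt-4o-mini  # API key is read from the environment, never hardcoded

tests:
  - vars:
      text: "Promptfoo runs the same prompt against multiple providers."
    assert:
      - type: contains
        value: "Promptfoo"
      - type: latency
        threshold: 5000  # milliseconds
```

Run it with `promptfoo eval`; a non-zero exit code (100) on assertion failures makes it easy to gate CI pipelines on evaluation results.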


<critical_requirements>

CRITICAL: Before Using This Skill

All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering, import type, named constants)

(You MUST define test cases with explicit assert arrays -- tests without assertions only capture output without validating it)

(You MUST use llm-rubric for subjective quality evaluation -- do NOT rely solely on deterministic assertions for natural language output)

(You MUST set threshold on similarity and model-graded assertions -- omitting thresholds uses defaults that may not match your quality bar)

(You MUST use environment variables for all API keys -- never hardcode keys in promptfooconfig.yaml or provider configs)
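The requirements above can be sketched in a single hypothetical test case: an explicit assert array, a model-graded llm-rubric for subjective quality, and an explicit threshold on the similarity check. The question, rubric text, and reference answer are invented for illustration:

```yaml
# Hypothetical test case sketch -- values are illustrative only
tests:
  - vars:
      question: "What does the return policy cover?"
    assert:
      # Model-graded assertion for subjective quality
      - type: llm-rubric
        value: "Answers the question accurately and in a helpful tone"
      # Semantic similarity with an explicit threshold, not the default
      - type: similar
        value: "Returns are accepted within 30 days with a receipt."
        threshold: 0.8
```

Without the explicit `threshold`, the similarity check falls back to a default that may not match your quality bar; without the `assert` array, the test would only capture output without validating it.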
