Promptfoo Evaluation

Overview

This skill provides guidance for configuring and running LLM evaluations using Promptfoo, an open-source CLI tool for testing and comparing LLM outputs.
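A Promptfoo evaluation is driven by a `promptfooconfig.yaml` file that lists prompts, providers, and test cases. A minimal sketch (the model IDs and test content here are illustrative, not prescribed by this skill):

```yaml
# promptfooconfig.yaml -- minimal sketch; provider/model IDs are examples
# and should match providers you actually have credentials for.
prompts:
  - "Summarize in one sentence: {{text}}"

providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-sonnet-20241022

tests:
  - vars:
      text: "Promptfoo is an open-source CLI for testing LLM outputs."
    assert:
      - type: icontains
        value: "promptfoo"
      - type: llm-rubric
        value: "The summary is a single accurate sentence."
```

Each test case is run against every provider, so the same assertions double as a model comparison.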

When to Use

  • Validating prompt quality, rubric alignment, or regression behavior across different LLM providers.
  • Automating model comparisons for bug bounties, research, or QA before releasing prompts into production.
  • Writing custom Python assertions or llm-rubric graders that score model outputs automatically.
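For the last case, a Python assertion is a file exposing a `get_assert(output, context)` function that Promptfoo calls for each output; returning a bool marks the case pass/fail (a float is treated as a 0-1 score). A minimal sketch, with a hypothetical "no apologies" rule:

```python
# assert_no_apology.py -- a minimal sketch of a Promptfoo Python assertion.
# Promptfoo imports this file and calls get_assert(output, context); a bool
# return value is pass/fail, a float is treated as a 0-1 score.

def get_assert(output: str, context) -> bool:
    # Fail the test case if the model output contains an apology.
    return "sorry" not in output.lower()
```

Reference it from a test case with `- type: python` and `value: file://assert_no_apology.py`.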

When NOT to Use

  • Ad-hoc prompt testing that does not need structured test cases or automation.
  • Non-LLM evaluation work such as standard unit tests or infrastructure monitoring.
  • Requests for written guidance only, with no CLI-based evaluations to run.

Quick Start
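The typical loop is scaffold, evaluate, inspect. A sketch using the Promptfoo CLI via npx (install globally with `npm install -g promptfoo` if you prefer):

```shell
# Scaffold a starter promptfooconfig.yaml in the current directory
npx promptfoo@latest init

# Run every test case against every configured provider
npx promptfoo@latest eval

# Open the local web viewer to compare results side by side
npx promptfoo@latest view
```

`eval` exits non-zero when assertions fail, which makes it straightforward to wire into CI.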
