phoenix-evals


Phoenix Evals

Build evaluators for AI/LLM applications. Start with code-based checks, use an LLM judge only where nuance is needed, and validate every evaluator against human labels.
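As a rough illustration of the "code first, LLM for nuance" ordering, the sketch below runs a cheap deterministic check and only calls a judge when that check passes. It is plain Python and does not use the Phoenix API; `code_check`, `evaluate`, and the `judge` callable are hypothetical names introduced for this example, and the result keys are an assumption rather than a Phoenix schema.

```python
from typing import Callable, Optional

# Hypothetical judge signature: (question, answer) -> "pass" or "fail".
# In practice this would wrap an LLM call; here it is just a callable.
Judge = Callable[[str, str], str]


def code_check(answer: str, required_terms: list[str]) -> Optional[str]:
    """Deterministic check. Returns a failure reason, or None if the answer passes."""
    if not answer.strip():
        return "empty answer"
    lowered = answer.lower()
    missing = [t for t in required_terms if t.lower() not in lowered]
    if missing:
        return "missing terms: " + ", ".join(missing)
    return None


def evaluate(
    question: str,
    answer: str,
    required_terms: list[str],
    judge: Optional[Judge] = None,
) -> dict:
    """Run the code check first; consult the judge only if the code check passes."""
    reason = code_check(answer, required_terms)
    if reason is not None:
        # Failing fast here avoids spending an LLM call on an obviously bad answer.
        return {"label": "fail", "source": "code", "explanation": reason}
    if judge is None:
        return {"label": "pass", "source": "code", "explanation": "passed code checks"}
    return {"label": judge(question, answer), "source": "llm", "explanation": "judged by LLM"}
```

Keeping the judge as an injected callable means the code path can be unit-tested without any model access, and the judge can be swapped once a model has been chosen.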

Quick Reference

| Task | Files |
| --- | --- |
| Setup | setup-python, setup-typescript |
| Decide what to evaluate | evaluators-overview |
| Choose a judge model | fundamentals-model-selection |
| Use pre-built evaluators | evaluators-pre-built |
| Build code evaluator | evaluators-code-python, evaluators-code-typescript |
| Build LLM evaluator | evaluators-llm-python, evaluators-llm-typescript, evaluators-custom-templates |
| Batch evaluate DataFrame | evaluate-dataframe-python |
| Run experiment | experiments-running-python, experiments-running-typescript |
| Create dataset | experiments-datasets-python, experiments-datasets-typescript |
| Generate synthetic data | experiments-synthetic-python, experiments-synthetic-typescript |
| Validate evaluator accuracy | validation, validation-evaluators-python, validation-evaluators-typescript |
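For the "validate evaluator accuracy" task, one common way to compare evaluator output with human labels is raw agreement plus Cohen's kappa, which corrects for agreement expected by chance. The sketch below is a minimal plain-Python version under that assumption; it is not taken from the validation files listed above, and `agreement_report` is a hypothetical helper name.

```python
from collections import Counter


def agreement_report(evaluator_labels: list[str], human_labels: list[str]) -> dict:
    """Compare evaluator labels with human labels on the same examples."""
    if not human_labels or len(evaluator_labels) != len(human_labels):
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(human_labels)
    # Observed agreement: fraction of examples where both sides give the same label.
    observed = sum(e == h for e, h in zip(evaluator_labels, human_labels)) / n

    # Chance agreement: product of each side's label frequencies, summed over labels.
    eval_counts = Counter(evaluator_labels)
    human_counts = Counter(human_labels)
    labels = set(eval_counts) | set(human_counts)
    expected = sum(eval_counts[l] * human_counts[l] for l in labels) / (n * n)

    # If both sides always use one identical label, kappa is undefined; report 1.0.
    kappa = 1.0 if expected == 1.0 else (observed - expected) / (1.0 - expected)
    return {"n": n, "accuracy": observed, "kappa": kappa}


report = agreement_report(
    ["pass", "pass", "fail", "fail"],  # evaluator output
    ["pass", "fail", "fail", "fail"],  # human labels (illustrative values only)
)
print(report)
```

A high raw accuracy with a low kappa usually means one label dominates the dataset, so the evaluator may be agreeing with humans mostly by default.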
First seen: Jan 27, 2026