arize-evaluator

Arize Evaluator Skill

SPACE: Every --space flag and the ARIZE_SPACE environment variable accept either a space name (e.g., my-workspace) or a base64 space ID (e.g., U3BhY2U6...). Run ax spaces list to find yours.
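To make the two accepted forms concrete, here is a minimal sketch that guesses whether a value is a space ID or a space name. It relies on one observable fact: the example prefix U3BhY2U6 is the base64 encoding of "Space:". Whether every Arize space ID decodes to that prefix is an assumption, and looks_like_space_id is a hypothetical helper, not part of the ax CLI.

```python
import base64
import binascii


def looks_like_space_id(value: str) -> bool:
    """Return True if value is valid base64 that decodes to bytes starting with b"Space:".

    Assumption: space IDs decode to "Space:<something>". This is inferred only
    from the example prefix U3BhY2U6 and is not documented behavior.
    """
    # Restore any stripped "=" padding so strict decoding can succeed.
    padded = value + "=" * (-len(value) % 4)
    try:
        decoded = base64.b64decode(padded, validate=True)
    except (binascii.Error, ValueError):
        # Characters outside the base64 alphabet (such as "-") land here.
        return False
    return decoded.startswith(b"Space:")
```

With the made-up ID U3BhY2U6MTIz (which decodes to "Space:123") the helper returns True; with my-workspace it returns False, because the hyphen is not a standard base64 character. The ax CLI accepts either form, so a check like this is only useful for logging or validating user input.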

This skill covers designing, creating, and running LLM-as-judge evaluators on Arize. An evaluator defines the judge; a task is how you run it against real data.

Prerequisites

Proceed directly with the task — run the ax command you need. Do NOT check versions, env vars, or profiles upfront.

If an ax command fails, troubleshoot based on the error:

command not found or version error → see references/ax-setup.md
401 Unauthorized / missing API key → run ax profiles show to inspect the current profile. If the profile is missing or the API key is wrong, follow references/ax-profiles.md to create/update it. If the user doesn't have their key, direct them to https://app.arize.com/admin > API Keys
Space unknown → run ax spaces list to pick by name, or ask the user
LLM provider call fails (missing OPENAI_API_KEY / ANTHROPIC_API_KEY) → run ax ai-integrations list --space SPACE to check for platform-managed credentials. If none exist, ask the user to provide the key or create an integration via the arize-ai-provider-integration skill

Security: Never read .env files or search the filesystem for credentials. Use ax profiles for Arize credentials and ax ai-integrations for LLM provider keys. If credentials are not available through these channels, ask the user.

CRITICAL: Never fabricate evaluation results. If an evaluation task fails, is cancelled, or produces no scores, report the failure clearly and explain what went wrong. Do NOT perform a "manual evaluation," invent quality scores, estimate percentages, or present any agent-generated analysis as if it came from the Arize evaluation system. Instead suggest: (1) fix the identified issue and retry, (2) try running from the Arize UI, (3) verify integration credentials with ax ai-integrations list, (4) contact support at https://arize.com/support
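The error-to-action mapping above can be sketched as a small lookup. This is illustrative only: the matched substrings are assumptions about what an error might contain, not the exact text the ax CLI prints, and next_step is a hypothetical helper. The actions themselves are taken from the list above.

```python
# Hypothetical triage table. Each entry pairs substrings that MIGHT appear in
# an error (assumed, not taken from real ax output) with the action this skill
# recommends. Rules are checked in order; the provider rule comes first so a
# missing OPENAI_API_KEY is not mistaken for a missing Arize API key.
TRIAGE_RULES = [
    (("openai_api_key", "anthropic_api_key"),
     "run ax ai-integrations list --space SPACE"),
    (("command not found", "version"),
     "see references/ax-setup.md"),
    (("401", "unauthorized", "api key"),
     "run ax profiles show"),
    (("space",),
     "run ax spaces list"),
]

# Fallback when nothing matches: surface the error instead of guessing.
FALLBACK = "report the error to the user and ask how to proceed"


def next_step(error_text: str) -> str:
    """Return the recommended next action for a failed ax command."""
    lowered = error_text.lower()
    for needles, action in TRIAGE_RULES:
        if any(needle in lowered for needle in needles):
            return action
    return FALLBACK
```

For example, next_step("zsh: command not found: ax") returns "see references/ax-setup.md". The fallback mirrors the rule against fabricating results: an unrecognized failure is reported as-is, never replaced with an invented outcome.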

Repository

github/awesome-copilot

First Seen

Apr 2, 2026
