arize-experiment
Arize Experiment Skill
SPACE: All --space flags and the ARIZE_SPACE env var accept a space name (e.g., my-workspace) or a base64 space ID (e.g., U3BhY2U6...). Find yours with ax spaces list.
Concepts
- Experiment = a named evaluation run against a specific dataset version, containing one run per example
- Experiment Run = the result of processing one dataset example -- includes the model output, optional evaluations, and optional metadata
- Dataset = a versioned collection of examples; every experiment is tied to a dataset and a specific dataset version
- Evaluation = a named metric attached to a run (e.g., correctness, relevance), with optional label, score, and explanation
The typical flow: export a dataset → process each example → collect outputs and evaluations → create an experiment with the runs.
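The middle of that flow (process each example, then collect outputs and evaluations into runs) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Arize schema: the class names, field names (example_id, output, evaluations, metadata), the build_runs helper, and the sample data are all hypothetical, and the sketch does not call ax at all. Exporting the dataset and creating the experiment remain ax steps.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Evaluation:
    """A named metric attached to a run (hypothetical field names)."""
    name: str
    label: Optional[str] = None
    score: Optional[float] = None
    explanation: Optional[str] = None


@dataclass
class ExperimentRun:
    """The result of processing one dataset example (hypothetical field names)."""
    example_id: str
    output: str
    evaluations: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def build_runs(examples, task, evaluators):
    """Produce exactly one run per example: call the task, then every evaluator."""
    runs = []
    for example in examples:
        output = task(example)
        evaluations = [evaluate(example, output) for evaluate in evaluators]
        runs.append(
            ExperimentRun(
                example_id=example["id"],
                output=output,
                evaluations=evaluations,
            )
        )
    return runs


# Stand-in for an exported dataset version (made-up examples).
examples = [
    {"id": "ex-1", "question": "2 + 2", "expected": "4"},
    {"id": "ex-2", "question": "capital of France", "expected": "Paris"},
]

# Stand-in for a model call: canned answers, one of them deliberately wrong.
canned_answers = {"2 + 2": "4", "capital of France": "Lyon"}


def task(example):
    return canned_answers[example["question"]]


def correctness(example, output):
    """Exact-match evaluator that fills in label, score, and explanation."""
    is_match = output == example["expected"]
    return Evaluation(
        name="correctness",
        label="correct" if is_match else "incorrect",
        score=1.0 if is_match else 0.0,
        explanation=f"expected {example['expected']!r}, got {output!r}",
    )


runs = build_runs(examples, task, [correctness])

# Serialize the runs so they could be written to a file for a later step.
payload = json.dumps([asdict(run) for run in runs], indent=2)
```

Keeping run construction in one function makes the "one run per example" rule easy to check before anything is uploaded: the number of runs should always equal the number of examples in the dataset version.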
Prerequisites
Proceed directly with the task — run the ax command you need. Do NOT check versions, env vars, or profiles upfront.