agent-evaluation

Agent Evaluation with MLflow

Comprehensive guide for evaluating GenAI agents with MLflow. Use this skill for the complete evaluation workflow or for individual components: tracing setup, environment configuration, dataset creation, scorer definition, or evaluation execution. Each section can be used independently, based on your needs.

Table of Contents

  1. Quick Start
  2. Documentation Access Protocol
  3. Setup Overview
  4. Evaluation Workflow
  5. References

Quick Start

Setup (prerequisite): Install MLflow 3.8+, configure environment, integrate tracing
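
As a rough illustration of the first prerequisite, the sketch below checks whether the installed MLflow meets the 3.8 minimum. It uses only the Python standard library; the helper names (`parse_major_minor`, `meets_minimum`, `check_mlflow`) are hypothetical and not part of MLflow or of this skill, and the version parsing is deliberately simple (major and minor components only).

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum version taken from the setup prerequisite above.
MIN_MLFLOW = (3, 8)


def parse_major_minor(version_string):
    """Return (major, minor) from a string such as '3.8.1' or '3.9.0rc0'."""
    parts = version_string.split(".")
    major = int(parts[0])
    minor_part = parts[1] if len(parts) > 1 else "0"
    # Keep only the leading digits of the minor component ('8rc1' -> 8).
    digits = ""
    for ch in minor_part:
        if not ch.isdigit():
            break
        digits += ch
    return major, int(digits or 0)


def meets_minimum(version_string, minimum=MIN_MLFLOW):
    """True if the (major, minor) pair is at least the required minimum."""
    return parse_major_minor(version_string) >= minimum


def check_mlflow():
    """Describe whether the installed mlflow package satisfies the minimum."""
    try:
        installed = version("mlflow")
    except PackageNotFoundError:
        return "mlflow is not installed"
    if meets_minimum(installed):
        return f"mlflow {installed} satisfies >= 3.8"
    return f"mlflow {installed} is older than 3.8"


if __name__ == "__main__":
    print(check_mlflow())
```

Comparing tuples rather than raw strings avoids the common mistake of treating 3.10 as older than 3.8. Environment configuration and tracing integration are not covered by this sketch.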

Evaluation workflow in 4 steps:

  1. Understand: Run agent, inspect traces, understand purpose

Installs: 9
Repository: b-step62/skills
First Seen: Jan 20, 2026