agent-evaluation

Agent Evaluation with MLflow

A comprehensive guide to evaluating GenAI agents with MLflow. Use this skill for the complete evaluation workflow or for individual components: tracing setup, environment configuration, dataset creation, scorer definition, or evaluation execution. Each section can be used on its own, depending on your needs.

Table of Contents

Quick Start
Documentation Access Protocol
Setup Overview
Evaluation Workflow
References

Quick Start

Setup (prerequisite): Install MLflow 3.8+, configure environment, integrate tracing
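The version prerequisite above can be checked before running anything else. The following is a minimal sketch, not part of the skill itself: the helper names `parse_major_minor` and `mlflow_meets_minimum` are hypothetical, and it assumes MLflow is installed under the distribution name `mlflow` (a different packaging, such as a slimmed-down distribution, would need a different name).

```python
from importlib import metadata
from typing import Optional, Tuple

# Minimum version stated in the Quick Start ("MLflow 3.8+").
MIN_MLFLOW = (3, 8)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '3.8.1' or '3.9rc0'."""
    parts = version.split(".")
    major = int(parts[0])
    minor_text = parts[1] if len(parts) > 1 else "0"
    # Keep only the leading digits of the minor component ('9rc0' -> 9).
    digits = ""
    for ch in minor_text:
        if not ch.isdigit():
            break
        digits += ch
    return major, int(digits or 0)


def mlflow_meets_minimum(installed: Optional[str] = None) -> bool:
    """Compare a version string (or the installed package) against MIN_MLFLOW.

    Hypothetical helper: when `installed` is None, the version is looked up
    under the assumed distribution name 'mlflow'.
    """
    if installed is None:
        try:
            installed = metadata.version("mlflow")
        except metadata.PackageNotFoundError:
            # MLflow is not installed in this environment.
            return False
    return parse_major_minor(installed) >= MIN_MLFLOW
```

Calling `mlflow_meets_minimum()` with no argument inspects the current environment; passing a string makes the check easy to exercise without MLflow installed.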

Evaluation workflow in 4 steps:

Understand: Run the agent, inspect its traces, and understand its purpose

Repository

b-step62/skills

First Seen

Jan 20, 2026
