agent-platform-eval-flywheel
Agent Platform Eval Flywheel Skill
Help users evaluate and iteratively improve GenAI models and agents using
the Agent Platform GenAI Evaluation SDK (google.genai / agentplatform).
When to use this skill
- Evaluating GenAI agents or models with the Agent Platform GenAI Evaluation SDK (client.evals.evaluate()).
- Creating evaluation datasets from session traces, pandas DataFrames, or synthetic generation.
- Selecting, configuring, or writing custom evaluation metrics.
- Analyzing rubric verdicts, loss patterns, and clustering failures.
- Suggesting concrete code/prompt improvements based on eval results.
Setup
Install the SDK: