agent-platform-eval-flywheel

Agent Platform Eval Flywheel Skill

Help users evaluate and iteratively improve GenAI models and agents using the Agent Platform GenAI Evaluation SDK (google.genai / agentplatform).

When to use this skill

  • Evaluating GenAI agents or models with the Agent Platform GenAI Evaluation SDK (client.evals.evaluate()).
  • Creating evaluation datasets from session traces, pandas DataFrames, or synthetic generation.
  • Selecting, configuring, or writing custom evaluation metrics.
  • Analyzing rubric verdicts, loss patterns, and clustering failures.
  • Suggesting concrete code/prompt improvements based on eval results.
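
As a rough illustration of the verdict-analysis step listed above, the sketch below tallies pass/fail rubric verdicts to surface the weakest rubric and the cases to inspect first. It uses only the Python standard library. The record shape (`case_id`, `rubric`, `passed`) and the helper names `failure_rates` and `failing_cases` are hypothetical and chosen for illustration; the result objects returned by `client.evals.evaluate()` may be structured differently and would need to be mapped into this form.

```python
from collections import Counter, defaultdict

# Hypothetical verdict records (assumed shape, not the SDK's own result type).
verdicts = [
    {"case_id": "c1", "rubric": "cites_sources", "passed": False},
    {"case_id": "c1", "rubric": "follows_format", "passed": True},
    {"case_id": "c2", "rubric": "cites_sources", "passed": False},
    {"case_id": "c2", "rubric": "follows_format", "passed": True},
    {"case_id": "c3", "rubric": "cites_sources", "passed": True},
    {"case_id": "c3", "rubric": "follows_format", "passed": False},
    {"case_id": "c4", "rubric": "cites_sources", "passed": True},
    {"case_id": "c4", "rubric": "follows_format", "passed": True},
]


def failure_rates(records):
    """Return {rubric: fraction of verdicts that failed}, worst rubric first."""
    totals = Counter()
    failures = Counter()
    for record in records:
        totals[record["rubric"]] += 1
        if not record["passed"]:
            failures[record["rubric"]] += 1
    rates = {name: failures[name] / totals[name] for name in totals}
    return dict(sorted(rates.items(), key=lambda item: item[1], reverse=True))


def failing_cases(records):
    """Group the ids of failing cases by the rubric they failed."""
    grouped = defaultdict(list)
    for record in records:
        if not record["passed"]:
            grouped[record["rubric"]].append(record["case_id"])
    return dict(grouped)


rates = failure_rates(verdicts)
clusters = failing_cases(verdicts)
worst_rubric = next(iter(rates))  # first key is the highest failure rate
```

Grouping by rubric is only the simplest form of failure clustering; once the worst rubric is known, the failing cases under it can be read side by side to find a shared cause before changing the prompt or code.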

Setup

Install the SDK:

Repository: google/skills