agent-evaluation

SKILL.md

Agent Evaluation

Overview

LLM-as-judge evaluation framework that scores AI-generated content on 5 dimensions using a 1-5 rubric. Agents evaluate outputs, compute a weighted composite score, and emit a structured verdict with evidence citations.
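As a rough illustration of the weighted composite and structured verdict described above, here is a minimal sketch in Python. The dimension names, the weights, the `pass_threshold` value, and the `PASS`/`FAIL` labels are all assumptions made for this example; the text above states only that there are five dimensions, a 1-5 rubric, a weighted composite, and evidence citations.

```python
from dataclasses import dataclass

# Hypothetical dimension names and weights (not taken from the skill itself).
WEIGHTS = {
    "accuracy": 0.30,
    "completeness": 0.25,
    "clarity": 0.15,
    "relevance": 0.15,
    "safety": 0.15,
}


@dataclass
class DimensionScore:
    score: int      # rubric score on the 1-5 scale
    evidence: str   # quote or citation from the evaluated output


def composite(scores: dict[str, DimensionScore]) -> float:
    """Weighted average of the per-dimension rubric scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the configured dimensions")
    for name, item in scores.items():
        if not 1 <= item.score <= 5:
            raise ValueError(f"{name}: score {item.score} is outside the 1-5 rubric")
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * scores[name].score for name in WEIGHTS)
    return weighted_sum / total_weight


def build_verdict(scores: dict[str, DimensionScore], pass_threshold: float = 3.5) -> dict:
    """Assemble a structured verdict; the threshold is an assumed default."""
    value = composite(scores)
    return {
        "composite": round(value, 2),
        "verdict": "PASS" if value >= pass_threshold else "FAIL",
        "dimensions": {
            name: {"score": item.score, "evidence": item.evidence}
            for name, item in scores.items()
        },
    }
```

Dividing by the total weight keeps the composite on the same 1-5 scale even if the weights do not sum exactly to 1. In an actual LLM-as-judge setup the per-dimension scores and evidence strings would come from the judging model's response rather than being constructed by hand.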

Core principle: Verify quality systematically before claiming completion. Without this skill, agent-studio has no way to verify agent output quality.

When to Use

Always:

Before marking a task complete (pair with verification-before-completion)
After a plan is generated (evaluate plan quality)
After a code review is produced (evaluate review quality)
During reflection cycles (evaluate agent responses)
When comparing multiple agent outputs

Don't Use:

Repository

oimiragieo/agent-studio

First Seen

Feb 25, 2026
