agent-eval

Originally from affaan-m/everything-claude-code

Agent Eval Skill

A lightweight CLI tool for comparing coding agents head-to-head on reproducible tasks. Most "which coding agent is best?" comparisons run on vibes; this tool makes the comparison systematic.

When to Activate

Comparing coding agents (Claude Code, Aider, Codex, etc.) on your own codebase
Measuring agent performance before adopting a new tool or model
Running regression checks when an agent updates its model or tooling
Producing data-backed agent selection decisions for a team
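
As a rough illustration of what a data-backed comparison can look like, the sketch below aggregates pass rates per agent from a list of run records. It is not agent-eval's API or output format: the record shape `(agent, task, passed)`, the agent and task names, and the `pass_rates` helper are all hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical run records: (agent, task, passed).
# Illustrative data only, not output produced by agent-eval.
runs = [
    ("agent-a", "fix-bug", True),
    ("agent-a", "add-test", False),
    ("agent-b", "fix-bug", True),
    ("agent-b", "add-test", True),
]


def pass_rates(records):
    """Return {agent: fraction of that agent's runs that passed}."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for agent, _task, ok in records:
        totals[agent] += 1
        passed[agent] += int(ok)
    return {agent: passed[agent] / totals[agent] for agent in totals}


rates = pass_rates(runs)

# Print agents from highest to lowest pass rate.
for agent, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{agent}: {rate:.0%}")
```

Running every agent on the same fixed task list is what makes the resulting numbers comparable across agents and across model or tooling updates.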

Installation

Note: Install agent-eval from its repository after reviewing the source.

Core Concepts

YAML Task Definitions

Installs: 1.3K
GitHub Stars: 232.5K
First Seen: May 19, 2026

