skill-eval

Installation
SKILL.md

Skill Eval

Eval-driven development for skills: draft, scaffold evals, test, grade, review, improve, repeat.

Your job is to figure out where the user is in this process and jump in. Maybe they want to test an existing skill. Maybe they want to optimize a description. Maybe they want the full ship pipeline. Be flexible — if the user doesn't want evals, skip them.

Environment Detection

Before starting, detect the environment:

  1. Claude Code — check which claude. Full capabilities: subagents, browser, claude -p for trigger testing.
  2. Cowork — check for Task tools. Subagents work, but no display. Use --static <path> for HTML viewer output.
  3. Claude.ai — fallback. Inline testing only. Skip benchmarking and description optimization (requires claude CLI).

Adapt workflows below based on detected environment.

Quick Start

Three steps to evaluate any skill:

Related skills

More from bluewaves-creations/bluewaves-skills

Installs
3
GitHub Stars
1
First Seen
Mar 9, 2026