skill-judge

Summary

Comprehensive framework for evaluating Agent Skill design quality against official specifications and best practices.

  • Provides eight distinct evaluation dimensions (Knowledge Delta, Mindset, Anti-Patterns, Specification, Progressive Disclosure, Freedom Calibration, Pattern Recognition, Usability) totaling 120 points, with Knowledge Delta weighted as the most critical measure of genuine expert value
  • Emphasizes the core principle that a Skill's value equals "expert-only knowledge minus what Claude already knows," treating context tokens as a shared resource and flagging redundant content as waste
  • Includes detailed scoring rubrics, common failure patterns (Tutorial Dump, Orphan References, Vague Warnings), and a quick-reference checklist for rapid evaluation
  • Provides a complete five-step evaluation protocol: knowledge delta scan, structure analysis, dimension scoring, grade calculation, and report generation with specific improvement recommendations
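
The dimension scoring and grade calculation steps can be pictured with a small sketch. This is not the Skill's own rubric: the per-dimension maxima below are hypothetical values, picked only so that they add up to the 120 points mentioned above, and the function name summarize is likewise an assumption made for illustration.

```python
# Hypothetical per-dimension maxima (an assumption, not taken from the Skill).
# They are chosen only so that the total matches the 120 points in the summary.
MAX_POINTS = {
    "Knowledge Delta": 20,
    "Mindset": 15,
    "Anti-Patterns": 15,
    "Specification": 15,
    "Progressive Disclosure": 15,
    "Freedom Calibration": 15,
    "Pattern Recognition": 10,
    "Usability": 15,
}


def summarize(scores: dict[str, int]) -> tuple[int, float]:
    """Validate per-dimension scores and return (total, percentage of maximum)."""
    missing = MAX_POINTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name, value in scores.items():
        if name not in MAX_POINTS:
            raise ValueError(f"unknown dimension: {name}")
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")
    total = sum(scores.values())
    return total, 100 * total / sum(MAX_POINTS.values())


# Example scores for an imaginary Skill under review (made-up numbers).
example = {
    "Knowledge Delta": 12,
    "Mindset": 10,
    "Anti-Patterns": 9,
    "Specification": 12,
    "Progressive Disclosure": 10,
    "Freedom Calibration": 11,
    "Pattern Recognition": 6,
    "Usability": 10,
}
total, percent = summarize(example)
print(f"{total}/120 ({percent:.1f}%)")
```

Mapping the percentage to a letter grade is left out on purpose, since the grade thresholds are not stated in this summary.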
SKILL.md

Skill Judge

Evaluate Agent Skills against official specifications and patterns derived from 17+ official examples.


Core Philosophy

What is a Skill?

A Skill is NOT a tutorial. A Skill is a knowledge externalization mechanism.

Traditional AI knowledge is locked in model parameters. To teach new capabilities:

Traditional: Collect data → GPU cluster → Train → Deploy new version
Cost: $10,000 - $1,000,000+
Timeline: Weeks to months

More from softaworks/agent-toolkit

Installs: 3.7K
GitHub Stars: 1.8K
First Seen: Jan 20, 2026