skill-judge
Comprehensive framework for evaluating Agent Skill design quality against official specifications and best practices.
- Provides eight distinct evaluation dimensions (Knowledge Delta, Mindset, Anti-Patterns, Specification, Progressive Disclosure, Freedom Calibration, Pattern Recognition, Usability) totaling 120 points, with Knowledge Delta weighted as the most critical measure of genuine expert value
- Emphasizes the core principle that Skill value equals "expert-only knowledge minus what Claude already knows," treating the token budget as a shared resource and flagging redundant content as waste
- Includes detailed scoring rubrics, common failure patterns (Tutorial Dump, Orphan References, Vague Warnings), and a quick-reference checklist for rapid evaluation
- Provides a complete evaluation protocol with a five-step process: knowledge delta scan, structure analysis, dimension scoring, grade calculation, and report generation with specific improvement recommendations
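The dimension-scoring and grade-calculation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: the dimension names and the 120-point total come from the summary, while the per-dimension maxima in `MAX_POINTS` and the function name `total_score` are assumptions chosen so that the maxima sum to 120 with Knowledge Delta weighted highest.

```python
# Hypothetical per-dimension maxima. Only the dimension names and the
# 120-point total are stated in the text; the split below is an assumption.
MAX_POINTS = {
    "Knowledge Delta": 20,
    "Mindset": 15,
    "Anti-Patterns": 15,
    "Specification": 15,
    "Progressive Disclosure": 15,
    "Freedom Calibration": 15,
    "Pattern Recognition": 10,
    "Usability": 15,
}
TOTAL_POINTS = sum(MAX_POINTS.values())  # 120 under the assumed split


def total_score(scores: dict[str, int]) -> tuple[int, float]:
    """Validate per-dimension scores and return (total, percentage)."""
    missing = set(MAX_POINTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name, points in scores.items():
        if name not in MAX_POINTS:
            raise ValueError(f"unknown dimension: {name}")
        if not 0 <= points <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be within 0..{MAX_POINTS[name]}")
    total = sum(scores.values())
    return total, round(100 * total / TOTAL_POINTS, 1)


# Made-up scores for an imaginary Skill, purely for illustration.
example = {
    "Knowledge Delta": 15,
    "Mindset": 12,
    "Anti-Patterns": 10,
    "Specification": 13,
    "Progressive Disclosure": 11,
    "Freedom Calibration": 12,
    "Pattern Recognition": 7,
    "Usability": 10,
}
print(total_score(example))  # (90, 75.0)
```

Mapping the percentage to a letter grade is left out on purpose, since the grade thresholds are not given in this summary.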
Skill Judge
Evaluate Agent Skills against official specifications and patterns derived from 17+ official examples.
Core Philosophy
What is a Skill?
A Skill is NOT a tutorial. A Skill is a knowledge externalization mechanism.
Traditional AI knowledge is locked in model parameters. To teach new capabilities:
Traditional: Collect data → GPU cluster → Train → Deploy new version
Cost: $10,000 - $1,000,000+
Timeline: Weeks to months