skill-judge
Skill Judge
Evaluate Agent Skills against official specifications and patterns derived from 17+ official examples.
Core Philosophy
What is a Skill?
A Skill is NOT a tutorial. A Skill is a knowledge externalization mechanism.
Traditionally, an AI model's knowledge is locked in its parameters. Teaching it a new capability means retraining:
Traditional: Collect data → GPU cluster → Train → Deploy new version
Cost: $10,000 - $1,000,000+
Timeline: Weeks to months