Prompt Evaluator
Evaluate LLM prompts on a 100-point scale based on research findings from Thorgeirsson et al. (2026), which demonstrated that writing quality—specifically coherence, instructional clarity, and information content—significantly predicts LLM-assisted programming performance.
Key Research Insights
- Information content > vocabulary: Adding missing information improves results; rewording without adding information rarely helps (Lucchetti et al.)
- Structure matters: Disorganized, vague prompts lead to failure cycles
- Declarative > interrogative: Declarative statements outperform questions (Chen et al.)
- Ambiguity kills: Unclear pronouns, implicit assumptions, and missing constraints are top failure causes (a before/after illustration follows this list)
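To make these insights concrete, here is a minimal before/after pair. The prompts are invented for this illustration and are not drawn from the cited studies:

```python
# Hypothetical example only -- not taken from the cited research.

# Interrogative, vague, no constraints: likely to trigger a failure cycle.
before = "Can you make this function better?"

# Declarative, adds the missing information (target function, concrete bug,
# explicit constraints) rather than merely rewording the request.
after = (
    "Fix paginate(items, page_size) so the last page is not dropped when "
    "len(items) is an exact multiple of page_size. Keep the return type "
    "list[list[T]] and do not change the function signature."
)
```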
Evaluation Workflow
- Receive the user's prompt
- Read references/evaluation-rubric.md for detailed scoring criteria
- Score each of the 5 axes (4 sub-items × 5pt = 20pt per axis, 100pt total; the arithmetic is sketched after this list)
- For common issues, consult references/improvement-patterns.md for Before/After examples
- Output the evaluation result using the template below
- Provide a revised prompt
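For concreteness, here is a minimal Python sketch of the scoring arithmetic. The axis names beyond the three highlighted by the research (coherence, instructional clarity, information content) are placeholders, not the rubric's actual axes; the authoritative criteria live in references/evaluation-rubric.md.

```python
# Minimal sketch of the rubric arithmetic, assuming the axis names below.
AXES = [
    "coherence",
    "instructional_clarity",
    "information_content",
    "structure",          # placeholder axis name
    "ambiguity_control",  # placeholder axis name
]
SUB_ITEMS_PER_AXIS = 4
MAX_PER_SUB_ITEM = 5  # 4 sub-items x 5pt = 20pt per axis, 100pt total


def total_score(scores: dict[str, list[int]]) -> int:
    """Sum sub-item scores across all five axes (maximum 100)."""
    total = 0
    for axis in AXES:
        sub_scores = scores[axis]
        if len(sub_scores) != SUB_ITEMS_PER_AXIS:
            raise ValueError(f"{axis}: expected {SUB_ITEMS_PER_AXIS} sub-item scores")
        if any(not 0 <= s <= MAX_PER_SUB_ITEM for s in sub_scores):
            raise ValueError(f"{axis}: scores must be in [0, {MAX_PER_SUB_ITEM}]")
        total += sum(sub_scores)
    return total


# Example: a prompt scoring 4+3+5+2 = 14/20 on every axis totals 70/100.
print(total_score({axis: [4, 3, 5, 2] for axis in AXES}))  # -> 70
```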
More from hrdtbs/agent-skills
plan-self-review
Self-evaluate a plan on a 100-point scale after it is created or updated. Make sure to use this skill immediately whenever you create or update a plan, even if the user does not explicitly ask for a review. This skill ensures that the plan is clear, comprehensive, feasible, and consistent before execution.
create-pull-request
Create a GitHub pull request safely and reliably using project conventions. Make sure to use this skill whenever the user asks to create a PR, submit changes for review, open a pull request, or mentions "PR", "プルリク", or "pull request". It handles commit verification, branch validation, and PR creation using the gh CLI.
commit
Expert-level commit creation and formatting following Conventional Commits. Make sure to use this skill whenever you need to create a commit message, save changes to git, structure a logical commit history, or when the user mentions 'commit', 'git commit', 'コミット', '変更をコミット', or asks you to push their code.
mcp-builder
Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).
skill-judge
Evaluate Agent Skill design quality against official specifications and best practices. Use when reviewing, auditing, or improving SKILL.md files and skill packages. Provides multi-dimensional scoring and actionable improvement suggestions.
skill-creator
Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.