Prompt Evaluator
Evaluate LLM prompts on a 100-point scale based on research findings from Thorgeirsson et al. (2026), which demonstrated that writing quality—specifically coherence, instructional clarity, and information content—significantly predicts LLM-assisted programming performance.
Key Research Insights
- Information content > vocabulary: Adding missing information improves results; rewording without adding information rarely helps (Lucchetti et al.)
- Structure matters: Disorganized, vague prompts lead to failure cycles
- Declarative > interrogative: Declarative statements outperform questions (Chen et al.)
- Ambiguity kills: Unclear pronouns, implicit assumptions, and missing constraints are top failure causes (a before/after illustration follows this list)
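To make these insights concrete, here is a minimal before/after pair. The prompts are invented for this illustration and are not drawn from the cited studies:

```python
# Hypothetical example only -- not taken from the cited research.

# Interrogative, vague, no constraints: likely to trigger a failure cycle.
before = "Can you make this function better?"

# Declarative, adds the missing information (target function, concrete bug,
# explicit constraints) rather than merely rewording the request.
after = (
    "Fix paginate(items, page_size) so the last page is not dropped when "
    "len(items) is an exact multiple of page_size. Keep the return type "
    "list[list[T]] and do not change the function signature."
)
```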
Evaluation Workflow
- Receive the user's prompt
- Read references/evaluation-rubric.md for detailed scoring criteria
- Score each of the 5 axes (4 sub-items × 5pt = 20pt per axis, 100pt total; the arithmetic is sketched after this list)
- For common issues, consult references/improvement-patterns.md for Before/After examples
- Output the evaluation result using the template below
- Provide a revised prompt
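For concreteness, here is a minimal Python sketch of the scoring arithmetic. The axis names beyond the three highlighted by the research (coherence, instructional clarity, information content) are placeholders, not the rubric's actual axes; the authoritative criteria live in references/evaluation-rubric.md.

```python
# Minimal sketch of the rubric arithmetic, assuming the axis names below.
AXES = [
    "coherence",
    "instructional_clarity",
    "information_content",
    "structure",          # placeholder axis name
    "ambiguity_control",  # placeholder axis name
]
SUB_ITEMS_PER_AXIS = 4
MAX_PER_SUB_ITEM = 5  # 4 sub-items x 5pt = 20pt per axis, 100pt total


def total_score(scores: dict[str, list[int]]) -> int:
    """Sum sub-item scores across all five axes (maximum 100)."""
    total = 0
    for axis in AXES:
        sub_scores = scores[axis]
        if len(sub_scores) != SUB_ITEMS_PER_AXIS:
            raise ValueError(f"{axis}: expected {SUB_ITEMS_PER_AXIS} sub-item scores")
        if any(not 0 <= s <= MAX_PER_SUB_ITEM for s in sub_scores):
            raise ValueError(f"{axis}: scores must be in [0, {MAX_PER_SUB_ITEM}]")
        total += sum(sub_scores)
    return total


# Example: a prompt scoring 4+3+5+2 = 14/20 on every axis totals 70/100.
print(total_score({axis: [4, 3, 5, 2] for axis in AXES}))  # -> 70
```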
More from hrdtbs/agent-skills
plan-self-review
Self-evaluate a plan on a 100-point scale after it is created or updated. Make sure to use this skill immediately whenever you create or update a plan, even if the user does not explicitly ask for a review. This skill ensures that the plan is clear, comprehensive, feasible, and consistent before execution.
create-pull-request
Create a GitHub pull request safely and reliably using project conventions. Make sure to use this skill whenever the user asks to create a PR, submit changes for review, open a pull request, or mentions "PR", "プルリク", or "pull request". It handles commit verification, branch validation, and PR creation using the gh CLI.
commit
Expert-level commit creation and formatting following Conventional Commits. Make sure to use this skill whenever you need to create a commit message, save changes to git, structure a logical commit history, or when the user mentions 'commit', 'git commit', 'コミット', '変更をコミット', or asks you to push their code.
mcp-builder
Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).
skill-judge
Evaluate Agent Skill design quality against official specifications and best practices. Use when reviewing, auditing, or improving SKILL.md files and skill packages. Provides multi-dimensional scoring and actionable improvement suggestions.
skill-creator
Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.