Skill Evaluator
Purpose
skill-evaluator reviews failures and golden case candidates in Thinking Skills, then recommends the smallest useful improvement or preservation action.
It does not rewrite skills by default. It diagnoses the case, proposes eval coverage, checks for overfitting or conflicts, and suggests a minimal patch or preservation plan.
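As a rough illustration of the outputs listed above, the record below sketches what one evaluation might carry. Every name here (`CaseKind`, `Evaluation`, `recommendation`, `rewrite_skill`) is hypothetical and not part of the skill itself. The pairing of failures with a minimal patch and of golden case candidates with a preservation plan is also an assumption read from the wording of this section.

```python
from dataclasses import dataclass, field
from enum import Enum


class CaseKind(Enum):
    """Hypothetical labels for the two kinds of case the skill reviews."""
    FAILURE = "failure"
    GOLDEN_CANDIDATE = "golden_candidate"


@dataclass
class Evaluation:
    """Hypothetical record of one review; field names are illustrative only."""
    kind: CaseKind
    diagnosis: str
    proposed_evals: list[str] = field(default_factory=list)
    # Overfitting or conflict concerns found while checking the proposal.
    risks: list[str] = field(default_factory=list)
    # The skill does not rewrite skills by default, so this starts as False.
    rewrite_skill: bool = False

    def recommendation(self) -> str:
        # Assumed mapping: golden case candidates are preserved,
        # failures receive the smallest useful patch.
        if self.kind is CaseKind.GOLDEN_CANDIDATE:
            return "preservation plan"
        return "minimal patch"
```

Keeping `rewrite_skill` as an explicit flag that defaults to `False` mirrors the stated default: a full rewrite would have to be opted into rather than assumed.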
When to Use
Use this skill when: