skill-eval-improve

Skill eval & improve

Improve skills measurably with a repeatable loop: baseline → measure → bounded edit → re-validate. The loop combines local tooling, Codex plugin-eval (when installed), and research-backed loops such as SkillOpt.
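The loop can be sketched in a few lines of Python. This is a minimal illustration, not Guild's tooling: improve_skill, changed_chars, and the measure, propose_edit, and validate callbacks are hypothetical names, and the edit bound is assumed here to be a cap on changed characters.

```python
import difflib
from dataclasses import dataclass
from typing import Callable


@dataclass
class Attempt:
    text: str
    score: float


def changed_chars(old: str, new: str) -> int:
    """Count characters touched by the edit (hypothetical size metric)."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


def improve_skill(
    skill_text: str,
    measure: Callable[[str], float],     # assumed: higher score is better
    propose_edit: Callable[[str], str],  # assumed: returns a revised SKILL.md
    validate: Callable[[str], bool],     # assumed: stands in for re-validation
    max_changed_chars: int = 400,
    max_rounds: int = 3,
) -> Attempt:
    """Keep an edit only if it is bounded, valid, and beats the current best."""
    best = Attempt(skill_text, measure(skill_text))  # baseline
    for _ in range(max_rounds):
        candidate = propose_edit(best.text)
        if changed_chars(best.text, candidate) > max_changed_chars:
            continue  # edit too large: not a bounded edit
        if not validate(candidate):
            continue  # re-validation failed
        score = measure(candidate)
        if score > best.score:
            best = Attempt(candidate, score)
    return best
```

Rejecting oversized edits keeps each round attributable: if the score moves, a small, reviewable change caused it.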

When to use

  • Skill triggers incorrectly or never loads (description routing)
  • Bloated SKILL.md, high token cost, weak outcomes
  • After adding a new procedure, when regression checks are needed
  • Porting patterns from product MCP / plugin-eval research into Guild skills
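For the bloat and routing cases above, a rough size report gives a baseline to compare against after each edit. The sketch below is illustrative only: skill_report is a hypothetical helper, the four-characters-per-token figure is a crude heuristic rather than a real tokenizer, and it assumes SKILL.md starts with a front-matter block delimited by --- that carries a single-line description field.

```python
def skill_report(text: str) -> dict:
    """Return rough size metrics for the text of a SKILL.md file."""
    lines = text.splitlines()
    description = ""
    # Assumption: front matter sits between the first two '---' lines.
    if lines and lines[0].strip() == "---":
        for line in lines[1:]:
            if line.strip() == "---":
                break
            if line.startswith("description:"):
                description = line[len("description:"):].strip()
    return {
        "lines": len(lines),
        "chars": len(text),
        "approx_tokens": len(text) // 4,  # heuristic, not a tokenizer
        "description_chars": len(description),
    }
```

Recording this report before and after an edit shows whether a change actually reduced cost, and an empty description is a quick hint that routing has nothing to match on.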

When not to use

  • Bulk repo validation: for requests such as “validate every skill in this repo”, run pnpm run validate only (use skill-spec-review for an audit); do not start benchmark or SkillOpt loops.
  • Automated SkillOpt / cluster training: Guild documents a manual bounded-edit loop, not an overnight optimizer pipeline.
  • Creating a new skill: use create-skill first; eval-improve applies only after a skill exists.

Cursor scope (optional): activate when editing under skills/** or scripts/validate-skills.mjs.
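As an illustration of which paths that scope covers, the check below matches paths against the two patterns from the line above. It is a sketch using Python's fnmatch, whose wildcard semantics are an assumption here and may differ from Cursor's own glob matching; in_scope and SCOPE_PATTERNS are hypothetical names.

```python
from fnmatch import fnmatchcase

# Patterns copied from the scope line; matching rules are an assumption.
SCOPE_PATTERNS = ("skills/**", "scripts/validate-skills.mjs")


def in_scope(path: str) -> bool:
    """Return True if a repo-relative path falls under the skill's scope."""
    normalized = path.replace("\\", "/")  # tolerate Windows separators
    return any(fnmatchcase(normalized, pattern) for pattern in SCOPE_PATTERNS)
```

With these patterns, skills/foo/SKILL.md and scripts/validate-skills.mjs are in scope, while docs/readme.md and scripts/other.mjs are not.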

skill-eval-improve — arenukvern/skill_steward