skill-selection-evals

Skill-Selection Evals

This is not an executable skill. It contains evaluation data for measuring the accuracy of skill selection (routing) decisions.

Purpose

Crucible's 49 execution evals measure quality once a skill is invoked. Selection evals measure whether the right skill gets invoked in the first place.

Eval Types

  • Direct selection: Given a prompt, does the agent pick the correct skill?
  • Negative selection: Given a prompt that sounds like it calls for skill X but does not, does the agent avoid the false positive?
  • Context-dependent: When the same verb appears in different contexts, does the agent pick the skill that fits each context?
  • Cascade ordering: For tasks that need several skills, does the agent invoke them in the correct order?
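
The four eval types above could be scored along these lines. This is a minimal sketch, not Crucible's actual eval format: the `SelectionCase` fields, the `passes` and `accuracy_by_kind` helpers, the toy router, and every skill name (`debug`, `code-review`, `design-review`, `plan`, `build`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SelectionCase:
    """One selection eval case (hypothetical schema, not Crucible's)."""
    kind: str                                            # "direct", "negative", "context", "cascade"
    prompt: str
    expected: list[str] = field(default_factory=list)    # skills to invoke, in order
    forbidden: list[str] = field(default_factory=list)   # skills that must not be invoked


def passes(case: SelectionCase, selected: list[str]) -> bool:
    """Return True if the router's selection satisfies the case."""
    # Negative selection: any forbidden skill in the output is a false positive.
    if any(skill in case.forbidden for skill in selected):
        return False
    # Direct, context-dependent, and cascade cases compare the full ordered
    # list, so a cascade with the right skills in the wrong order fails.
    if case.expected:
        return selected == case.expected
    return True


def accuracy_by_kind(cases, router) -> dict[str, float]:
    """Run the router on every case and report pass rate per eval type."""
    totals: dict[str, int] = {}
    hits: dict[str, int] = {}
    for case in cases:
        selected = router(case.prompt)
        totals[case.kind] = totals.get(case.kind, 0) + 1
        hits[case.kind] = hits.get(case.kind, 0) + int(passes(case, selected))
    return {kind: hits[kind] / totals[kind] for kind in totals}


def toy_router(prompt: str) -> list[str]:
    """Naive keyword router standing in for the agent under test."""
    text = prompt.lower()
    if "plan" in text and "build" in text:
        return ["plan", "build"]
    if "debug" in text:
        return ["debug"]
    if "review" in text:
        return ["code-review"]  # ignores context on purpose
    return []


CASES = [
    SelectionCase("direct", "Debug the failing login test", expected=["debug"]),
    SelectionCase("negative", "Review my vacation schedule", forbidden=["code-review"]),
    SelectionCase("context", "Review this pull request", expected=["code-review"]),
    SelectionCase("context", "Review this design doc", expected=["design-review"]),
    SelectionCase("cascade", "Plan and then build the export feature", expected=["plan", "build"]),
]

if __name__ == "__main__":
    for kind, score in accuracy_by_kind(CASES, toy_router).items():
        print(f"{kind}: {score:.2f}")
```

With this toy router, the direct and cascade cases pass, the negative case fails because the word "review" triggers `code-review`, and only one of the two context-dependent cases passes. Reporting accuracy per eval type rather than as a single number keeps those failure modes separate.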

Boundaries Tested

Installs: 1
Repository: raddue/crucible
GitHub Stars: 10
First Seen: Apr 20, 2026