Skill-Selection Evals

This is not an executable skill. It contains evaluation data for measuring the accuracy of skill selection (routing) decisions.

Purpose

Crucible's 49 execution evals measure quality once a skill is invoked. Selection evals measure whether the right skill gets invoked in the first place.

Eval Types

Direct selection: Given a prompt, does the agent pick the correct skill?
Negative selection: Given a prompt that sounds like skill X but is not, does the agent avoid the false positive?
Context-dependent: Given the same verb in different contexts, does the agent pick the skill that fits each context?
Cascade ordering: Given a task that needs several skills, does the agent invoke them in the correct order?
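
As a rough illustration of how the four eval types could be represented and scored, here is a minimal sketch. It is not Crucible's actual data format: the SelectionEval fields, the passes function, and the skill names ("debugging", "code-review", "design", "build") are all hypothetical, chosen only to show how each type differs in what counts as a pass.

```python
from dataclasses import dataclass, field


@dataclass
class SelectionEval:
    """One routing test case. Every field name here is hypothetical."""
    kind: str      # "direct", "negative", "context", or "cascade"
    prompt: str
    expected: list[str] = field(default_factory=list)   # skills to invoke, in order
    forbidden: list[str] = field(default_factory=list)  # look-alike skills to avoid


def passes(case: SelectionEval, invoked: list[str]) -> bool:
    """Return True if the skills the agent invoked satisfy the case."""
    # Any case fails as soon as a forbidden (false-positive) skill is invoked.
    if any(skill in case.forbidden for skill in invoked):
        return False
    if case.kind == "cascade":
        # Order matters: expected skills must appear in `invoked` as an
        # ordered subsequence (other skills may be interleaved).
        remaining = iter(invoked)
        return all(skill in remaining for skill in case.expected)
    if case.expected:
        # Direct and context-dependent cases: the first skill invoked
        # must be the expected one.
        return bool(invoked) and invoked[0] == case.expected[0]
    # Negative case with no expected skill: avoiding the trap is enough.
    return True


# Hypothetical cases, one per eval type.
direct = SelectionEval("direct", "Find out why this test fails",
                       expected=["debugging"])
negative = SelectionEval("negative", "Review the meeting notes",
                         forbidden=["code-review"])
context = SelectionEval("context", "Review this pull request",
                        expected=["code-review"])
cascade = SelectionEval("cascade", "Design and then build the feature",
                        expected=["design", "build"])
```

Under these assumptions, passes(cascade, ["build", "design"]) is False because the order is reversed, while passes(negative, []) is True because the agent avoided the look-alike skill.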

Boundaries Tested

Repository

raddue/crucible

First Seen

Apr 20, 2026

Security Audits

Gen Agent Trust Hub: Warn

Socket: Pass

Snyk: Pass