skill-testing
Skill Testing
Create LLM-as-judge behavioral evals for agent skills.
What This Produces
<skill>/tests/
  eval.sh                 # evaluation harness (from template)
  golden_examples.yaml    # test scenarios
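To make the LLM-as-judge idea concrete, here is a minimal sketch of what such a harness does: run each scenario through the agent, then ask a judge whether the response shows the expected behavior. It is written in Python for readability and is not the eval.sh template itself. Every name in it (`Scenario`, its fields, `build_judge_prompt`, `run_evals`, the stub agent and judge) is hypothetical, and the real golden_examples.yaml schema may differ. The stubs stand in for real model calls so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Scenario:
    # Hypothetical fields; the real golden_examples.yaml schema may differ.
    name: str
    prompt: str
    expected_behavior: str


def build_judge_prompt(scenario: Scenario, response: str) -> str:
    """Assemble the text a judge model would grade."""
    return (
        f"EXPECTED BEHAVIOR:\n{scenario.expected_behavior}\n\n"
        f"RESPONSE:\n{response}\n\n"
        "Answer PASS or FAIL."
    )


def run_evals(
    scenarios: List[Scenario],
    agent: Callable[[str], str],
    judge: Callable[[str], str],
) -> Dict[str, bool]:
    """Run every scenario through the agent and record the judge's verdict."""
    results = {}
    for scenario in scenarios:
        response = agent(scenario.prompt)
        verdict = judge(build_judge_prompt(scenario, response))
        results[scenario.name] = verdict.strip().upper().startswith("PASS")
    return results


def stub_agent(prompt: str) -> str:
    # Stand-in for the agent under test (assumption: no real model call).
    if "feature" in prompt:
        return "I'll write the test first, then the code."
    return "Here is the code."


def stub_judge(judge_prompt: str) -> str:
    # Stand-in for the judge model: inspects only the RESPONSE section.
    response_part = judge_prompt.split("RESPONSE:\n", 1)[1]
    return "PASS" if "test first" in response_part.lower() else "FAIL"


SCENARIOS = [
    Scenario(
        name="writes-test-first",
        prompt="Add a feature that parses dates.",
        expected_behavior="Writes a test before the implementation.",
    ),
    Scenario(
        name="skips-test-under-pressure",
        prompt="Fix this bug quickly.",
        expected_behavior="Writes a test before the implementation.",
    ),
]

results = run_evals(SCENARIOS, stub_agent, stub_judge)
print(results)
```

In a real harness the two stubs would be replaced by calls to the agent under test and to a judge model, and the scenarios would be loaded from golden_examples.yaml rather than defined inline.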
Workflow
1. Understand the skill
Read the target skill's SKILL.md and any supplementary files. Identify:
- Core behaviors the skill enforces