evaluate-skill

Installation

SKILL.md

Evaluate Skill

Orchestrate a cross-tier evaluation of an AI skill to determine its clarity and robustness.

Procedure

Load inputs
- Read the skill file at {{ skill-path }}
- Read the test cases file at {{ test-cases-path }}
- Validate that the test cases file contains a JSON array of objects, each with input and expectedOutcome fields

Set up evaluation matrix
- Model tiers to test: opus, sonnet, haiku
- Plan one blind test run for each combination of tier and test case
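
The two steps above can be sketched in Python. This is an illustrative sketch only, not the skill's actual implementation: the function names `validate_test_cases` and `plan_runs` and the `case_index` key are hypothetical, and "blind" is assumed here to mean that the planned run carries only the input and withholds expectedOutcome from the model under test.

```python
import json
from itertools import product

# Tiers named in the procedure above.
TIERS = ["opus", "sonnet", "haiku"]

# Field names required by the "Load inputs" step.
REQUIRED_FIELDS = {"input", "expectedOutcome"}


def validate_test_cases(raw: str) -> list:
    """Parse JSON text and check it is an array of objects with the required fields."""
    cases = json.loads(raw)
    if not isinstance(cases, list):
        raise ValueError("test cases must be a JSON array")
    for i, case in enumerate(cases):
        if not isinstance(case, dict):
            raise ValueError(f"test case {i} is not an object")
        missing = REQUIRED_FIELDS - set(case)
        if missing:
            raise ValueError(f"test case {i} is missing: {sorted(missing)}")
    return cases


def plan_runs(cases, tiers=TIERS):
    """Return one planned run per (tier, test case) pair, grouped by tier.

    Assumption: a "blind" run omits expectedOutcome, so only the input is
    copied into the plan; the expectation is looked up later by case_index.
    """
    return [
        {"tier": tier, "case_index": i, "input": case["input"]}
        for tier, (i, case) in product(tiers, enumerate(cases))
    ]
```

With three tiers, a file of N valid test cases yields 3 × N planned runs.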

Repository

8090-inc/softwa…y-plugin

evaluate-skill — 8090-inc/software-factory-plugin