skill-forge-benchmark


Skill Benchmarking & Performance Tracking

Measure and compare skill performance across iterations with statistical rigor, using multiple trials, variance analysis, and trend tracking.
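As a rough illustration of what "multiple trials" and "variance analysis" can mean in practice, the sketch below summarizes per-trial scores and compares two skill iterations. The function names and the "difference exceeds the larger standard deviation" rule are assumptions made for illustration, not part of skill-forge itself.

```python
import statistics


def summarize_trials(scores):
    """Summarize the scores from repeated trials of one skill iteration."""
    n = len(scores)
    mean = statistics.mean(scores)
    # Sample variance needs at least two trials; report 0.0 for a single trial.
    variance = statistics.variance(scores) if n > 1 else 0.0
    return {"n": n, "mean": mean, "variance": variance, "stdev": variance ** 0.5}


def compare_iterations(baseline_scores, candidate_scores):
    """Compare two iterations by mean score (hypothetical decision rule)."""
    base = summarize_trials(baseline_scores)
    cand = summarize_trials(candidate_scores)
    delta = cand["mean"] - base["mean"]
    # Crude noise check: is the change larger than the noisier run's spread?
    noise = max(base["stdev"], cand["stdev"])
    return {"delta": delta, "exceeds_noise": abs(delta) > noise}
```

A single trial cannot separate a real improvement from run-to-run noise, which is why the summary keeps the variance alongside the mean.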

Process

Step 1: Define Benchmark Configuration

Accept the configuration in one of two forms:

  • Existing eval set: Path to evals/evals.json (from /skill-forge eval)
  • Benchmark config: Custom config with trial count and thresholds
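A minimal sketch of how the two input forms might be told apart when loading a file. The function name `load_benchmark_input` and the detection heuristic (treating a JSON object that carries an `eval_set_path` key as a benchmark config) are assumptions for illustration; the actual skill may decide differently.

```python
import json
from pathlib import Path


def load_benchmark_input(path):
    """Classify a JSON file as a benchmark config or a plain eval set."""
    data = json.loads(Path(path).read_text())
    # Assumed heuristic: a benchmark config points at its eval set
    # through an "eval_set_path" key.
    if isinstance(data, dict) and "eval_set_path" in data:
        return {
            "kind": "benchmark_config",
            "eval_set_path": data["eval_set_path"],
            "config": data,
        }
    # Otherwise treat the file itself as the eval set.
    return {"kind": "eval_set", "eval_set_path": str(path), "config": None}
```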

Benchmark config schema:

{
  "skill_name": "my-skill",
  "skill_path": "./my-skill",
  "eval_set_path": "./evals/evals.json",
First Seen: Apr 8, 2026