skill-forge-benchmark

Skill Benchmarking & Performance Tracking

Measure and compare skill performance across iterations with statistical rigor using multiple trials, variance analysis, and trend tracking.
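As a rough illustration of what multiple trials and variance analysis can look like, the sketch below summarizes per-trial scores for two iterations of a skill and reports the difference in means with a standard error. The function names (`summarize_trials`, `compare_iterations`) and the idea of one numeric score per trial are assumptions made for this example; they are not taken from skill-forge itself.

```python
import statistics


def summarize_trials(scores):
    """Return mean, sample standard deviation, and trial count for one iteration."""
    if len(scores) < 2:
        # A single trial gives no variance estimate.
        raise ValueError("need at least two trials to estimate variance")
    return {
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),  # sample (n - 1) standard deviation
        "n": len(scores),
    }


def compare_iterations(baseline, candidate):
    """Difference in mean score plus the standard error of that difference."""
    b = summarize_trials(baseline)
    c = summarize_trials(candidate)
    delta = c["mean"] - b["mean"]
    # Standard error of a difference between two independent sample means.
    std_error = (b["stdev"] ** 2 / b["n"] + c["stdev"] ** 2 / c["n"]) ** 0.5
    return {"delta": delta, "std_error": std_error}
```

A difference that is small relative to its standard error is more likely to be noise between runs than a real change, which is the reason to run several trials instead of one.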

Process

Step 1: Define Benchmark Configuration

Accept the benchmark configuration in one of two forms:

  • Existing eval set: Path to evals/evals.json (from /skill-forge eval)
  • Benchmark config: Custom config with trial count and thresholds
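The two input forms above could be normalized into one structure, as in this minimal sketch. The `BenchmarkConfig` class, the `resolve_config` helper, the field names (`evals_path`, `trials`, `pass_threshold`), and the default values are all hypothetical; the source only says that a custom config carries a trial count and thresholds, and does not specify a schema.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class BenchmarkConfig:
    evals_path: Path              # eval set to run, e.g. evals/evals.json
    trials: int = 5               # hypothetical default trial count
    pass_threshold: float = 0.8   # hypothetical pass-rate threshold


def resolve_config(source):
    """Accept either a path to an eval set or a dict of custom settings."""
    if isinstance(source, (str, Path)):
        # Form 1: an existing eval set, with default trial count and threshold.
        return BenchmarkConfig(evals_path=Path(source))
    if isinstance(source, dict):
        # Form 2: a custom config; missing keys fall back to the defaults.
        cfg = BenchmarkConfig(
            evals_path=Path(source.get("evals_path", "evals/evals.json")),
            trials=int(source.get("trials", 5)),
            pass_threshold=float(source.get("pass_threshold", 0.8)),
        )
        if cfg.trials < 1:
            raise ValueError("trials must be at least 1")
        if not 0.0 <= cfg.pass_threshold <= 1.0:
            raise ValueError("pass_threshold must be between 0 and 1")
        return cfg
    raise TypeError("expected a path or a dict")
```

For example, `resolve_config("evals/evals.json")` would use the defaults, while `resolve_config({"trials": 10, "pass_threshold": 0.9})` would override them.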
First Seen
Apr 8, 2026
skill-forge-benchmark — agricidaniel/skill-forge