Compare: same-epoch run comparison across trackers
The most common comparison error is reporting "run A is 4 percentage points behind baseline" when run A is at epoch 11 of 100 and the baseline number is from epoch 100. Run A is still training, so the comparison is meaningless. This skill enforces same-epoch alignment: every delta is computed at an epoch that both runs have reached.
The agentic Stop hook routes here from reason when an assistant reports a delta without aligning the runs.
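As a minimal sketch of what same-epoch alignment means, assume each run's history has already been pulled from the tracker into a plain dict mapping epoch to a validation metric. The function names `naive_delta` and `same_epoch_delta` and all numbers below are hypothetical and chosen only to mirror the example above; they are not part of any tracker's API.

```python
from typing import Dict, Optional, Tuple

History = Dict[int, float]  # epoch -> metric value (e.g. validation accuracy in %)


def naive_delta(run: History, baseline: History) -> float:
    """The mistake: latest value of each run, regardless of epoch."""
    return run[max(run)] - baseline[max(baseline)]


def same_epoch_delta(run: History, baseline: History) -> Optional[Tuple[int, float]]:
    """Return (epoch, run - baseline) at the latest epoch both runs have logged.

    Returns None when the two histories share no epoch, in which case no
    delta should be reported at all.
    """
    shared = run.keys() & baseline.keys()
    if not shared:
        return None
    epoch = max(shared)
    return epoch, run[epoch] - baseline[epoch]


# Illustrative numbers only: run A has reached epoch 11 of 100,
# while the baseline has finished all 100 epochs.
baseline = {1: 40.0, 11: 70.0, 100: 80.0}
run_a = {1: 41.0, 11: 76.0}

print(naive_delta(run_a, baseline))       # compares epoch 11 against epoch 100
print(same_epoch_delta(run_a, baseline))  # compares epoch 11 against epoch 11
```

With these made-up values the naive comparison shows run A 4 points behind, while the aligned comparison at epoch 11 shows it 6 points ahead. Which direction the aligned delta goes depends on the data; the point is only that the two numbers answer different questions.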
When to run
The user just said any of:
- "compare run A to baseline / to run B"
- "is my run improving / catching up / falling behind"
- "rank these experiments"
- "X vs Y wandb / neptune"
- "track lag against baseline"
Auto-detect the tracker
Check in this order: