eval-harness-updater
Eval Harness Updater
Refresh eval harnesses so that both live and fallback modes stay actionable in unstable environments.
Focus Areas
- Prompt and parser drift
- Timeout/partial-stream handling
- SLO and regression gates
- Dual-run fallback consistency
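To make the timeout/partial-stream and dual-run items concrete, here is a minimal sketch of a harness step that tries a live run first and switches to a fallback when the live run times out or its stream ends early. All names (`EvalResult`, `parse_stream`, `run_dual`) are hypothetical and not taken from any existing harness, and the assumption that the live run streams a single JSON object is for illustration only.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class EvalResult:
    mode: str                    # "live" or "fallback"
    payload: Optional[dict]      # parsed eval output
    error: Optional[str] = None  # why the live run was abandoned, if it was


def parse_stream(chunks: Iterable[str]) -> dict:
    """Join streamed chunks and parse them as one JSON document.

    Raises ValueError when the stream is truncated or malformed.
    """
    text = "".join(chunks)
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"partial or malformed stream: {exc}") from exc


def run_dual(live: Callable[[], Iterable[str]],
             fallback: Callable[[], dict]) -> EvalResult:
    """Try the live run; on timeout or a partial stream, use the fallback."""
    try:
        return EvalResult(mode="live", payload=parse_stream(live()))
    except (TimeoutError, ValueError) as exc:
        # Record the reason so the report shows why fallback mode was used.
        return EvalResult(mode="fallback", payload=fallback(), error=str(exc))
```

Recording `mode` and `error` on every result is one way to keep the two modes comparable: a report can then show how often the fallback was taken and why, instead of silently mixing live and fallback scores.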
Workflow
- Resolve harness path.
- Research test/eval best practices (Exa + arXiv — see Research Gate below).
- Add RED regressions for parsing and timeout edge cases.
- Patch minimal harness logic.
- Validate eval outputs and CI gates.
- Resolve companion artifact gaps (see Cross-Reference table below).
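The "validate eval outputs and CI gates" step could be sketched as a small check that compares the current run against a baseline. This is an illustration under stated assumptions, not the skill's actual gate: `RunMetrics`, `gate_failures`, and both default thresholds are hypothetical placeholders to be replaced with the project's own SLOs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunMetrics:
    pass_rate: float      # fraction of eval cases passed, 0.0 to 1.0
    p95_latency_s: float  # 95th-percentile case latency in seconds


def gate_failures(baseline: RunMetrics, current: RunMetrics,
                  max_pass_rate_drop: float = 0.02,  # assumed threshold
                  latency_slo_s: float = 30.0        # assumed SLO
                  ) -> List[str]:
    """Return human-readable gate failures; an empty list means the gate passes."""
    failures = []
    drop = baseline.pass_rate - current.pass_rate
    if drop > max_pass_rate_drop:
        failures.append(
            f"pass rate regressed by {drop:.3f} (allowed {max_pass_rate_drop:.3f})"
        )
    if current.p95_latency_s > latency_slo_s:
        failures.append(
            f"p95 latency {current.p95_latency_s:.1f}s exceeds SLO {latency_slo_s:.1f}s"
        )
    return failures
```

A CI job could call `gate_failures` after the eval run and exit non-zero when the list is not empty, printing each message so the failing gate is obvious from the log.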