eval-harness-updater

Installation
SKILL.md

Eval Harness Updater

Refresh eval harnesses to keep live + fallback modes actionable under unstable environments.

Focus Areas

  • Prompt and parser drift
  • Timeout/partial-stream handling
  • SLO and regression gates
  • Dual-run fallback consistency

Workflow

  1. Resolve harness path.
  2. Research test/eval best practices (Exa + arXiv — see Research Gate below).
  3. Add RED regressions for parsing and timeout edge cases.
  4. Patch minimal harness logic.
  5. Validate eval outputs and CI gates.
  6. Resolve companion artifact gaps (see Cross-Reference table below).
Related skills
Installs
44
GitHub Stars
27
First Seen
Feb 19, 2026