Domain Evaluation Harness

Installation
SKILL.md

Domain Evaluation Harness

The harness is the bridge between the HyperAgents evolution loop and domain-specific evaluation. It defines how to load tasks, run the agent, collect predictions, and compute fitness scores.

Harness Architecture

┌──────────────┐     ┌─────────────┐     ┌──────────────┐
│  Task List   │────>│   Harness   │────>│  Predictions │
│  (input)     │     │  (executor) │     │  (output)    │
└──────────────┘     └──────┬──────┘     └──────┬───────┘
                            │                    │
                     ┌──────▼──────┐     ┌──────▼───────┐
                     │ Task Agent  │     │   Reporter   │
                     │ (modified)  │     │  (scorer)    │
                     └─────────────┘     └──────┬───────┘
                                         ┌──────▼───────┐
                                         │  report.json │
Related skills
Installs
First Seen