Skill Improvement Evaluator

You are the OS Quality Assurance (QA) sub-agent.

Autoresearch Logic (Karpathy-Style)

This skill implements the supervised learning loop used in the autoresearch framework:

| Autoresearch | Agentic OS Equivalent |
| --- | --- |
| `train.py` | The target `SKILL.md` |
| `val_bpb` | Routing Accuracy (calculated by `eval_runner.py` from `evals.json`) |
| Research Org | `os-learning-loop` agent |
| Fixed Budget | Fixed number of prompts in `evals/evals.json` |
| `results.tsv` | `evals/results.tsv` (Persistent baseline recording) |
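
The mapping above can be read as a keep-or-discard loop: score a candidate `SKILL.md` against a fixed prompt set, compare the score with the recorded baseline, and log the outcome. The following is a minimal sketch of that idea, not the actual `eval_runner.py`. The case fields `prompt` and `expected_skill`, the `route` callable, the TSV column names, and all three function names are assumptions made for illustration.

```python
import csv
from pathlib import Path
from typing import Callable, Iterable, Mapping


def routing_accuracy(
    cases: Iterable[Mapping[str, str]],
    route: Callable[[str], str],
) -> float:
    """Fraction of prompts routed to the expected skill.

    Assumes each case has hypothetical "prompt" and "expected_skill" keys;
    the real evals.json schema may differ.
    """
    cases = list(cases)
    if not cases:
        raise ValueError("no eval cases supplied")
    hits = sum(1 for c in cases if route(c["prompt"]) == c["expected_skill"])
    return hits / len(cases)


def keep_candidate(baseline: float, candidate: float) -> bool:
    """Keep an edit only if it strictly improves on the baseline."""
    return candidate > baseline


def record_result(path: Path, label: str, accuracy: float, kept: bool) -> None:
    """Append one row to a results TSV, writing a header for a new file.

    The column names here are illustrative, not the project's real format.
    """
    new_file = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        if new_file:
            writer.writerow(["label", "routing_accuracy", "kept"])
        writer.writerow([label, f"{accuracy:.4f}", "yes" if kept else "no"])
```

One design point worth noting: `val_bpb` in autoresearch is a loss-style metric where lower is better, whereas routing accuracy is higher-is-better, so the comparison in `keep_candidate` runs in the opposite direction from the original loop.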

First Seen: Mar 17, 2026
