autoresearch

Installation
SKILL.md

Autoresearch for Skills

Most skills work about 70% of the time. The other 30% you get garbage. The fix isn't to rewrite the skill from scratch. It's to let an agent run it dozens of times, score every output, and tighten the prompt until that 30% disappears.

This skill adapts Andrej Karpathy's autoresearch methodology (autonomous experimentation loops) to Claude Code skills. Instead of optimizing ML training code, we optimize skill prompts.


the core job

Take any existing skill, define what "good output" looks like as binary yes/no checks, then run an autonomous loop that:

  1. Generates outputs from the skill using test inputs
  2. Scores every output against the eval criteria
  3. Mutates the skill prompt to fix failures
  4. Keeps mutations that improve the score, discards the rest
  5. Repeats until the score ceiling is hit or the user stops it

Output: An improved SKILL.md + results.tsv log + changelog.md of every mutation attempted + a live HTML dashboard you can watch in your browser.

Related skills
Installs
1
Repository
aresbit/matebot
GitHub Stars
45
First Seen
Apr 14, 2026