Autoresearch: Autonomous Iterative Experimentation

An autonomous experimentation loop for any programming task. You define the goal and how to measure it; the agent then iterates on its own -- modifying code, running experiments, measuring results, and keeping or discarding changes -- until interrupted.
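As a rough illustration of the keep-or-discard cycle described above, here is a minimal Python sketch. The names `Goal` and `run_experiments`, and the idea of passing each experiment as an `(apply, undo)` pair of callables, are assumptions made for this example and are not part of the skill itself.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Goal:
    """Hypothetical description of what is being optimized."""
    name: str
    lower_is_better: bool = True

    def improved(self, best: float, candidate: float) -> bool:
        # Strict comparison: a tie does not count as an improvement.
        if self.lower_is_better:
            return candidate < best
        return candidate > best


Experiment = Tuple[Callable[[], None], Callable[[], None]]  # (apply, undo)


def run_experiments(
    goal: Goal,
    measure: Callable[[], float],
    experiments: Iterable[Experiment],
) -> Tuple[float, List[Tuple[str, float, str]]]:
    """Measure a baseline, then keep or discard each experiment in turn."""
    best = measure()  # baseline, taken before any change is made
    history = [("baseline", best, "keep")]
    for i, (apply, undo) in enumerate(experiments, start=1):
        apply()
        value = measure()  # every experiment is measured
        if goal.improved(best, value):
            best = value
            history.append((f"exp-{i}", value, "keep"))
        else:
            undo()  # discard the change and return to the last good state
            history.append((f"exp-{i}", value, "discard"))
    return best, history
```

In this sketch the loop ends when `experiments` is exhausted; passing a generator that never ends would approximate "until interrupted", with the interrupt arriving as a `KeyboardInterrupt`.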

This skill is inspired by Karpathy's autoresearch, generalized from ML training to any programming task with a measurable outcome.

Agent Behavior Rules

DO guide the user through the Setup phase interactively before starting the loop.
DO establish a baseline measurement before making any changes.
DO commit every experiment attempt before running it (so it can be reverted cleanly).
DO keep a results log (TSV) tracking every experiment.
DO revert changes that do not improve the metric (git reset to the last known good commit).
DO run autonomously once the loop starts -- never pause to ask "should I continue?".
DO NOT modify files the user marked as out-of-scope.
DO NOT skip the measurement step -- every experiment must be measured.
DO NOT keep changes that regress the metric unless the user explicitly allowed trade-offs.
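The commit, log, and revert rules above could be supported by a few small helpers like the ones below. This is only a sketch: the function names, the `experiment:` commit prefix, and the TSV column layout are assumptions chosen for illustration, and the skill does not prescribe them. The git commands themselves (`add -A`, `commit -m`, `rev-parse --short HEAD`, `reset --hard`) are standard.

```python
import csv
import subprocess
from pathlib import Path

# Hypothetical column layout for the results log.
FIELDS = ["commit", "metric", "status", "description"]


def git(*args: str, cwd: str = ".") -> str:
    """Run a git command and return its trimmed stdout; raises on failure."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def commit_attempt(description: str, cwd: str = ".") -> str:
    """Commit the working tree before running the experiment."""
    git("add", "-A", cwd=cwd)
    git("commit", "-m", f"experiment: {description}", cwd=cwd)
    return git("rev-parse", "--short", "HEAD", cwd=cwd)


def revert_to(commit: str, cwd: str = ".") -> None:
    """Discard a failed experiment by resetting to the last known good commit."""
    git("reset", "--hard", commit, cwd=cwd)


def log_result(path: str, commit: str, metric: float, status: str,
               description: str) -> None:
    """Append one row to the TSV log, writing the header on first use."""
    log = Path(path)
    is_new = not log.exists()
    with log.open("a", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        if is_new:
            writer.writerow(FIELDS)
        writer.writerow([commit, f"{metric:.6g}", status, description])
```

One design point worth noting: `git reset --hard` restores tracked files, so a results log that is committed with the experiments would lose its rows on every revert. This sketch assumes the log is kept out of version control (for example via `.gitignore`), which also stops `git add -A` from staging it.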
