arbor


Arbor — Autonomous Optimization via Hypothesis Tree Refinement

Overview

This skill runs an Autonomous Optimization (AO) loop: starting from an existing artifact and a measurable objective, improve it through many rounds of experiment and evaluation — without step-by-step human supervision and without overfitting to the feedback signal. It's the right tool when the bottleneck isn't writing one good change, but organizing dozens of trials so that lessons accumulate instead of evaporating.

It implements Hypothesis Tree Refinement (HTR) from Arbor (Jin et al., 2026). The key idea: keep the research state in a persistent hypothesis tree rather than in conversation history. Each node binds a hypothesis, the distilled insight it produced, and a pointer to the artifact version that realizes it. You play the long-lived coordinator that owns this tree and decides where to search; short-lived executor subagents test one hypothesis each in isolated git worktrees and report back. A held-out merge gate admits a change only when it improves on a test evaluator the search never optimized against. This is what turns trial-and-error into cumulative, auditable research.
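
To make the node structure and the merge gate concrete, here is a minimal sketch in Python. It is not the code in scripts/tree.py: the names HypothesisNode, passes_merge_gate, dev_score, min_gain, and higher_is_better are hypothetical, chosen only to illustrate the description above. It assumes each evaluator returns a single scalar score.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HypothesisNode:
    """One tree node: a hypothesis, its distilled insight, and an artifact pointer.

    Hypothetical layout for illustration; the real state schema may differ.
    """
    node_id: str
    hypothesis: str
    parent_id: Optional[str] = None
    insight: Optional[str] = None       # filled in after the executor reports back
    artifact_ref: Optional[str] = None  # e.g. a git branch or commit for this version
    dev_score: Optional[float] = None   # score on the evaluator the search optimizes
    children: List[str] = field(default_factory=list)


def passes_merge_gate(candidate_test_score: float,
                      baseline_test_score: float,
                      min_gain: float = 0.0,
                      higher_is_better: bool = True) -> bool:
    """Admit a change only if it beats the baseline on the held-out test evaluator."""
    gain = candidate_test_score - baseline_test_score
    if not higher_is_better:
        gain = -gain  # for loss-like metrics, a drop counts as a gain
    return gain > min_gain


# The coordinator owns the tree; here it is just a dict keyed by node id.
tree: Dict[str, HypothesisNode] = {}

root = HypothesisNode("n0", "baseline artifact", artifact_ref="main")
tree[root.node_id] = root

child = HypothesisNode("n1", "longer warmup reaches target loss sooner", parent_id="n0")
tree[child.node_id] = child
root.children.append(child.node_id)

# An executor tested n1 in its own worktree and reported back (values are made up).
child.artifact_ref = "exp/n1"
child.dev_score = 0.74
child.insight = "warmup helps only at the larger batch size"

# The gate compares held-out test scores, never the dev score used during search.
merged = passes_merge_gate(candidate_test_score=0.71, baseline_test_score=0.70)
```

Keeping dev_score on the node while passing test scores to the gate separately mirrors the split described above: the search may optimize freely against one evaluator, and a change is admitted only on evidence from the other.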

Use the scripts/tree.py state manager for all the bookkeeping: creating nodes, writing evidence, propagating insights, pruning, running the merge gate, and producing the Observe projection. It keeps the state consistent and frees you to spend your judgment on what the evidence means.

When to use this skill

Reach for Arbor when the task is iterative improvement of a concrete artifact under an evaluator:

  • Model training: optimizer/architecture/recipe changes to lower loss or hit a target in fewer steps.
  • Harness/agent engineering: raising pass rate or accuracy of an agent loop, search harness, or tool-use scaffold.
  • Data synthesis: improving a generation/filtering pipeline judged by downstream model behavior.
  • Benchmark optimization: MLE-bench / Kaggle-style "improve the submission" tasks.
  • Prompt/system optimization where you can score outputs automatically.
arbor — k-dense-ai/scientific-agent-skills