autoresearch-fleet
Autoresearch Fleet
An autonomous research loop inspired by karpathy/autoresearch. It is built from four pieces: one mutable file, one immutable eval harness, git as the state machine, and a "NEVER STOP" directive. The agent modifies the code, evaluates the result, keeps improvements, discards regressions, and repeats indefinitely.
Open-world extension: when the agent plateaus (N consecutive discards), the orchestrator injects a web-search prompt so the agent can get past knowledge ceilings the LLM can't cross alone.
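One way to picture the plateau check is a counter of consecutive discards that resets whenever an experiment is kept. The sketch below is illustrative only: the class name `PlateauDetector`, the `patience` parameter (playing the role of N), and the choice to reset the counter after a trigger are all assumptions, not details taken from the skill.

```python
class PlateauDetector:
    """Signals a plateau after `patience` consecutive discarded experiments."""

    def __init__(self, patience: int) -> None:
        if patience < 1:
            raise ValueError("patience must be at least 1")
        self.patience = patience          # hypothetical name for N
        self.consecutive_discards = 0

    def record(self, kept: bool) -> bool:
        """Record one experiment outcome.

        Returns True when the orchestrator should inject a web-search prompt.
        """
        if kept:
            # Any improvement breaks the streak.
            self.consecutive_discards = 0
            return False
        self.consecutive_discards += 1
        if self.consecutive_discards >= self.patience:
            # Assumption: start counting afresh once the prompt is injected.
            self.consecutive_discards = 0
            return True
        return False


# Usage: feed outcomes in order; True marks the step where a search is injected.
detector = PlateauDetector(patience=3)
outcomes = [False, False, True, False, False, False, False]
signals = [detector.record(kept) for kept in outcomes]
```

Resetting after a trigger means the agent gets another full N attempts with the new information before the next search is injected; a design that kept firing on every further discard would also be defensible.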
When to use
- You are optimizing a single metric (latency, accuracy, loss, or score)
- The problem has a fast, deterministic eval harness
- You want autonomous overnight runs (100+ experiments while you sleep)
- The search space is too large for manual exploration
How it works