experiment-queue

Experiment Queue

Orchestrate large batches of ML experiments on remote GPU servers over SSH, with state tracking, OOM retry, stale cleanup, and wave transitions.
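As a rough illustration of what state tracking with OOM retry can look like, here is a minimal sketch. It is not the skill's actual implementation: the `Job` and `State` types, the `record_exit` helper, and the `max_attempts` limit are all hypothetical names chosen for this example, and OOM detection by matching "out of memory" in the log tail is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass
class Job:
    """Hypothetical queue entry; the skill's real schema may differ."""
    name: str
    state: State = State.PENDING
    attempts: int = 0


def record_exit(job: Job, exit_code: int, log_tail: str, max_attempts: int = 3) -> Job:
    """Update a job's state after its process exits."""
    job.attempts += 1
    if exit_code == 0:
        job.state = State.DONE
    elif "out of memory" in log_tail.lower() and job.attempts < max_attempts:
        # Assumption: an OOM is transient, so the job goes back to the queue
        # to be retried later or on a different GPU.
        job.state = State.PENDING
    else:
        # Any other failure, or an OOM after the last allowed attempt, is final.
        job.state = State.FAILED
    return job
```

Requeuing only on OOM keeps genuine bugs from being retried indefinitely, while the attempt cap bounds how long a config that never fits in memory can occupy the queue.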

When to Use This Skill

Use when /run-experiment is insufficient:

  • ≥10 jobs that need batching across GPUs
  • Multi-seed sweeps (e.g., 21 seeds × 12 cells)
  • Wave transitions (run wave 1, wait, run wave 2, wait, run wave 3...)
  • Teacher+student chains (train a teacher, then distill; the student run is triggered automatically once the teacher finishes)
  • OOM-prone configs that need a retry on a different GPU or after a wait
  • Mixed seed grids where failed cells need re-running
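To give a sense of the scale involved, the sketch below expands the 21 seeds × 12 cells example into individual jobs and groups them into batches. It is only an illustration under stated assumptions: the helper names `build_jobs` and `split_into_waves`, the cell labels, and the figure of 8 GPUs with one job per GPU are all hypothetical, and "wave" here simply means a batch sized to the GPU count.

```python
from itertools import product


def build_jobs(cells, seeds):
    """Expand a cell x seed grid into a flat list of (cell, seed) jobs."""
    return list(product(cells, seeds))


def split_into_waves(jobs, num_gpus):
    """Group jobs into consecutive batches of at most num_gpus jobs each."""
    return [jobs[i:i + num_gpus] for i in range(0, len(jobs), num_gpus)]


cells = [f"cell{i:02d}" for i in range(12)]  # hypothetical cell labels
seeds = list(range(21))

jobs = build_jobs(cells, seeds)
waves = split_into_waves(jobs, num_gpus=8)  # assumed: 8 GPUs, one job per GPU

print(len(jobs), len(waves), len(waves[-1]))  # prints: 252 32 4
```

With 252 jobs and 8 GPUs, the grid needs 32 batches, the last of which is only half full, which is the kind of bookkeeping that becomes tedious to manage by hand.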

Do NOT use for:

  • Single ad-hoc experiment (use /run-experiment)
  • Modal/Vast.ai deployments (those have their own orchestration)
  • Experiments that need manual inspection between runs

More from wanshuiyin/auto-claude-code-research-in-sleep

Installs
56
GitHub Stars
9.2K
First Seen
Apr 16, 2026