ML Experimentation

This skill guides a hypothesis-driven ML experiment life cycle: planning, fast iteration, script execution, targeted logging, journaling, diagnostic visualization, and scientific report writing.

Usage

Use this skill when the user wants to run an ML experiment, test a model or idea, or write up experiment results. First decide: new experiment (different question → new experiment directory) or new run (same question, tweaks → new run under runs/). See references/experiment-setup.md for that disambiguation, hypothesis scoping, and the fast-iteration checklist.
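The experiment-versus-run decision above could be sketched as two small helpers. This is only an illustration under stated assumptions: the function names `new_experiment` and `new_run`, the zero-padded run numbering, and the example experiment name are hypothetical; only the `runs/` subdirectory comes from the text, and references/experiment-setup.md remains the authority on the actual layout.

```python
from pathlib import Path


def new_experiment(root: Path, name: str) -> Path:
    """Different question: create a fresh experiment directory with an empty runs/."""
    exp_dir = root / name
    exp_dir.mkdir(parents=True)  # raises FileExistsError if the experiment already exists
    (exp_dir / "runs").mkdir()
    return exp_dir


def new_run(exp_dir: Path) -> Path:
    """Same question, tweaked settings: add the next numbered run under runs/."""
    runs = exp_dir / "runs"
    runs.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: runs are numbered 001, 002, ...
    taken = [int(p.name) for p in runs.iterdir() if p.is_dir() and p.name.isdigit()]
    run_dir = runs / f"{max(taken, default=0) + 1:03d}"
    run_dir.mkdir()
    return run_dir
```

For example, `new_run(new_experiment(Path("experiments"), "lr-warmup"))` would create `experiments/lr-warmup/runs/001`, and a later tweak to the same question would call `new_run` again on the same experiment directory.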

Requirements

Python 3.11+ with uv or pixi for running scripts: uv run script.py or, when pixi is the environment manager, pixi run python script.py (pixi reads pyproject.toml or pixi.toml).
Dependencies declared via PEP 723 inline script metadata in each script (or, with pixi, in pyproject.toml / pixi.toml).
Respect the user's training framework (PyTorch, JAX, TensorFlow, etc.). Run scripts in a GPU-enabled environment wherever possible: with uv use GPU-enabled deps (e.g. JAX GPU extras, PyTorch via [[tool.uv.index]] CUDA index in the script block); with pixi use a GPU-enabled environment defined in pyproject.toml or pixi.toml. Fall back to CPU only when GPU is unavailable. See references/script-patterns.md.
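As a rough illustration of the requirements above, a script might carry its PEP 723 metadata block at the top and choose its device at run time. This sketch assumes PyTorch is the user's framework; the file name `train.py` and the helper `pick_device` are hypothetical, and the CUDA index configuration is deliberately left out (see references/script-patterns.md for that).

```python
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "torch",
# ]
# ///
"""Hypothetical train.py; run with `uv run train.py`."""
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise fall back to "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # Assumption: torch may be absent when the script is run outside uv/pixi.
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"device: {pick_device()}")
```

With uv, the metadata block lets `uv run train.py` resolve dependencies without a separate project file; with pixi, the same dependencies would instead live in pyproject.toml or pixi.toml.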