ml-experiment-tracker

ML Experiment Tracker

A skill for planning, executing, and tracking machine learning experiments with full reproducibility. Covers experiment design, hyperparameter management, metric logging, model versioning, and comparison across runs to support rigorous ML research.

Overview

Machine learning research involves running dozens or hundreds of experiments with varying architectures, hyperparameters, data splits, and preprocessing pipelines. Without systematic tracking, it becomes impossible to reproduce results, compare configurations, or identify which changes actually improved performance. This skill provides a structured methodology for experiment management that aligns with academic standards for reproducible ML research.

The approach is framework-agnostic but demonstrates integration with MLflow, Weights & Biases, and plain file-based logging. It emphasizes the practices needed for publications: complete hyperparameter documentation, statistical significance testing across runs, and artifact management for model checkpoints and evaluation outputs.
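As a sketch of the plain file-based option, the snippet below logs one run's hyperparameters and metrics to a JSON file. The function name, directory layout, and hash-derived run ID are illustrative choices, not part of the skill itself:

```python
import hashlib
import json
import time
from pathlib import Path

def log_run(run_dir: Path, params: dict, metrics: dict) -> str:
    """Write one experiment run's params and metrics to a JSON file.

    The run ID is derived from a hash of the hyperparameters so that
    identical configurations map to the same ID across re-runs.
    """
    run_id = hashlib.sha256(
        json.dumps(params, sort_keys=True).encode()
    ).hexdigest()[:12]
    run_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "run_id": run_id,
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
    }
    (run_dir / f"{run_id}.json").write_text(json.dumps(record, indent=2))
    return run_id
```

Because the hash is computed over the sorted parameter dict, two runs with the same configuration get the same run ID regardless of key order, which makes accidental duplicate configurations easy to spot when comparing runs.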

Experiment Design Framework

Defining an Experiment Plan

Before writing any training code, document the experiment plan:

# experiment_plan.yaml (field names below are illustrative)
experiment:
  name: lr_sweep_baseline
  hypothesis: "Reducing the learning rate improves final validation accuracy"
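Training code can load and validate the plan before a run starts, failing fast if a required section is missing. A minimal sketch, assuming the plan has already been parsed into a dict (e.g. with `yaml.safe_load`); the required field names here are assumptions for illustration:

```python
def validate_plan(plan: dict) -> dict:
    """Check that a parsed experiment plan has the required sections.

    `plan` is the dict produced by parsing experiment_plan.yaml.
    Returns the 'experiment' mapping, or raises ValueError if a
    required field is absent.
    """
    exp = plan.get("experiment")
    if not isinstance(exp, dict):
        raise ValueError("plan must contain a top-level 'experiment' mapping")
    for field in ("name", "hypothesis"):  # illustrative required fields
        if field not in exp:
            raise ValueError(f"experiment plan missing required field: {field}")
    return exp
```

Validating up front keeps malformed plans from producing half-logged runs that are later hard to compare.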