ml-experiment-tracker
ML Experiment Tracker
A skill for planning, executing, and tracking machine learning experiments with full reproducibility. Covers experiment design, hyperparameter management, metric logging, model versioning, and comparison across runs to support rigorous ML research.
Overview
Machine learning research involves running dozens or hundreds of experiments with varying architectures, hyperparameters, data splits, and preprocessing pipelines. Without systematic tracking, it becomes impossible to reproduce results, compare configurations, or identify which changes actually improved performance. This skill provides a structured methodology for experiment management that aligns with academic standards for reproducible ML research.
The approach is framework-agnostic but demonstrates integration with MLflow, Weights & Biases, and plain file-based logging. It emphasizes the practices needed for publications: complete hyperparameter documentation, statistical significance testing across runs, and artifact management for model checkpoints and evaluation outputs.
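As a minimal illustration of the plain file-based option mentioned above, one run's hyperparameters and final metrics can be appended as a JSON-lines record. This is a hedged sketch: the `log_run` helper, the `runs.jsonl` filename, and the directory layout are illustrative choices, not part of MLflow, W&B, or any fixed convention.

```python
import json
import time
from pathlib import Path

def log_run(run_dir, params, metrics):
    """Append one run's hyperparameters and final metrics to runs.jsonl.

    Hypothetical helper for plain file-based tracking; the file name and
    record schema here are illustrative assumptions.
    """
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "params": params,
        "metrics": metrics,
    }
    # One JSON object per line keeps the log append-only and easy to scan
    # when comparing runs later.
    with open(run_dir / "runs.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

Because each run is a single line, comparing configurations later is a simple scan over the file rather than a database query.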
Experiment Design Framework
Defining an Experiment Plan
Before writing any training code, document the experiment plan:
# experiment_plan.yaml
experiment:
  name: lr-ablation            # short, unique identifier for the run group
  hypothesis: "cosine decay improves final accuracy over step decay"
  seed: 42                     # fixed for reproducibility
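A plan file only helps if the training script actually enforces it. The sketch below validates a loaded plan before launch; the required-field list and the `validate_plan` helper are assumptions for illustration, not a fixed schema.

```python
# Illustrative required fields; adjust to whatever your plan schema uses.
REQUIRED_FIELDS = ["name", "hypothesis", "seed", "hyperparameters"]

def validate_plan(plan):
    """Fail fast if the experiment plan is missing fields the run will need."""
    exp = plan.get("experiment", {})
    missing = [f for f in REQUIRED_FIELDS if f not in exp]
    if missing:
        raise ValueError(f"experiment plan missing fields: {missing}")
    return exp

# A plan as it would look after parsing experiment_plan.yaml
# (e.g. with yaml.safe_load); values here are placeholders.
plan = {
    "experiment": {
        "name": "lr-ablation",
        "hypothesis": "cosine decay improves final accuracy over step decay",
        "seed": 42,
        "hyperparameters": {"lr": 3e-4},
    }
}
exp = validate_plan(plan)
```

Validating up front means a typo in the plan surfaces before GPU time is spent, not after.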