MLflow Tracking

MLflow gives you experiment tracking, a model registry, and (since 2.14) first-class LLM observability, all from one Python library and UI. Unlike DVC, it requires a tracking backend (file, SQLite, or server), but in return you get a real dashboard and multi-user collaboration.
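As a rough illustration of the three backend kinds named above, the sketch below builds the URI string each one would typically use and exports it through MLFLOW_TRACKING_URI, which MLflow reads at runtime. The helper name tracking_uri, the mlruns directory, the mlflow.db filename, and the localhost:5000 address are all assumptions made for this example, not part of this skill.

```python
import os
from pathlib import Path


def tracking_uri(backend: str, root: str = ".", host: str = "localhost", port: int = 5000) -> str:
    """Return a tracking URI string for one backend kind (hypothetical helper)."""
    base = Path(root).resolve()
    if backend == "file":
        # Local directory store, e.g. file:///abs/path/mlruns
        return (base / "mlruns").as_uri()
    if backend == "sqlite":
        # SQLAlchemy-style URI: "sqlite:///" followed by the absolute path
        return f"sqlite:///{(base / 'mlflow.db').as_posix()}"
    if backend == "server":
        # Remote tracking server reached over HTTP (assumed host and port)
        return f"http://{host}:{port}"
    raise ValueError(f"unknown backend: {backend!r}")


# Export the choice so later MLflow calls in this process pick it up.
os.environ["MLFLOW_TRACKING_URI"] = tracking_uri("sqlite")
print(os.environ["MLFLOW_TRACKING_URI"])
```

Setting the environment variable rather than calling mlflow.set_tracking_uri keeps the training script itself unchanged when you move between backends.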

This skill is opinionated about the three deployment modes that actually get used in practice, with a vendored production stack you can copy into any project. It defers to the official docs for everything else.

When to use

User wants to track ML experiments (params, metrics, artifacts) with a UI
User mentions mlflow.start_run, mlflow.log_metric, mlflow.set_tracking_uri, MLFLOW_TRACKING_URI, mlflow ui
User wants framework autologging (sklearn / PyTorch / Lightning / XGBoost / LightGBM / Keras / TensorFlow / Transformers / Spark)
User wants LLM trace observability (OpenAI, Anthropic, LangChain, LlamaIndex, DSPy, AutoGen, CrewAI, etc.)
User wants to spin up a self-hosted tracking server with PostgreSQL + MinIO (production)
User wants a model registry with aliases (Champion / Challenger / Production)
User asks "how do I compare runs", "where do my logged params go", "how do I serve a logged model"

When NOT to use