engineering-ml-engineer

Machine Learning Engineering Guide

Overview

This guide covers end-to-end machine learning engineering with deep learning (PyTorch, HuggingFace Transformers) and classical ML (scikit-learn, XGBoost). Use it when building, training, evaluating, and deploying ML models across NLP, vision, and tabular domains.

First 10 Minutes

  • Identify the task type first: classification, regression, ranking, generation, retrieval, or multimodal. If the task type is fuzzy, the evaluation plan will be wrong.
  • Inspect the dataset shape and leakage risk before choosing a model. Run scripts/analyze_dataset.py first, then document label balance, missing values, and leakage candidates.
  • Define the baseline and acceptance metric before training. If there is no baseline, create one first.
  • If the request involves RAG, separate retrieval evaluation from answer evaluation from the start.
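
The dataset checks and the baseline step above can be sketched in a few lines. This is not the contents of scripts/analyze_dataset.py, which this guide does not show; it is a hypothetical stand-in, assuming the data fits in memory as a list of dicts. The names profile_dataset and majority_baseline_accuracy, the sample columns, and the leakage heuristic are all assumptions made for illustration.

```python
from collections import Counter


def profile_dataset(rows, label_key):
    """Summarize label balance, missing values, and leakage candidates.

    rows is a list of dicts, one per example (a hypothetical in-memory format).
    """
    n = len(rows)
    label_balance = {k: v / n for k, v in Counter(r[label_key] for r in rows).items()}

    features = sorted({k for r in rows for k in r if k != label_key})
    missing = {f: sum(1 for r in rows if r.get(f) is None) / n for f in features}

    # Heuristic (an assumption, not a proof of leakage): flag a feature when
    # each of its values maps to exactly one label and it takes more than one
    # value. Unique IDs are flagged too, so treat this as a manual review list.
    leakage_candidates = []
    for f in features:
        value_to_label = {}
        pure = True
        for r in rows:
            v = r.get(f)
            if v is None:
                continue
            if value_to_label.setdefault(v, r[label_key]) != r[label_key]:
                pure = False
                break
        if pure and len(value_to_label) > 1:
            leakage_candidates.append(f)

    return {
        "label_balance": label_balance,
        "missing": missing,
        "leakage_candidates": leakage_candidates,
    }


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return sum(1 for y in test_labels if y == majority) / len(test_labels)


# Made-up sample rows, for illustration only.
rows = [
    {"age": 34, "refund_issued": 1, "label": "churn"},
    {"age": 51, "refund_issued": 0, "label": "stay"},
    {"age": 34, "refund_issued": 0, "label": "stay"},
    {"age": None, "refund_issued": 1, "label": "churn"},
]
report = profile_dataset(rows, "label")
baseline = majority_baseline_accuracy(
    ["stay", "stay", "churn"], ["stay", "churn", "stay", "stay"]
)
```

Any trained model then has to beat the majority-class accuracy to count as an improvement, and any flagged feature should be checked for availability at prediction time before it is used.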
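
For the RAG point above, one way to keep the two evaluations apart is to report a retrieval metric and an answer metric as separate numbers. The sketch below assumes recall@k for retrieval and case-insensitive exact match for answers; both metric choices, the function names, and the example records are assumptions made for illustration, not something this guide prescribes.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents found in the top-k retrieved list."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("each example needs at least one relevant document")
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)


def exact_match(prediction, reference):
    """Case-insensitive, whitespace-trimmed string equality."""
    return prediction.strip().lower() == reference.strip().lower()


def evaluate_rag(examples, k=3):
    """Score retrieval and answers independently so failures can be localized."""
    n = len(examples)
    retrieval = sum(recall_at_k(e["retrieved"], e["relevant"], k) for e in examples) / n
    answer = sum(exact_match(e["answer"], e["reference"]) for e in examples) / n
    return {"recall_at_k": retrieval, "answer_exact_match": answer}


# Made-up evaluation records, for illustration only.
examples = [
    {"retrieved": ["d1", "d7", "d3"], "relevant": ["d1"],
     "answer": "Paris", "reference": "paris"},
    {"retrieved": ["d4", "d5", "d6"], "relevant": ["d9"],
     "answer": "1889", "reference": "1889"},
]
scores = evaluate_rag(examples, k=3)
```

In this toy data the second answer is correct even though retrieval missed the relevant document. A single end-to-end score would hide that mismatch.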

Refuse or Escalate

  • Refuse requests to fine-tune when there is no labeled data, no evaluation set, or no baseline to beat.
  • Escalate if the task is high-stakes and the user cannot provide evaluation criteria, data provenance, or rollback behavior for a bad model.
  • Do not recommend a larger model by default when the failure is clearly dataset quality, leakage, or retrieval mismatch.
  • Escalate before production rollout if the team cannot monitor latency, output drift, and failure rate after deployment.
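
The refuse and escalate rules for fine-tuning can be written as an explicit pre-flight check. The following is a minimal sketch under stated assumptions: ProjectState, its field names, and fine_tune_decision are hypothetical, and the booleans are assumed to come from a manual review rather than from any automated detection.

```python
from dataclasses import dataclass


@dataclass
class ProjectState:
    # All fields are hypothetical flags filled in by a human reviewer.
    has_labeled_data: bool
    has_eval_set: bool
    has_baseline: bool
    high_stakes: bool = False
    has_eval_criteria: bool = True
    has_data_provenance: bool = True
    has_rollback_plan: bool = True


def fine_tune_decision(state):
    """Return (decision, reasons) for a fine-tuning request."""
    # Refuse when any prerequisite for fine-tuning is missing.
    prerequisites = [
        ("labeled data", state.has_labeled_data),
        ("evaluation set", state.has_eval_set),
        ("baseline", state.has_baseline),
    ]
    missing = [name for name, ok in prerequisites if not ok]
    if missing:
        return "refuse", missing

    # Escalate high-stakes work that lacks criteria, provenance, or rollback.
    if state.high_stakes:
        safeguards = [
            ("evaluation criteria", state.has_eval_criteria),
            ("data provenance", state.has_data_provenance),
            ("rollback plan", state.has_rollback_plan),
        ]
        gaps = [name for name, ok in safeguards if not ok]
        if gaps:
            return "escalate", gaps

    return "proceed", []


decision = fine_tune_decision(
    ProjectState(has_labeled_data=True, has_eval_set=True, has_baseline=False)
)
```

Returning the list of missing items alongside the decision lets the refusal or escalation message state exactly what the requester needs to supply.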

More from peterhdd/agent-skills

First Seen
Mar 4, 2026