ai-ml-data-science


Data Science Engineering Suite - Quick Reference

This skill turns raw data and questions into validated, documented models ready for production:

  • EDA workflows: Structured exploration with drift detection
  • Feature engineering: Reproducible feature pipelines with leakage prevention and train/serve parity
  • Model selection: Baselines first; strong tabular defaults; escalate complexity only when justified
  • Evaluation & reporting: Slice analysis, uncertainty, model cards, production metrics
  • SQL transformation: SQLMesh for staging/intermediate/marts layers
  • MLOps: CI/CD, CT (continuous training), CM (continuous monitoring)
  • Production patterns: Data contracts, lineage, feedback loops, streaming features
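As a minimal sketch of the leakage-prevention and train/serve parity idea in the feature engineering bullet: scaling parameters are learned from the training split only, then the same fitted object is reused for holdout and serving data. The `Standardizer` class and the sample values are hypothetical, chosen for illustration; a real pipeline would more likely use a library transformer, but the principle is the same.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Standardizer:
    """Hypothetical helper: scaling parameters learned from training data only."""
    mu: float
    sigma: float

    @classmethod
    def fit(cls, train_values):
        sigma = pstdev(train_values)
        # Guard against a constant feature, which would cause division by zero.
        return cls(mu=mean(train_values), sigma=sigma if sigma > 0 else 1.0)

    def transform(self, values):
        # The same parameters are applied everywhere: train, holdout, serving.
        return [(v - self.mu) / self.sigma for v in values]


# Illustrative data (assumed, not from the skill itself).
train = [10.0, 12.0, 14.0, 16.0]
holdout = [18.0, 20.0]

# Fit on the training split only, so holdout statistics never leak in.
scaler = Standardizer.fit(train)

train_scaled = scaler.transform(train)
holdout_scaled = scaler.transform(holdout)

# At serving time, reuse the fitted object instead of re-fitting on live data.
serving_scaled = scaler.transform([13.0])
```

Fitting on `train + holdout` would let holdout statistics shape the features and inflate evaluation results; re-fitting at serving time would break parity with what the model saw during training.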

Modern emphasis (2026): feature stores, automated retraining, drift monitoring (Evidently), train/serve parity, and agentic ML loops (plan -> execute -> evaluate -> improve). Tools: LightGBM, CatBoost, scikit-learn, PyTorch, Polars (lazy evaluation for larger-than-RAM datasets), and lakeFS for data versioning.
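To make the drift-monitoring idea concrete, here is a rough sketch of one common drift statistic, the Population Stability Index (PSI), written with the standard library only. This is not Evidently's API and not necessarily what the skill uses; the function name `psi`, the bin count, and the sample data are assumptions for illustration.

```python
import math


def psi(reference, current, n_bins=4, eps=1e-6):
    """Population Stability Index between a reference and a current sample.

    Bins are equal-width over the reference range; values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if the reference sample is constant.
    width = (hi - lo) / n_bins or 1.0

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor each fraction at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Illustrative data (assumed): a feature at training time and a shifted copy.
reference = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in reference]

no_drift_score = psi(reference, reference)
drift_score = psi(reference, shifted)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, though thresholds should be tuned per feature. In an automated retraining setup, a score like `drift_score` could serve as one of the triggers for the evaluate step of the loop.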



Installs: 230
GitHub Stars: 61
First Seen: Jan 23, 2026
ai-ml-data-science — vasilyu1983/ai-agents-public