scikit-learn
Overview
scikit-learn is the standard Python library for classical machine learning. It provides a consistent estimator API (`fit`, `predict`, `transform`) for supervised learning (classification, regression), unsupervised learning (clustering, dimensionality reduction), model evaluation, and preprocessing. Estimators accept NumPy arrays and pandas DataFrames directly, so the library fits into existing NumPy/pandas workflows.
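As a minimal sketch of what the consistent API means in practice, the snippet below trains two unrelated classifiers through the same `fit` and `score` calls. The dataset (`load_iris`), the two model choices, and the split parameters are illustrative assumptions, not recommendations from this document.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative data: the bundled iris dataset (assumed here for the example).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two different model families, driven by the same method calls.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                 # same training call for every estimator
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

Because every estimator exposes the same methods, swapping one model for another usually means changing a single line.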
When to Use
- Building classification models for labeled data (spam detection, disease diagnosis, species identification)
- Predicting continuous outcomes with regression (price prediction, dose-response modeling)
- Clustering unlabeled data into groups (patient stratification, gene expression clusters)
- Reducing dimensionality for visualization or feature engineering (PCA, t-SNE on multi-omics data)
- Evaluating and comparing model performance with cross-validation
- Tuning hyperparameters systematically (grid search, random search)
- Building reproducible ML pipelines with preprocessing and modeling steps
- For deep learning tasks (images, NLP), use pytorch or transformers instead
- For large-scale gradient boosting, use xgboost or lightgbm instead
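Several of the use cases above (a reproducible pipeline, cross-validation, and grid search) can be combined in one short sketch. The dataset (`load_breast_cancer`), the `SVC` classifier, the step names `scale` and `clf`, and the parameter grid are all assumptions chosen for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative data: a bundled binary classification dataset.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Preprocessing and model in one object, so scaling is re-fit inside each
# cross-validation fold and never sees that fold's validation data.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", SVC()),
])

# Hypothetical search space; "<step>__<param>" addresses a pipeline step's parameter.
param_grid = {
    "clf__C": [0.1, 1, 10],
    "clf__kernel": ["linear", "rbf"],
}

# 5-fold cross-validated grid search over the whole pipeline.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"cross-validated accuracy: {search.best_score_:.3f}")
print(f"held-out test accuracy: {test_accuracy:.3f}")
```

Tuning the pipeline as a whole, rather than scaling the data once up front, keeps the cross-validation estimate free of leakage from the validation folds.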