XGBoost & LightGBM - Gradient Boosting for Tabular Data
XGBoost (eXtreme Gradient Boosting) and LightGBM (Light Gradient Boosting Machine) are the de facto standard libraries for machine learning on tabular/structured data. Both implement gradient-boosted decision trees, feature regularly in winning Kaggle solutions on structured data, and are widely used in industry for their speed, accuracy, and robustness.
When to Use
- Classification or regression on tabular data (CSVs, databases, spreadsheets).
- Kaggle competitions or data science competitions on structured data.
- Feature importance analysis and feature selection.
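- Datasets with missing values: both libraries handle them natively, so imputation is often unnecessary.
- Imbalanced datasets (built-in class weighting, e.g. `scale_pos_weight` in XGBoost, `is_unbalance` or `class_weight` in LightGBM).
- Fast training on large datasets (millions of rows).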
- Hyperparameter tuning with cross-validation.
- Ranking tasks (learning-to-rank algorithms).
- Model explanation through interpretable feature importances.
- Production ML systems requiring fast inference on tabular data.