run-ab-test-models
Run A/B Test for Models
See Extended Examples for complete configuration files and templates.
Execute controlled experiments comparing model versions using traffic splitting and statistical analysis.
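As a rough illustration of the two ingredients named above, the sketch below pairs a deterministic hash-based traffic split with a two-proportion z-test on a binary success metric. It is a minimal sketch under stated assumptions, not this skill's actual implementation: the function names, the `challenger_share` and `salt` parameters, and the `"exp-001"` experiment label are all hypothetical, and the z-test is only one common choice of analysis.

```python
import hashlib
import math
from typing import Tuple


def assign_variant(user_id: str, challenger_share: float = 0.1,
                   salt: str = "exp-001") -> str:
    """Deterministically route a user to 'champion' or 'challenger'.

    Hypothetical helper: hashing the user ID with a per-experiment salt
    keeps each user in the same arm across requests.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "challenger" if bucket < challenger_share else "champion"


def two_proportion_z_test(success_a: int, n_a: int,
                          success_b: int, n_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in success rates.

    Arm A is assumed to be the champion and arm B the challenger, so a
    positive z means the challenger has the higher observed rate.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both arms are equal.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

Calling `assign_variant("user-123")` repeatedly returns the same arm, which keeps a user's experience consistent for the duration of the experiment. Changing `salt` reshuffles assignments for a new experiment.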
When to Use
- Deploying a new model version and validating the improvement before full rollout
- Comparing multiple candidate models trained with different algorithms or features
- Testing the impact of hyperparameter changes on business metrics
- Measuring model performance in production without exposing all traffic to the new model
- Meeting regulatory requirements for gradual rollout (e.g., medical ML systems)
- Evaluating cost-performance tradeoffs between model sizes
Inputs
- Required: Champion model (current production version)
Related skills