stats-methods
Installation
SKILL.md
Summarizing Numeric Data
Choosing a Center Metric
| Data Characteristic | Recommended Measure | Rationale |
|---|---|---|
| Symmetric, outlier-free | Mean | Maximally efficient estimator |
| Asymmetric or outlier-heavy | Median | Unaffected by extreme values |
| Non-numeric or ranked | Mode | Sole option for categorical data |
| Business KPIs like revenue per user | Both mean and median | The gap between them reveals skewness |
Guideline: For any business metric, present the mean alongside the median. When they differ substantially, the distribution is skewed and the mean by itself will mislead.
Quantifying Variability
- Standard deviation: Typical distance from the mean; best suited to bell-shaped data.
- IQR (interquartile range): Gap between the 25th and 75th percentiles; resistant to extreme values.
- Coefficient of variation: Standard deviation divided by the mean; enables apples-to-apples variability comparison across different scales.
- Range: Maximum minus minimum; gives a quick but outlier-sensitive view of data spread.