ab-test-analysis
A/B Test Analysis
When to use
- An experiment has finished and the team needs a ship / no-ship recommendation
- Results look directionally positive but the team is unsure if they're statistically significant
- A test has been running for weeks without a clear winner and someone needs to decide whether to continue
- A new experiment needs sample-size planning before launch
- Results are disputed and need a rigorous, documented analysis
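For the sample-size planning case listed above, a rough per-variant estimate can be sketched with the standard normal-approximation formula for comparing two proportions. This is a minimal illustration, not the skill's own tooling: the function name `sample_size_per_variant`, the default `alpha` and `power`, and the example rates are all assumptions chosen for the sketch.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate: float,
                            min_detectable_effect: float,
                            alpha: float = 0.05,
                            power: float = 0.8) -> int:
    """Approximate users needed per variant for a two-sided two-proportion test.

    min_detectable_effect is an ABSOLUTE lift (e.g. 0.01 for +1 percentage point).
    All names and defaults here are illustrative assumptions.
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance level
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


# Hypothetical inputs: 10% baseline conversion, +1 percentage point target lift.
print(sample_size_per_variant(0.10, 0.01))
```

Halving the detectable effect roughly quadruples the required sample, which is usually the main lever when deciding whether a long-running test is worth continuing.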
Process
- Confirm test design: verify the hypothesis, the control and treatment definitions, the randomisation unit (user/session/device), the primary metric, any guardrail metrics, and the target split ratio.
- Check for sample ratio mismatch (SRM): run a chi-square test on the actual vs. expected split. If SRM is detected, stop and investigate the randomisation pipeline before interpreting results. Use scripts/ab_test_analyzer.py --check-srm.
- Calculate per-variant metrics: compute the rate (or mean) and 95% confidence interval for the primary metric in each variant. Document the absolute and relative difference.
- Run the significance test: execute a two-proportion z-test (for rates) or Welch's t-test (for means). Record the test statistic (z or t), the p-value, and the 95% CI for the effect. Use references/statistical_tests_reference.md if unsure which test applies.
- Check guardrail metrics: run the same significance test for each guardrail metric. A significant degradation on any guardrail is a blocker regardless of primary metric results.
- Produce the recommendation: synthesise SRM result, power, significance, and guardrail checks into a clear ship / no-ship / extend decision. Quantify the expected business impact if shipped. Record in assets/ab_test_report_template.md.
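The SRM check and the rate-based significance test from the steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the contents of scripts/ab_test_analyzer.py: the function names, the example counts, and the 0.001 SRM alert threshold are hypothetical, and the sketch covers only the two-variant, conversion-rate case (a test on means would use Welch's t-test instead).

```python
import math
from dataclasses import dataclass


def srm_p_value(n_control: int, n_treatment: int,
                expected_control_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for the observed split."""
    total = n_control + n_treatment
    exp_c = total * expected_control_share
    exp_t = total - exp_c
    chi2 = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # With 1 degree of freedom, chi2 is a squared standard normal,
    # so the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


@dataclass
class ProportionTest:
    rate_control: float
    rate_treatment: float
    abs_diff: float
    rel_diff: float
    z: float
    p_value: float
    ci_low: float
    ci_high: float


def two_proportion_z_test(conv_c: int, n_c: int, conv_t: int, n_t: int,
                          z_crit: float = 1.959964) -> ProportionTest:
    """Two-sided two-proportion z-test plus a 95% CI for the absolute difference."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    diff = p_t - p_c
    # Pooled standard error for the test statistic (assumes equal rates under H0).
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = diff / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    # Unpooled standard error for the confidence interval on the difference.
    se_unpooled = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    return ProportionTest(p_c, p_t, diff, diff / p_c, z, p_value,
                          diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)


# Hypothetical counts for illustration only.
SRM_ALPHA = 0.001  # assumed alert threshold, not taken from the skill
if srm_p_value(10_000, 10_050) < SRM_ALPHA:
    print("SRM detected: investigate randomisation before reading results")
else:
    result = two_proportion_z_test(1_000, 10_000, 1_100, 10_050)
    print(f"z={result.z:.2f}, p={result.p_value:.4f}, "
          f"95% CI=({result.ci_low:.4f}, {result.ci_high:.4f})")
```

The same `two_proportion_z_test` call can be repeated for each rate-based guardrail metric; a significant move in the harmful direction would then block shipping regardless of the primary result.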
Inputs the skill needs
- Test plan or hypothesis document (variant definitions, randomisation unit, primary metric)