Experimentation Tools
Educational tools to help you plan your tests and analyze and present the results.
Special thanks to Merritt Aho for his work creating and updating these tools.
Standard Frequentist Sample Size Calculator
-
Calculate the sample size and runtime needed to detect your estimated lift percentage, i.e., the lift required to make a decision.
-
Baseline conversion rate
Traffic volume to your experience over 30 days
Number of variations or treatments
Number of tails (1 or 2)
Confidence level
Power level
-
Sample size
Runtime (if less than 30 days)
Error risk visualization
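The relationship between these inputs and outputs can be sketched with the textbook normal-approximation formula for comparing two proportions. This is a minimal illustration, not the calculator's actual implementation: the function names are hypothetical, `arms` is assumed to count the control plus all treatments, and no correction for multiple treatments is applied.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline, relative_lift, confidence=0.95, power=0.80, tails=2):
    """Approximate visitors needed per arm for a two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)  # conversion rate if the estimated lift is real
    alpha = 1 - confidence
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)  # critical value for the chosen tails
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def runtime_in_days(n_per_arm, arms, traffic_30_days):
    """Days needed to collect n_per_arm visitors in every arm (assumes even traffic)."""
    daily_traffic = traffic_30_days / 30
    return ceil(n_per_arm * arms / daily_traffic)


# Example: 5% baseline, 10% relative lift, 100,000 visitors per 30 days, 2 arms.
n = sample_size_per_arm(baseline=0.05, relative_lift=0.10)
print(n, runtime_in_days(n, arms=2, traffic_30_days=100_000))
```

A smaller estimated lift, a higher confidence level, or a higher power level each increase the required sample size, which is why the runtime can exceed 30 days for small lifts.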
Results Analysis Tool: Binomial & Continuous Metrics
-
Analyzes the results of your A/B tests. Use it for binomial metrics or for continuous metrics* such as revenue. You can customize the visuals with your screenshots and hypothesis statement.
-
Metric Type (Binomial or Continuous)
*For continuous metrics, you'll need to enter standard deviations calculated separately
Number of Conversions for Control & Treatment
Amount of Traffic for Control & Treatment
Statistical Significance Threshold
Number of Tails (1 or 2)
Total p-values being calculated (i.e., how many metrics, how many treatments, how many times you have peeked to make a decision, etc.)
SRM check
Revenue projection inputs - approximate $ value of a conversion
Estimated conversions per month the audience provides
Report customization inputs
-
Visualizations of the Key Results
Confidence Intervals of the Differences
Calculations Table
Confidence Intervals of Variants Chart
6 Months Revenue Projections
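For a binomial metric, the core calculations behind these inputs can be sketched as a two-proportion z-test, a confidence interval for the difference, and an SRM check. This is a simplified illustration, not the tool's implementation. Several choices here are assumptions: the function names are hypothetical, the "total p-values" input is handled with a Bonferroni correction, the interval is always two-sided, and the SRM check uses a one-degree-of-freedom chi-square test for two arms.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_proportion_test(conv_c, n_c, conv_t, n_t, alpha=0.05, tails=2, total_p_values=1):
    """z-test for treatment vs. control conversion rates."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    diff = p_t - p_c
    # A pooled standard error is used for the hypothesis test.
    pooled = (conv_c + conv_t) / (n_c + n_t)
    z = diff / sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    if tails == 2:
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    else:
        p_value = 1 - STD_NORMAL.cdf(z)  # one-tailed: treatment better than control
    # Assumption: Bonferroni correction for the total number of p-values.
    adjusted_alpha = alpha / total_p_values
    # An unpooled standard error is used for the interval of the difference.
    se = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z_crit = STD_NORMAL.inv_cdf(1 - adjusted_alpha / 2)
    return {
        "difference": diff,
        "p_value": p_value,
        "significant": p_value < adjusted_alpha,
        "ci": (diff - z_crit * se, diff + z_crit * se),
    }


def srm_p_value(n_c, n_t, expected_share_t=0.5):
    """Sample ratio mismatch check: chi-square test with one degree of freedom."""
    total = n_c + n_t
    exp_t = total * expected_share_t
    exp_c = total - exp_t
    chi2 = (n_c - exp_c) ** 2 / exp_c + (n_t - exp_t) ** 2 / exp_t
    # With one degree of freedom, chi2 is the square of a standard normal variable.
    return 2 * (1 - STD_NORMAL.cdf(sqrt(chi2)))


result = two_proportion_test(conv_c=500, n_c=10_000, conv_t=600, n_t=10_000)
print(result, srm_p_value(10_000, 10_000))
```

A very small SRM p-value suggests the traffic split differs from the planned allocation, in which case the test results should be treated with caution.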
Sequential Planning & Analysis Tool
-
Allows you to both plan for and analyze a frequentist-based, sequentially-designed experiment. Use after the runtime-based calculator.
-
Minimum detectable effect (required lift)
Base conversion rate
Confidence level
Power
1 or 2 tails
Number of conversions in the test area
Amount of traffic to the test area
Number of days covered by those conversion and traffic numbers
Number of planned analyses (checkpoints)
-
Fixed-horizon sample size & runtime
Sequential maximum sample size & runtime
Maximum increase in runtime
Power analysis chart plot
Expected duration based on potential effect sizes and sample sizes
Decision boundaries table showing when you can safely “peek,” with an associated chart that plots the results you’ve entered at each checkpoint and shows when you cross a binding or non-binding decision boundary
-
For more information on how to plan and analyze frequentist sequential tests, see this Q&A on the Analytics Toolkit, written by Georgi Georgiev, in which Lucia Van de Brink interviews him to help explain the process.
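To give a sense of why planned checkpoints make peeking safe, the sketch below shows how an alpha-spending function divides the total error budget across analyses. It assumes a Lan-DeMets O'Brien-Fleming-type spending function with equally spaced checkpoints, which is one common choice and not necessarily what this tool uses. The function name is hypothetical, and turning the spent alpha into actual decision boundaries requires further numerical work not shown here.

```python
from math import sqrt
from statistics import NormalDist


def obf_alpha_spent(alpha, checkpoints):
    """Cumulative alpha spent at each of `checkpoints` equally spaced analyses."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    spent = []
    for k in range(1, checkpoints + 1):
        t = k / checkpoints  # fraction of the maximum sample size collected so far
        spent.append(2 * (1 - std_normal.cdf(z / sqrt(t))))
    return spent


# Cumulative alpha available at each of 5 planned analyses for a total alpha of 0.05.
for k, a in enumerate(obf_alpha_spent(0.05, 5), start=1):
    print(k, round(a, 5))
```

Under this spending function, very little alpha is available at early checkpoints, so only large effects can cross a boundary early, and the full alpha is reached only at the final analysis.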
Bayesian A/B Test Analysis Tool
-
Analyzes the results of your A/B tests with Bayesian methods. Uses Monte Carlo simulations to estimate the probability that your treatment outperforms the control, based on estimated priors. (Includes an “About Bayesian A/B testing” section.)
-
Historical traffic & conversions
$ value of one conversion
% change in the conversion rate that is negligible
$ amount the test would need to earn to justify implementation
Traffic & conversions for each variant
SRM check - % of traffic allocated to test variant
-
Variant A & B Conversion Rate
Observed Difference
Traffic Split
Variant Posteriors Chart
Effect Posterior Chart
Probabilities Explained
Probabilities Chart
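The Monte Carlo idea behind these probabilities can be sketched with Beta posteriors for each variant's conversion rate. This is an illustration under stated assumptions, not the tool's implementation: the function name and parameters are hypothetical, the default prior is a uniform Beta(1, 1) rather than one estimated from historical traffic, and the negligible change is treated as a relative threshold on the conversion rate.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b,
                   prior_alpha=1.0, prior_beta=1.0,
                   negligible_rel_change=0.0, draws=20_000, seed=0):
    """Estimate P(rate_B > rate_A * (1 + negligible_rel_change)) by simulation."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    wins = 0
    for _ in range(draws):
        # Posterior for a binomial rate with a Beta prior is Beta(a + conversions, b + misses).
        rate_a = rng.betavariate(prior_alpha + conv_a, prior_beta + n_a - conv_a)
        rate_b = rng.betavariate(prior_alpha + conv_b, prior_beta + n_b - conv_b)
        if rate_b > rate_a * (1 + negligible_rel_change):
            wins += 1
    return wins / draws


# Probability that B is better at all, and that it is better by more than 5% relative.
print(prob_b_beats_a(500, 10_000, 600, 10_000))
print(prob_b_beats_a(500, 10_000, 600, 10_000, negligible_rel_change=0.05))
```

Historical traffic and conversions could inform `prior_alpha` and `prior_beta` in place of the uniform prior, with a stronger prior pulling both posteriors toward the historical rate.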
Frequentist Runtime-Based Calculator
-
Calculate the minimum lift you’d need for a given runtime: if you know how much runtime you have, this tells you the lift you need. Perfect for decision-makers.
-
Baseline conversion rate
Traffic volume to your experience over 30 days
Number of variations or treatments
Number of tails (1 or 2)
Confidence level
Power level
-
Minimum lift
Runtime (if less than 30 days)
Error risk visualization
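One way to picture this calculation is as the standard sample-size formula run in reverse: fix the sample size that the runtime provides, then search for the smallest relative lift it can detect. The sketch below does this by bisection. It is not the calculator's implementation. The function names are hypothetical, `arms` is assumed to count the control plus all treatments, and traffic is assumed to be evenly spread across days and arms.

```python
from statistics import NormalDist


def required_n(baseline, relative_lift, z_total):
    """Visitors per arm needed to detect the lift (normal approximation)."""
    p2 = baseline * (1 + relative_lift)
    variance = baseline * (1 - baseline) + p2 * (1 - p2)
    return z_total ** 2 * variance / (p2 - baseline) ** 2


def minimum_detectable_lift(baseline, traffic_30_days, runtime_days, arms=2,
                            confidence=0.95, power=0.80, tails=2):
    """Smallest relative lift detectable with the traffic available in the runtime."""
    n_available = traffic_30_days / 30 * runtime_days / arms
    alpha = 1 - confidence
    z_total = NormalDist().inv_cdf(1 - alpha / tails) + NormalDist().inv_cdf(power)
    # Bisection: the required sample size falls as the lift grows.
    lo, hi = 1e-6, (1 - baseline) / baseline  # upper bound keeps the treated rate <= 1
    for _ in range(100):
        mid = (lo + hi) / 2
        if required_n(baseline, mid, z_total) > n_available:
            lo = mid  # lift too small to detect with this much traffic
        else:
            hi = mid
    return hi


# Example: 5% baseline, 100,000 visitors per 30 days, 14 days of runtime.
print(minimum_detectable_lift(baseline=0.05, traffic_30_days=100_000, runtime_days=14))
```

Shorter runtimes leave fewer visitors per arm, so the minimum lift rises as the available runtime shrinks.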