tl;dr: Don't bother with confidence intervals. Use a G-test instead.<p>Calculate it here: <a href="http://elem.com/~btilly/effective-ab-testing/g-test-calculator.html" rel="nofollow">http://elem.com/~btilly/effective-ab-testing/g-test-calculat...</a><p>Read more here: <a href="http://en.wikipedia.org/wiki/G-test" rel="nofollow">http://en.wikipedia.org/wiki/G-test</a><p>And plain English here: <a href="http://en.wikipedia.org/wiki/Likelihood_ratio_test" rel="nofollow">http://en.wikipedia.org/wiki/Likelihood_ratio_test</a>
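For anyone who wants to see what the calculator is doing under the hood, here is a minimal sketch of a G-test on a 2x2 A/B table, using only the Python standard library. The function name and signature are my own; the math is the standard likelihood-ratio statistic G = 2 Σ O·ln(O/E), compared against a chi-square distribution with 1 degree of freedom.

```python
from math import log, erfc, sqrt

def g_test_2x2(a_conv, a_total, b_conv, b_total):
    """G-test of independence on a 2x2 table:
    rows = variants A and B, columns = converted / did not convert."""
    obs = [[a_conv, a_total - a_conv],
           [b_conv, b_total - b_conv]]
    row_totals = [sum(r) for r in obs]
    col_totals = [sum(c) for c in zip(*obs)]
    n = sum(row_totals)
    g = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if obs[i][j] > 0:  # 0 * log(0) is taken as 0
                g += 2 * obs[i][j] * log(obs[i][j] / expected)
    # A 2x2 table has 1 degree of freedom; for chi-square with 1 df,
    # the p-value is the tail probability P(Z^2 > g) = erfc(sqrt(g/2)).
    p_value = erfc(sqrt(g / 2))
    return g, p_value

# e.g. 120/1000 conversions on A vs 160/1000 on B
g, p = g_test_2x2(120, 1000, 160, 1000)
```

If the two proportions are identical, G is exactly 0 and the p-value is 1; the bigger the discrepancy between observed and expected counts, the bigger G gets.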
<p><pre><code> When A/B testing, you need to always remember three things:
The smaller your change is, the more data you need to be sure
that the conclusion you have reached is statistically significant.
</code></pre>
Is that a mathematically provable result? It seems hard to conceptualize what a 'small' or 'big' change is. I would have expected another argument along the lines of "If you make more than one change at a time, you are not going to be able to know which one of your changes caused the result".
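It does fall out of standard power analysis: the required sample size grows roughly as 1/delta^2, where delta is the size of the effect you want to detect. A rough sketch (the normal-approximation formula for comparing two proportions at ~5% significance and ~80% power; the function name and defaults are my own, not from the article):

```python
from math import ceil

def n_per_arm(p_base, delta, z_alpha=1.96, z_beta=0.84):
    """Approximate visitors needed per variant to detect an absolute
    lift `delta` over base conversion rate `p_base`.
    Normal-approximation two-proportion formula: n ~ (z_a + z_b)^2 * var / delta^2."""
    p_bar = p_base + delta / 2          # average rate across the two arms
    var = 2 * p_bar * (1 - p_bar)       # pooled variance of the difference
    return ceil((z_alpha + z_beta) ** 2 * var / delta ** 2)

# Halving the detectable lift roughly quadruples the required traffic:
# n_per_arm(0.05, 0.01) vs n_per_arm(0.05, 0.005)
```

So "small change" here means small expected lift in the metric, and the 1/delta^2 term is why halving the lift you are hunting for roughly quadruples the data you need.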
I think the big issue people see in A/B testing comes down to a fairly tricky reason: the underlying distribution of the data. The usual ways of estimating how big your sample size needs to be have one huge giraffe of a problem hiding in them: they assume the underlying distribution is normal.<p>The correct way to estimate your sample size is to use the cumulative distribution function of your underlying distribution. See a brief explanation from Wikipedia here: <a href="http://en.wikipedia.org/wiki/Sample_size_determination#By_cumulative_distribution_function" rel="nofollow">http://en.wikipedia.org/wiki/Sample_size_determination#By_cu...</a><p>Now what's the problem with A/B testing? Most of the stuff we test A/B for is incredibly non-normal. Often 99% of visits do not convert. We're looking at extremely skewed data here. Generally, the more skewed the distribution, the more samples we need.<p>For a very basic understanding of why: consider a very simple distribution where 99.99% of the time you get $0 and 0.01% of the time you get $29 - fairly similar to what we A/B test. Do you think a sample of 1000 or 10000 is going to be anywhere near enough here? Of course not.
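A quick way to see why those sample sizes are hopeless for the $0/$29 example: at a 0.01% event rate, a sample of 1000 visits is very likely to contain no $29 events at all, in which case it tells you nothing about the payoff. A two-line sketch of the arithmetic (my own illustration, not from the comment above):

```python
def p_no_events(n, p_event=0.0001):
    """Chance a sample of n visits contains zero $29 conversions
    when each visit converts with probability p_event (0.01%)."""
    return (1 - p_event) ** n

for n in (1_000, 10_000, 100_000):
    print(f"n={n:>7}: P(zero conversions in the whole sample) = {p_no_events(n):.1%}")
```

With n=1000 the sample is empty of conversions about 90% of the time, and even at n=10000 it is empty over a third of the time, so any conversion-value estimate from samples that size is mostly noise.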