On the pitfalls of A/B testing

24 points by loarake almost 12 years ago

3 comments

kevinconroy almost 12 years ago
tl;dr: Don't bother with confidence intervals. Use a G-test instead.

Calculate it here: http://elem.com/~btilly/effective-ab-testing/g-test-calculator.html

Read more here: http://en.wikipedia.org/wiki/G-test

And plain English here: http://en.wikipedia.org/wiki/Likelihood_ratio_test
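
For readers who would rather compute the statistic than use the calculator, here is a minimal sketch of a G-test on a 2x2 table (variant by converted / not converted). The function name `g_test_2x2` and its arguments are made up for illustration, and the p-value assumes the usual chi-square approximation with one degree of freedom; this is not the code behind the linked calculator.

```python
import math

def g_test_2x2(conv_a, total_a, conv_b, total_b):
    """G-test of independence for two variants (hypothetical helper).

    Returns (G, p_value), where G = 2 * sum(O * ln(O / E)) over the four
    cells and p_value uses the chi-square approximation with 1 degree
    of freedom.
    """
    observed = [
        [conv_a, total_a - conv_a],
        [conv_b, total_b - conv_b],
    ]
    grand = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [conv_a + conv_b, grand - conv_a - conv_b]

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            e = row_totals[i] * col_totals[j] / grand  # expected count under independence
            if o > 0:  # a cell with O = 0 contributes nothing to the sum
                g += o * math.log(o / e)
    g = max(2.0 * g, 0.0)  # guard against tiny negative values from rounding

    # Survival function of chi-square with 1 df: P(X > g) = erfc(sqrt(g / 2))
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value

if __name__ == "__main__":
    g, p = g_test_2x2(conv_a=100, total_a=1000, conv_b=150, total_b=1000)
    print(f"G = {g:.3f}, p = {p:.5f}")
```

The chi-square approximation gets shaky when expected cell counts are very small, so treat the p-value as a rough guide in that regime.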
cocoflunchy almost 12 years ago
> When A/B testing, you need to always remember three things: The smaller your change is, the more data you need to be sure that the conclusion you have reached is statistically significant.

Is that a mathematically provable result? It seems hard to conceptualize what a 'small' or 'big' change is. I would have expected another argument along the lines of "If you make more than one change at a time, you are not going to be able to know which one of your changes caused the result".
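
One way to make "small change" concrete is to read it as a small difference in conversion rate between the two variants. Under that reading, the textbook sample-size formula for a two-sided two-proportion z-test has the squared difference in its denominator, so halving the difference roughly quadruples the data needed. The sketch below assumes that formula and the normal approximation behind it; `sample_size_per_variant` and the example rates are hypothetical, not something from the article.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(p_base, p_variant, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (hypothetical helper).

    Uses the standard normal-approximation formula for comparing two
    proportions with a two-sided test at level alpha and the given power.
    """
    if p_base == p_variant:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_base + p_variant) / 2
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * math.sqrt(
        p_base * (1 - p_base) + p_variant * (1 - p_variant)
    )
    # The squared difference in the denominator is why small effects are expensive.
    return math.ceil((term_null + term_alt) ** 2 / (p_base - p_variant) ** 2)

if __name__ == "__main__":
    base = 0.10  # assumed baseline conversion rate, for illustration only
    for lift in (0.05, 0.02, 0.01):
        n = sample_size_per_variant(base, base + lift)
        print(f"detect {base:.2f} -> {base + lift:.2f}: about {n} visitors per variant")
```

This is only the normal-approximation answer, so it is a starting point and not a guarantee for very low conversion rates.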
RyanZAG almost 12 years ago
I think the big issues people see in A/B testing come from a fairly tricky source: the underlying distribution of the data. The usual ways of estimating how big your sample size should be have one huge giraffe of a problem hiding in them: they assume the underlying distribution is normal.

The correct way to estimate your sample size is to use the cumulative distribution function of your underlying distribution. See a brief explanation from Wikipedia here: http://en.wikipedia.org/wiki/Sample_size_determination#By_cumulative_distribution_function

Now what's the problem with A/B testing? Most of the stuff we A/B test is incredibly non-normal. Often 99% of visits do not convert. We're looking at extremely skewed data here. Generally, the more skewed the distribution, the more samples we need.

For a very basic understanding of why: consider a very simple distribution where 99.99% of the time you get $0 and 0.01% of the time you get $29, fairly similar to what we A/B test. Do you think a sample of 1000 or 10000 is going to be anywhere near enough here? Of course not.
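
The closing example can be checked with a little arithmetic. With a 0.01% chance of a $29 conversion per visit, the probability that a sample contains no conversions at all is (1 - 0.0001)^n, which is about 90% for 1000 visits and about 37% for 10000. A minimal sketch, assuming independent visits; the function names are hypothetical:

```python
def prob_no_conversions(n_visits, conversion_rate):
    """Probability that n independent visits produce zero conversions."""
    return (1 - conversion_rate) ** n_visits

def expected_conversions(n_visits, conversion_rate):
    """Mean number of conversions in n independent visits."""
    return n_visits * conversion_rate

if __name__ == "__main__":
    rate = 0.0001  # 0.01% of visits pay $29, as in the example above
    for n in (1_000, 10_000, 100_000, 1_000_000):
        print(
            f"n = {n:>9}: expected conversions = {expected_conversions(n, rate):7.1f}, "
            f"P(no conversions at all) = {prob_no_conversions(n, rate):.4f}"
        )
```

A sample that most likely contains zero or one paying visit per variant cannot distinguish the two variants, whatever test is applied afterwards.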