TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

A Formula for Bayesian A/B Testing

93 pointsby bufoalmost 11 years ago

8 comments

sharnettalmost 11 years ago
Very nice, much faster than simulating it. I guess if you&#x27;re using an informative prior, instead of adding 1 to the number of successes and failures, you add the corresponding parameters of your beta prior?<p>A pretty good shortcut (if you don&#x27;t have a log-beta function, for example) is to approximate both A and B with the normal distribution. Then their difference is also normal, so you can just check the probability that a normal random variable is greater than zero.<p>Specifically, the mean μ of a beta distribution is α&#x2F;(α+β) and the variance σ^2 is αβ&#x2F;((α+β)^2(α+β+1)). Use these as the parameters for your normal approximation, and we have the difference D ~ N(μ_A-μ_B, σ_A^2+σ_B^2). The probability that B beats A is just the CDF of D evaluated at 0.<p>In Python:<p><pre><code> from scipy.stats import norm as norm def beta_mean(a, b): return a&#x2F;(a+b) def beta_var(a, b): return a*b&#x2F;((a+b)**2*(a+b+1)) def probability_B_beats_A(α_A, β_A, α_B, β_B): mu = beta_mean(α_A, β_A) - beta_mean(α_B, β_B) sigma = (beta_var(α_A, β_A) + beta_var(α_B, β_B))**.5 return norm.cdf(0, mu, sigma)</code></pre>
评论 #7849482 未加载
评论 #7848794 未加载
EvanMilleralmost 11 years ago
I&#x27;m pretty sure this formula is correct, but I haven&#x27;t seen it published anywhere. John Cook has some veiled references to a closed-form solution when one of the parameters is an integer:<p><a href="http://www.mdanderson.org/education-and-research/departments-programs-and-labs/departments-and-divisions/division-of-quantitative-sciences/research/biostats-utmdabtr-005-05.pdf" rel="nofollow">http:&#x2F;&#x2F;www.mdanderson.org&#x2F;education-and-research&#x2F;departments...</a> [pdf]<p>But he doesn&#x27;t really say what that closed form is, so I think his version must have been pretty hairy. (My version requires all four parameters to be integers, so I doubt we were talking about the same thing.)<p>Sadly I couldn&#x27;t get the math to work out for producing a confidence interval on |p_B - p_A| so for now you&#x27;re stuck with Monte Carlo for confidence bounds.<p>Thanks to prodding from Steven Noble over at Stripe, I&#x27;ll have another formula up soon for asking the same question using count data instead of binary data. Stay tuned!
评论 #7850882 未加载
评论 #7850315 未加载
评论 #7850640 未加载
评论 #7849119 未加载
评论 #7852102 未加载
jessaustinalmost 11 years ago
<i>If you are unable to find a log-gamma function lying around, rewrite the above equation in terms of factorials using Γ(z)=(z−1)!, and notice that there are an equal number of multiplicative terms in the numerator and denominator. If you alternately multiply and divide one term at a time, you should be able to arrive at an answer without encountering numerical overflow.</i><p>Something about the right side of the equation immediately preceding this quote seems to indicate that <i>many</i> of the terms in the numerator would cancel with equivalents in the denominator. I&#x27;m not really familiar with CAS systems, but is this the sort of thing they could do? Doing this simplification <i>once</i> when one writes the code seems to be a win over calculating the original expression every time the code runs.
评论 #7850443 未加载
评论 #7848590 未加载
paraschopraalmost 11 years ago
Evan, does this formula account for multiple comparisons (if you have multiple goals and multiple variations)? I guess it would suffer from the same problems that if you have 100 variations and 10 goals, some of them are bound to produce a significant result, just by random chance. Isn&#x27;t it? In classical testing, you can fix it by making your calculated alpha smaller than the real alpha, so you need much more data if there are multiple comparisons. What happens in Bayesian case?<p>Edit: I did some Googling and found this <a href="http://www.stat.columbia.edu/~gelman/research/unpublished/multiple2.pdf" rel="nofollow">http:&#x2F;&#x2F;www.stat.columbia.edu&#x2F;~gelman&#x2F;research&#x2F;unpublished&#x2F;mu...</a>
评论 #7850331 未加载
vogalmost 11 years ago
Great work! However, an additional <i>testing</i> section would be nice (right below the <i>implementation</i> section).<p>That section should provide two or three lists of example input values, and the expected output value (up to some accuracy).<p>As the author notes, although this formula looks pretty simple, you can make a lot of numerical mistakes when implementing it. A test suite would help implementors to catch those mistakes early.
评论 #7849309 未加载
hidden-markovalmost 11 years ago
Indeed, much faster than Monte Carlo integration:<p><pre><code> 5.778 evaluation.py:20(evaluate) # Sampling version 0.043 evaluation.py:64(evaluate) # Closed formula </code></pre> (See my A&#x2F;B testing library <a href="https://github.com/bogdan-kulynych/trials" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;bogdan-kulynych&#x2F;trials</a>)
spitfirealmost 11 years ago
Here&#x27;s the Mathematica version:<p>PrBbeatsA[\[Alpha]a_, \[Beta]a_, \[Alpha]b_, \[Beta]b_] := \!\( \<i>UnderoverscriptBox[\(\[Sum]\), \(i = 0\), \(\[Alpha]b\)] \</i>FractionBox[\(Beta[\[Alpha]a + i, \[Beta]b + \[Beta]a]\), \(\((\[Beta]b + i)\) Beta[ 1 + i, \[Beta]b] Beta[\[Alpha]a, \[Beta]a]\)]\)
psychometryalmost 11 years ago
A minor complaint: that formula calculates a probability (a number within the range [0,1]), rather than odds (a ratio of expected values).