A Formula for Bayesian A/B Testing

93 points by bufo, almost 11 years ago

8 comments

sharnett, almost 11 years ago
Very nice, much faster than simulating it. I guess if you're using an informative prior, instead of adding 1 to the number of successes and failures, you add the corresponding parameters of your beta prior?

A pretty good shortcut (if you don't have a log-beta function, for example) is to approximate both A and B with the normal distribution. Then their difference is also normal, so you can just check the probability that a normal random variable falls on one side of zero.

Specifically, the mean μ of a beta distribution is α/(α+β) and the variance σ^2 is αβ/((α+β)^2(α+β+1)). Use these as the parameters for your normal approximation, and we have the difference D = A - B ~ N(μ_A-μ_B, σ_A^2+σ_B^2). The probability that B beats A is P(D < 0), which is just the CDF of D evaluated at 0.

In Python:

```python
from scipy.stats import norm

def beta_mean(a, b):
    return a / (a + b)

def beta_var(a, b):
    return a * b / ((a + b)**2 * (a + b + 1))

def probability_B_beats_A(α_A, β_A, α_B, β_B):
    # D = A - B is approximately normal; B beats A when D < 0.
    mu = beta_mean(α_A, β_A) - beta_mean(α_B, β_B)
    sigma = (beta_var(α_A, β_A) + beta_var(α_B, β_B))**.5
    return norm.cdf(0, mu, sigma)
```
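
On the informative-prior question: with a Beta(α₀, β₀) prior and binomial data, the standard conjugate update gives a Beta(α₀ + successes, β₀ + failures) posterior, so "adding 1" to each count corresponds to the uniform Beta(1, 1) prior. A minimal sketch of that update; the function name and the example counts are made up for illustration:

```python
def posterior_params(successes, failures, prior_alpha=1.0, prior_beta=1.0):
    """Beta posterior parameters after observing binomial data.

    The defaults correspond to a uniform Beta(1, 1) prior, i.e. "add 1".
    """
    return prior_alpha + successes, prior_beta + failures

# Hypothetical data: 12 conversions out of 100 visitors.
print(posterior_params(12, 88))           # uniform prior
print(posterior_params(12, 88, 2.5, 20))  # informative Beta(2.5, 20) prior
```

Non-integer prior parameters pose no problem for the normal approximation above. A closed form that sums over integer indices would presumably need an integer-valued prior.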

EvanMiller, almost 11 years ago
I'm pretty sure this formula is correct, but I haven't seen it published anywhere. John Cook has some veiled references to a closed-form solution when one of the parameters is an integer:

http://www.mdanderson.org/education-and-research/departments-programs-and-labs/departments-and-divisions/division-of-quantitative-sciences/research/biostats-utmdabtr-005-05.pdf [pdf]

But he doesn't really say what that closed form is, so I think his version must have been pretty hairy. (My version requires all four parameters to be integers, so I doubt we were talking about the same thing.)

Sadly I couldn't get the math to work out for producing a confidence interval on |p_B - p_A|, so for now you're stuck with Monte Carlo for confidence bounds.

Thanks to prodding from Steven Noble over at Stripe, I'll have another formula up soon for asking the same question using count data instead of binary data. Stay tuned!
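
For the interval that the closed form does not cover, a Monte Carlo sketch using only the Python standard library might look like the following. It samples the two beta posteriors independently and reads off empirical quantiles of the signed difference p_B - p_A (not the absolute value mentioned above). The function name, the default number of draws, the fixed seed, and the example counts are all assumptions for illustration:

```python
import random

def difference_interval(alpha_a, beta_a, alpha_b, beta_b,
                        level=0.95, draws=100_000, seed=0):
    """Equal-tailed Monte Carlo interval for p_B - p_A.

    p_A ~ Beta(alpha_a, beta_a) and p_B ~ Beta(alpha_b, beta_b) are
    sampled independently; the bounds are empirical quantiles.
    """
    rng = random.Random(seed)
    diffs = sorted(
        rng.betavariate(alpha_b, beta_b) - rng.betavariate(alpha_a, beta_a)
        for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    lower = diffs[int(tail * draws)]
    upper = diffs[min(draws - 1, int((1.0 - tail) * draws))]
    return lower, upper

# Hypothetical posteriors: A had 20 successes / 180 failures, B had 30 / 170
# (each with a uniform prior, hence the +1 on every count).
low, high = difference_interval(21, 181, 31, 171)
print(f"95% interval for p_B - p_A: [{low:.3f}, {high:.3f}]")
```

Taking `abs()` of each sampled difference before sorting would give an interval on |p_B - p_A| instead.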

jessaustin, almost 11 years ago
"If you are unable to find a log-gamma function lying around, rewrite the above equation in terms of factorials using Γ(z)=(z−1)!, and notice that there are an equal number of multiplicative terms in the numerator and denominator. If you alternately multiply and divide one term at a time, you should be able to arrive at an answer without encountering numerical overflow."

Something about the right side of the equation immediately preceding this quote seems to indicate that *many* of the terms in the numerator would cancel with equivalents in the denominator. I'm not really familiar with computer algebra systems, but is this the sort of thing they could do? Doing this simplification *once* when one writes the code seems to be a win over calculating the original expression every time the code runs.
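
The quoted advice can be sketched as follows. This assumes the closed form under discussion is Pr(p_B > p_A) = Σ_{i=0}^{α_B−1} B(α_A+i, β_A+β_B) / ((β_B+i) B(1+i, β_B) B(α_A, β_A)) with all four parameters positive integers. Each beta function is expanded into factorials via Γ(z) = (z−1)!, and the factors are then multiplied and divided one pair at a time. The function names and the choice to sort the factors before pairing them are illustrative assumptions, not the article's implementation:

```python
def _factorial_factors(n):
    """Return the factors 1, 2, ..., n of n! (an empty list for 0!)."""
    return list(range(1, n + 1))

def _term(i, alpha_a, beta_a, alpha_b, beta_b):
    """One summand of the closed form, using Gamma(z) = (z - 1)!."""
    numerator = (
        _factorial_factors(alpha_a + i - 1)          # Gamma(alpha_a + i)
        + _factorial_factors(beta_a + beta_b - 1)    # Gamma(beta_a + beta_b)
        + _factorial_factors(beta_b + i)             # Gamma(1 + i + beta_b)
        + _factorial_factors(alpha_a + beta_a - 1)   # Gamma(alpha_a + beta_a)
    )
    denominator = (
        _factorial_factors(alpha_a + i + beta_a + beta_b - 1)
        + _factorial_factors(i)                      # Gamma(1 + i)
        + _factorial_factors(beta_b - 1)             # Gamma(beta_b)
        + _factorial_factors(alpha_a - 1)            # Gamma(alpha_a)
        + _factorial_factors(beta_a - 1)             # Gamma(beta_a)
        + [beta_b + i]                               # the explicit (beta_b + i)
    )
    # Both lists have the same length, so factors can be consumed in pairs.
    numerator.sort()
    denominator.sort()
    value = 1.0
    for n, d in zip(numerator, denominator):
        value = value * n / d  # multiply by one factor, divide by one factor
    return value

def probability_b_beats_a(alpha_a, beta_a, alpha_b, beta_b):
    """Pr(p_B > p_A) for integer beta parameters, without log-gamma."""
    return sum(_term(i, alpha_a, beta_a, alpha_b, beta_b)
               for i in range(alpha_b))

print(probability_b_beats_a(1, 1, 2, 1))
```

Pairing sorted factors keeps the running product at a moderate size for moderate counts. It is not a proof against underflow when the true term is extremely small, and it does none of the symbolic cancellation the comment asks about.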

paraschopra, almost 11 years ago
Evan, does this formula account for multiple comparisons (if you have multiple goals and multiple variations)? I guess it would suffer from the same problem: if you have 100 variations and 10 goals, some of them are bound to produce a significant result just by random chance, aren't they? In classical testing, you can fix it by making the per-comparison alpha smaller than the overall alpha, so you need much more data if there are multiple comparisons. What happens in the Bayesian case?

Edit: I did some Googling and found this: http://www.stat.columbia.edu/~gelman/research/unpublished/multiple2.pdf
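
A small simulation can illustrate the concern. It runs many A/A comparisons (both arms share the same true conversion rate), computes Pr(p_B > p_A) for each with the normal approximation to the beta posteriors, and counts how many cross a 0.95 threshold by chance alone. The experiment sizes, the threshold, the seed, and all function names are assumptions chosen for illustration, and the sketch says nothing about how a Bayesian analysis should correct for the effect:

```python
import random
from statistics import NormalDist

def beta_mean(a, b):
    return a / (a + b)

def beta_var(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))

def probability_b_beats_a(alpha_a, beta_a, alpha_b, beta_b):
    """Normal approximation to Pr(p_B > p_A) for two beta posteriors."""
    mu = beta_mean(alpha_a, beta_a) - beta_mean(alpha_b, beta_b)
    sigma = (beta_var(alpha_a, beta_a) + beta_var(alpha_b, beta_b)) ** 0.5
    return NormalDist(mu, sigma).cdf(0.0)

def count_false_winners(experiments=400, visitors=500, rate=0.10,
                        threshold=0.95, seed=0):
    """Run independent A/A experiments and count apparent 'winners'."""
    rng = random.Random(seed)
    winners = 0
    for _ in range(experiments):
        # Both arms convert at the same true rate, so any "win" is noise.
        conv_a = sum(rng.random() < rate for _ in range(visitors))
        conv_b = sum(rng.random() < rate for _ in range(visitors))
        # Uniform Beta(1, 1) prior: add 1 to successes and failures.
        p = probability_b_beats_a(conv_a + 1, visitors - conv_a + 1,
                                  conv_b + 1, visitors - conv_b + 1)
        if p > threshold:
            winners += 1
    return winners

print(count_false_winners(), "of 400 A/A comparisons exceeded 0.95")
```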

vog, almost 11 years ago
Great work! However, an additional *testing* section would be nice (right below the *implementation* section).

That section should provide two or three lists of example input values, and the expected output value (up to some accuracy).

As the author notes, although this formula looks pretty simple, you can make a lot of numerical mistakes when implementing it. A test suite would help implementors catch those mistakes early.
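
A few such test cases can be worked out by hand for small parameters. The sketch below assumes the closed form is Pr(p_B > p_A) = Σ_{i=0}^{α_B−1} B(α_A+i, β_A+β_B) / ((β_B+i) B(1+i, β_B) B(α_A, β_A)) and evaluates it with `math.lgamma`. The expected values come from direct integration of the densities (for example, with p_A uniform and p_B having density 2x, Pr(p_B > p_A) = ∫ 2x·x dx = 2/3), not from the article. The function names and tolerances are illustrative:

```python
from math import exp, lgamma, log

def log_beta(a, b):
    """log B(a, b) via the log-gamma function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def probability_b_beats_a(alpha_a, beta_a, alpha_b, beta_b):
    """Closed-form Pr(p_B > p_A); alpha_b must be a positive integer."""
    total = 0.0
    for i in range(alpha_b):
        total += exp(
            log_beta(alpha_a + i, beta_a + beta_b)
            - log(beta_b + i)
            - log_beta(1 + i, beta_b)
            - log_beta(alpha_a, beta_a)
        )
    return total

# (alpha_a, beta_a, alpha_b, beta_b) -> expected Pr(p_B > p_A)
CASES = [
    ((1, 1, 1, 1), 1 / 2),  # identical uniform posteriors
    ((1, 1, 2, 1), 2 / 3),  # p_A uniform, p_B with density 2x
    ((2, 1, 1, 1), 1 / 3),  # the same pair with A and B swapped
]

for params, expected in CASES:
    assert abs(probability_b_beats_a(*params) - expected) < 1e-9, params

# Swapping A and B should give the complementary probability.
forward = probability_b_beats_a(30, 70, 40, 60)
backward = probability_b_beats_a(40, 60, 30, 70)
assert abs(forward + backward - 1.0) < 1e-9
```

The symmetry and complement checks are useful because they hold for any correct implementation, whatever the parameter values.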

hidden-markov, almost 11 years ago
Indeed, much faster than Monte Carlo integration:

```
5.778 evaluation.py:20(evaluate)  # Sampling version
0.043 evaluation.py:64(evaluate)  # Closed formula
```

(See my A/B testing library https://github.com/bogdan-kulynych/trials)

spitfire, almost 11 years ago
Here's the Mathematica version:

    PrBbeatsA[\[Alpha]a_, \[Beta]a_, \[Alpha]b_, \[Beta]b_] :=
      Sum[Beta[\[Alpha]a + i, \[Beta]b + \[Beta]a] /
          ((\[Beta]b + i) Beta[1 + i, \[Beta]b] Beta[\[Alpha]a, \[Beta]a]),
        {i, 0, \[Alpha]b - 1}]

psychometry, almost 11 years ago
A minor complaint: that formula calculates a probability (a number within the range [0,1]), rather than odds (a ratio of probabilities, p/(1−p)).