I’ve been working a lot with Bayes factors lately. I don’t want to sound cultish, but I think part of the issue is this stuff doesn’t work halfway. As soon as you’re talking about the null hypothesis and Bayes factors, you’re mixing up two schools of thought that don’t play nice.<p>Bayes factors work by comparing models. There is no null model. What, 0% effect? Ok, there was a non-zero effect. That model loses, since it puts the probability of 0% at 1 and everything else at 0. And if you do <i>anything else</i>, you’re encoding some amount of belief into the model, some judgment you’ve made.<p>So, you need to pick two models and compare them. I’m not saying this is right for science, but it’s working well for my purposes: one model meaning “as planned”, one model meaning “not as planned”, and the Bayes factor decides whether things are going as planned. But you do need to be explicit about what models you’re comparing. You have to be able to just put some data in and get a probability back, or it’s not going to work.<p>This is what makes this criticism of Bayes factors so unpersuasive: they’re very easy to calculate, but they’re never actually calculated here. A Bayes factor is just the ratio of marginal likelihoods, the probability of the data under each model.
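<p>Concretely, something like this toy sketch (the data, priors, and noise level are all invented, and the grid quadrature is just the simplest thing that works for a one-parameter model):<p><pre><code>import numpy as np
from scipy import stats

# Hypothetical daily measurements of some metric we care about.
data = np.array([0.9, 1.1, 1.4, 0.8, 1.2])
sigma = 0.5  # assumed measurement noise

def marginal_likelihood(data, prior_mean, prior_sd, sigma, n_grid=4001):
    """P(data | model): the likelihood averaged over the model's prior on mu."""
    mu = np.linspace(prior_mean - 8 * prior_sd, prior_mean + 8 * prior_sd, n_grid)
    prior = stats.norm.pdf(mu, prior_mean, prior_sd)
    lik = np.prod(stats.norm.pdf(data[:, None], mu[None, :], sigma), axis=0)
    return np.sum(lik * prior) * (mu[1] - mu[0])  # simple Riemann sum

# Two fully specified models for the mean:
#   "as planned":     mu ~ Normal(1.0, 0.2)
#   "not as planned": mu ~ Normal(0.0, 1.0)
m_planned = marginal_likelihood(data, prior_mean=1.0, prior_sd=0.2, sigma=sigma)
m_not     = marginal_likelihood(data, prior_mean=0.0, prior_sd=1.0, sigma=sigma)
print("Bayes factor (as planned / not as planned):", m_planned / m_not)
</code></pre>
Each model gives a number for “how probable was this data”, and the Bayes factor is just the ratio of the two. The point is that both sides have to be spelled out before you look at the data.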
The statistical interpretation of observations is so subtle and complex that it's a good idea to assume that any publication from the empirical sciences is complete garbage, until you know for sure that a qualified statistician has supervised the process. A semester of "introduction to statistical methods" (which is all the background that most scientists have) is <i>NOT</i> enough.<p>Imagine a mathematician writing a paper on a medical topic, making all kinds of claims on how things work in the human body – and then that mathematician justifies their expertise by saying "I did a two-week first aid course once, and also, I was really good at biology in school". This is pretty much how lots of science operates when it comes to interpreting results mathematically.
I screwed around with trying to compute Bayes factors for models of distributions over set partitions, having been led astray by Bayesian phylogenetic inference methods. It was a waste of time: in practice the epistemology was terrible, because the choice of prior distributions had such a huge effect on the model comparisons. On top of that, the computations were highly unstable, so I had to do a lot of fancy multi-temperature MCMC stuff that never quite worked.<p>Unless your priors are based on actual observations, stick with model selection approaches that are based on measured predictive power, or at least plausible approximations thereof, e.g. Aki Vehtari et al.'s LOO-CV (approximate leave-one-out cross-validation):<p><a href="https://avehtari.github.io/modelselection/" rel="nofollow noreferrer">https://avehtari.github.io/modelselection/</a><p><a href="https://mc-stan.org/loo/" rel="nofollow noreferrer">https://mc-stan.org/loo/</a>
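<p>The prior-sensitivity problem shows up even in much simpler settings than set partitions. A minimal sketch (all numbers invented): for a plain normal model with a point null against a normal prior on the effect, widening that prior swings the Bayes factor from mildly favoring the alternative to strongly favoring the null, on exactly the same data (the Lindley/Bartlett effect).<p><pre><code>import numpy as np
from scipy import stats

# Toy setup: observations ~ Normal(theta, 1), n = 50, observed sample mean 0.3.
n, ybar = 50, 0.3

# H0: theta = 0 exactly.  H1: theta ~ Normal(0, tau^2) for various tau.
# Both marginal likelihoods of the sample mean have closed forms.
m0 = stats.norm.pdf(ybar, loc=0, scale=np.sqrt(1 / n))
for tau in [0.1, 1.0, 10.0, 100.0]:
    m1 = stats.norm.pdf(ybar, loc=0, scale=np.sqrt(tau**2 + 1 / n))
    print(f"tau = {tau:6.1f}   BF(H1/H0) = {m1 / m0:.4f}")
</code></pre>
Nothing about the data or the sampling model changes between those lines; only the prior width does, and the conclusion flips.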
> wait until you understand Bayes Factors<p>I'm not sure that piece will help people to understand Bayes Factors: <a href="https://statmodeling.stat.columbia.edu/2019/09/10/i-hate-bayes-factors-when-theyre-used-for-null-hypothesis-significance-testing/#comment-1119100" rel="nofollow noreferrer">https://statmodeling.stat.columbia.edu/2019/09/10/i-hate-bay...</a><p>> In social science, theory alone will not deliver one [hypothesis to test]<p>I guess it's difficult to test a hypothesis when you don't really have one.
Bayesian methods are not easy to use, I agree. But that's because they're trying to answer much more meaningful (but harder) questions than frequentist ones, questions researchers <i>should</i> be trying to answer. You can't ignore Bayesian epistemology just by not using Bayesian methods. The underlying considerations in the Bayesian framework will inevitably become relevant to how you interpret your data, whether or not you use a formal Bayesian method.<p>The thing is, formally, frequentist methods like Null Hypothesis Significance Testing don't tell you what you really want to know. If you get a significant p-value, that means data as extreme as what you observed wouldn't often happen by chance (within your model of the null). This doesn't actually tell you whether your particular hypothesis should be favored. That requires other considerations, including ones that Simonsohn is negative about in this article.<p>For example, Simonsohn's conclusion says:<p>> To use Bayes factors to test hypotheses: you need to be OK with the following two things:<p>> 1. Accepting the null when “the alternative” you consider, and reject, does not represent the theory of interest.<p>> 2. Rejecting a theory after observing an outcome that the theory predicts.<p>He implies that these should be points against Bayes factors. But #2 is something you actually <i>should</i> do sometimes. Demonstrably. If the data suggests a wildly implausible effect size that doesn't show up consistently in other analyses, that <i>should</i> be a point against your theory and in favor of some more mundane explanation, like noisy data from an underpowered study [1].<p>Not using Bayesian methods is understandable if you don't feel comfortable with the very heavy demands they can make on your statistics acumen. But if you're, say, a social scientist incentivized to get "sexy" results and you refuse to <i>engage</i> with Bayesian epistemology at all, you will almost certainly just contribute more noise publications to the replication crisis.<p>[1] <a href="http://www.stat.columbia.edu/~gelman/presentations/ziff.pdf" rel="nofollow noreferrer">http://www.stat.columbia.edu/~gelman/presentations/ziff.pdf</a>
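<p>To put a number on the first point (with completely made-up but not crazy figures): suppose only 10% of the hypotheses a field tests are true, and studies run at alpha = 0.05 with 50% power. A significant result then still leaves close to a coin flip's chance that the hypothesis is false, and what moves that number is exactly the prior considerations the p-value never touches.<p><pre><code># Assumed numbers, purely for illustration.
base_rate, alpha, power = 0.10, 0.05, 0.50

p_sig = base_rate * power + (1 - base_rate) * alpha  # P(significant result)
p_true_given_sig = base_rate * power / p_sig         # Bayes' rule
print(f"P(hypothesis true | p < .05) = {p_true_given_sig:.2f}")  # roughly 0.53
</code></pre>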
If the minimum wage is increased $4, the competing explanations seem to be:<p>1. Change in unemployment is normally distributed with mean 0% and standard deviation 0.606%.<p>2. Change in unemployment is uniformly distributed between 1% and 10%.<p>I don't really agree that "(1) vs (2)" is a particularly good formulation of the original question ("Would raising the minimum wage by $4 lead to greater unemployment?"). But if it were, how would the math work out?<p>If we observe that unemployment increases 1%, then yes, that piece of evidence is very slightly in favor of explanation (1). This doesn't feel weird or paradoxical to me. But surely we wouldn't want to decide the matter based just on that one inconclusive data point? Instead we would want to look at another instance of the same situation. In that case, an increase of, say, 6% would (almost) conclusively settle the matter in favor of (2), and an increase of, say, 0.8% would (absolutely) conclusively settle the matter in favor of (1).
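<p>Checking the arithmetic, taking the two distributions at face value: at +1% the density is about 0.17 under (1) versus 1/9 ≈ 0.11 under (2), so roughly 1.5 to 1 in favor of (1); at +6% the ratio flips overwhelmingly toward (2); at +0.8% model (2) assigns zero density, so (1) wins outright.<p><pre><code>from scipy import stats

m1 = stats.norm(loc=0, scale=0.606)   # (1) Normal(0%, 0.606%)
m2 = stats.uniform(loc=1, scale=9)    # (2) Uniform(1%, 10%)

for obs in [1.0, 6.0, 0.8]:
    d1, d2 = m1.pdf(obs), m2.pdf(obs)
    ratio = d1 / d2 if d2 > 0 else float("inf")
    print(f"observed +{obs}%:  density under (1) = {d1:.3g}, "
          f"under (2) = {d2:.3g}, ratio (1)/(2) = {ratio:.3g}")
</code></pre>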
So you have just one data point and you want to do statistics with it? No matter what you do, the results won't be useful.<p>In the Bayesian approach, you start with some distribution that is a wild guess and doesn't even need to be based on anything beyond the basics of how money works and the fact that unemployment can't be 0% or 100%. Each data point then refines your distribution, until at some dataset size it converges to something that estimates reality.<p>You might want to watch an amazingly helpful introduction by Richard McElreath here <a href="https://www.youtube.com/watch?v=guTdrfycW2Q">https://www.youtube.com/watch?v=guTdrfycW2Q</a>
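<p>The mechanics of that refinement are easy to sketch. All the numbers below are invented, and the conjugate Normal model is chosen purely because its update has a closed form:<p><pre><code>import numpy as np

mu, var = 0.0, 25.0    # vague prior on the unemployment change: Normal(0, 5^2), in percentage points
obs_var = 1.0          # assumed noise on each observation

observations = [1.2, 0.4, 0.9, 1.1]   # hypothetical observed changes, in %
for y in observations:
    var_new = 1 / (1 / var + 1 / obs_var)     # standard conjugate Normal update
    mu = var_new * (mu / var + y / obs_var)
    var = var_new
    print(f"after y = {y:+.1f}%:  posterior mean {mu:.2f}%, sd {np.sqrt(var):.2f}%")
</code></pre>
After one observation the posterior is still mostly that one data point; after a handful it has tightened up considerably.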
> Note: By theory I merely mean the rationale for investigating the effect of x on y. A theory can be as simple as “I think people value a mug more once they own it”.<p>Hoo boy, the [2019] is well deserved on this one -- that's a Dan Ariely reference from before The 2021 Accusation and before the recent NPR story refuting his excuse[1].<p>[1]: <a href="https://www.npr.org/2023/07/27/1190568472/dan-ariely-francesca-gino-harvard-dishonesty-fabricated-data" rel="nofollow noreferrer">https://www.npr.org/2023/07/27/1190568472/dan-ariely-frances...</a>
p-values aren't problematic. How people use them is.<p>Same with Bayes factors. I've seen people claim "anything above 3 is significant".<p>Incidentally, the theory behind p-values is actually beautiful, and they generalise much further than most people realise.<p>E.g., did you know that you can have "Bayesian" p-values? (in the sense that the p-value can be designed to take priors and other models into account, without violating its definition in any way)
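<p>One standard construction along those lines (not necessarily the only thing meant here) is the posterior predictive p-value: simulate replicated datasets from the posterior predictive distribution and see how often a chosen test statistic is at least as extreme as the observed one. A minimal sketch, with the model, prior, and test statistic all assumed for illustration:<p><pre><code>import numpy as np

rng = np.random.default_rng(0)

# Assumed model: y_i ~ Normal(theta, 1), prior theta ~ Normal(0, 10^2).
y = rng.normal(0.5, 1.0, size=30)        # stand-in for the observed data
prior_mu, prior_var, obs_var = 0.0, 100.0, 1.0

# Exact conjugate posterior for theta.
post_var = 1 / (1 / prior_var + len(y) / obs_var)
post_mu = post_var * (prior_mu / prior_var + y.sum() / obs_var)

# Test statistic: the largest absolute observation.
T_obs = np.max(np.abs(y))
n_rep, exceed = 5000, 0
for _ in range(n_rep):
    theta = rng.normal(post_mu, np.sqrt(post_var))
    y_rep = rng.normal(theta, np.sqrt(obs_var), size=len(y))
    exceed += np.max(np.abs(y_rep)) >= T_obs

print("posterior predictive p-value:", exceed / n_rep)
</code></pre>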
Milton Friedman was correct here: since the true minimum wage is $0.00 (unemployment), comparing a wage increase against the null hypothesis is exactly what he should have done. The potshot in the opening paragraph ("Milton feels bad about the unemployed but good about his theory.") is simultaneously an appeal to emotion and a presumptuous ad hominem.