In my experience, I've never found a case where I would use Brier scores over cross-entropy (the Bernoulli/binomial log likelihood). Does anybody know a concrete example where you would prefer Brier?
For folks who want to try the kind of forecasting being discussed here, Metaculus is a pretty great community: <a href="https://www.metaculus.com/" rel="nofollow">https://www.metaculus.com/</a><p>Their FAQ has a great explanation of how they 'score' user forecasts, including a summary of Brier scores for binary yes/no questions and of the log score used for both binary and continuous questions: <a href="https://www.metaculus.com/help/faq/#howscore" rel="nofollow">https://www.metaculus.com/help/faq/#howscore</a>
I'm not (yet) using a scoring rule for my work-in-progress uncertainty test[1] of calibration, only Beta posteriors, which are also a neat way of presenting the results of many predictions.<p>I am slightly more fond of log scoring than the Brier score, though, for the reason mentioned in another comment: being confidently wrong is often far worse than being slightly wrong, and should be penalised harder numerically.<p>[1]: <a href="https://static.loop54.com/uncertainty-test.html" rel="nofollow">https://static.loop54.com/uncertainty-test.html</a><p>(By the way, I built this to practise on myself, but I ran into a problem: I know the answers to all the propositions, having written them myself. If anyone wants to contribute propositions, please contact me and I'll ask for them in a specific format so I can paste them in blindly without knowing which ones are true.)
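To make the "penalised harder" point concrete, here is a minimal Python sketch comparing the two rules on a single binary forecast. The function names `brier_score` and `log_score` are my own, not taken from Metaculus or the linked test, and I am assuming the usual definitions: squared error for Brier, and the negative log of the probability assigned to what actually happened for the log score (lower is better for both).

```python
import math

def brier_score(p, outcome):
    """Squared error between forecast probability p and the 0/1 outcome."""
    return (p - outcome) ** 2

def log_score(p, outcome):
    """Negative log likelihood of the outcome under a Bernoulli(p) forecast."""
    # Probability the forecast assigned to what actually happened.
    prob_of_outcome = p if outcome == 1 else 1.0 - p
    if prob_of_outcome == 0.0:
        # A forecast of certainty that turns out wrong gets an infinite penalty.
        return math.inf
    return -math.log(prob_of_outcome)

# The event did NOT happen (outcome = 0); the forecasts get more confidently wrong.
for p in (0.6, 0.9, 0.99, 1.0):
    print(f"p={p:<5} brier={brier_score(p, 0):.4f}  log={log_score(p, 0):.4f}")
```

The Brier penalty can never exceed 1, while the log penalty grows without bound as the forecast approaches certainty in the wrong answer. That same unboundedness is probably the most common practical argument for Brier: one 0% or 100% forecast that turns out wrong ruins an average log score but only costs 1 point under Brier.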