Machine Learning Reproducibility Checklist [pdf]

160 points by sonabinu over 5 years ago

4 comments

Abishek_Muthian over 5 years ago
This is a useful checklist; it reminded me of a recent topic, 'Whether AI is the end of the Scientific Method', on Babbage from Economist Radio [1].

The arguments were that in ML/DL, experiments are run at large scale without a hypothesis, with radical empiricism in a trial-and-error fashion, which goes against the Scientific Method, i.e. Hypothesis, Experiment, Observation, Theory.

[1] https://soundcloud.com/theeconomist/babbage-ai-the-end-of-the
sillysaurusx over 5 years ago
This checklist has some flaws. Most interesting results in ML have no proof.

For example, can you give a proof of superconvergence? What's the exact learning rate that causes it, and why? Did you know that you can often get away with a high learning rate for a time, and then divergence happens? What's the proof of that?

Give a proof that under all circumstances and wind conditions, lowering your airplane's flaps by 5 degrees will help you land safely.

Also, what about datasets that you're not allowed to release? I personally despise such datasets, but I found myself in the ironic position of having a 10GB dataset dropped in my lap that was a perfect fit for my current project. Unfortunately it wasn't until after training was mostly complete that we realized we hadn't asked whether the author was comfortable releasing it, and indeed the answer was no. So what to do? Just not talk about it?

I guess the list is good as a set of ideals to aim for. I just wish some consideration were given to the fact that you often can't meet all of those goals.

Most of OpenAI's work would be excluded by this checklist. I don't think anyone would argue that OpenAI doesn't do important work, or that their results aren't in some sense reproducible.
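(The "superconvergence" mentioned above is usually discussed in the context of one-cycle learning-rate schedules: ramp the rate up to a deliberately high peak, then anneal it back down. A minimal sketch of such a schedule is below; the specific rate values and schedule shape are illustrative assumptions, not anything prescribed by the comment or the checklist.)

```python
import math

def one_cycle_lr(step, total_steps, peak_lr=1.0, start_lr=0.04, final_lr=0.001):
    """Sketch of a one-cycle learning-rate schedule (values are illustrative)."""
    half = total_steps // 2
    if step < half:
        # linear warm-up toward the (deliberately high) peak rate
        return start_lr + (peak_lr - start_lr) * step / half
    # cosine anneal from the peak back down to a small final rate
    progress = (step - half) / (total_steps - half)
    return final_lr + (peak_lr - final_lr) * 0.5 * (1 + math.cos(math.pi * progress))

if __name__ == "__main__":
    total = 1000
    for s in (0, 250, 500, 750, 999):
        print(s, round(one_cycle_lr(s, total), 4))
```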
YeGoblynQueenne over 5 years ago
This gives a little more context:

https://www.nature.com/articles/d41586-019-03895-5
DrNuke over 5 years ago
This is aimed at production or critical applications, though, not at forefront or blue-sky research. In the former case we need a shared and agreed framework to make sure everyone, everywhere, gets statistically comparable results, and this checklist helps in that sense. In the latter case it is open field: we look for rough agreement between results first, and a method that fits those concordant results is devised later.
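(One concrete way to make results "statistically comparable" in the checklist's sense is to report a central tendency and spread over several random seeds rather than a single best run. A minimal sketch follows; `train_and_evaluate` is a hypothetical stand-in for a real training run, and the seed count is arbitrary.)

```python
import random
import statistics

def train_and_evaluate(seed: int) -> float:
    """Hypothetical stand-in for a full training run; returns a test metric."""
    random.seed(seed)
    return 0.90 + random.uniform(-0.02, 0.02)  # placeholder accuracy

scores = [train_and_evaluate(seed) for seed in range(5)]
print(f"accuracy: {statistics.mean(scores):.3f} "
      f"+/- {statistics.stdev(scores):.3f} over {len(scores)} seeds")
```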