The scikit-learn cargo cults

34 points by duckerude, about 4 years ago

5 comments

gleenn, about 4 years ago
The author's beef seems to be "people use similar terminology across similar libraries/frameworks/platforms but they don't behave identically and represent subtly different things". Maybe I don't do enough data scienceing, but isn't this super common? Like, if I write a parser... I'd probably call the main function "parse", or if I'm writing a database connector, I'd probably call the function "connect" to do the connecting. I personally wouldn't expect those to work identically or mean the same exact abstraction. I personally love when things are named similarly so I can grok the meaning in a new codebase more quickly, even if things don't transfer identically.
rubatuga, about 4 years ago
The author doesn’t really know what a cargo cult is. It means imitating what other groups do and expecting an unrealistically positive result. You have to prove not only that other ML libraries were imitating sklearn, but also that copying it wasn’t useful. Like another commenter said, “fit” and “predict” are simply common function names that easily convey meaning. They certainly have the positive effect of letting me know what the functions do. If that’s cargo culting, then so is any program that has a “main” or “init” function with different arguments. Also, to refute their last point, PyTorch has no fit function because it is too low level, not because its developers aren’t trying to cargo cult.
huac, about 4 years ago
SKL being first does not afford it a monopoly on ML object design. Nor should other libraries necessarily seek to emulate what came first (or support pickling...)
gyrovagueGeist, about 4 years ago
Huh, didn’t think I’d see the writer of The Northern Caves on the top of HN.

Back to this post: I’ve written some nearest neighbor code and definitely felt some pressure to make the API sklearn compatible. But I don’t think it’s as bad as the post claims in practice.

Highly recommend checking out the poster’s other work. It’s a lot of fun.
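For readers unfamiliar with the convention this thread keeps returning to, here is a minimal sketch of what "sklearn-compatible" naming might look like for nearest neighbor code. It is not the commenter's code: the class name `TinyNearestNeighbor`, the sample data, and the choice of Euclidean distance are all hypothetical, and it uses only the Python standard library rather than scikit-learn itself.

```python
import math


class TinyNearestNeighbor:
    """Hypothetical 1-nearest-neighbor classifier with sklearn-style method names."""

    def fit(self, X, y):
        # Memorize the training points. Returning self allows chained calls,
        # which is the convention scikit-learn estimators follow.
        if len(X) != len(y):
            raise ValueError("X and y must have the same length")
        self.X_ = [tuple(row) for row in X]
        self.y_ = list(y)
        return self

    def predict(self, X):
        # For each query point, return the label of the closest training point.
        return [self._predict_one(tuple(row)) for row in X]

    def _predict_one(self, point):
        # Euclidean distance from the query to every memorized point.
        distances = [math.dist(point, train) for train in self.X_]
        return self.y_[distances.index(min(distances))]


# Illustrative data only: two labeled points and two queries.
model = TinyNearestNeighbor().fit([[0.0, 0.0], [10.0, 10.0]], ["low", "high"])
predictions = model.predict([[1.0, 2.0], [9.0, 8.0]])
print(predictions)
```

The shared method names tell a reader roughly what each call does, but nothing here guarantees the class behaves like a real scikit-learn estimator in other respects, such as parameter handling or pickling.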
jwilber, about 4 years ago
“Sagemaker “Estimators” do not have anything to do with fitting or predicting anything. The SDK is not supplying you with any machine learning code here.”

The author is confusing the SageMaker service with the MXNet deep learning library (which SageMaker provides access to). Basically everything they wrote in that section is flat-out incorrect.