Show HN: Compare AutoML frameworks on 10 Tabular Kaggle competitions

2 points by pplonski86, over 4 years ago
I'm working on an AutoML system for tabular datasets. It is called MLJAR and is available as open source, with the code on GitHub: https://github.com/mljar/mljar-supervised

I've compared my AutoML with other systems on 10 tabular datasets from Kaggle. The final result is the Percentile Rank on the Private Leaderboard (evaluated by Kaggle). The results for systems other than MLJAR are from the AutoGluon paper.

    Dataset       Auto-WEKA  auto-sklearn  TPOT   H2O AutoML  GCP-Tables  AutoGluon  MLJAR
    ------------  ---------  ------------  -----  ----------  ----------  ---------  -----
    porto         0.158      0.331         0.315  0.406       0.434       0.462      0.540
    allstate      0.124      0.310         0.237  0.352       0.74        0.706      0.764
    mercedes      0.160      0.444         0.547  0.363       0.658       0.169      0.879
    otto          0.145      0.717         0.597  0.729       0.821       0.988      0.924
    satisfaction  0.235      0.408         0.495  0.740       0.763       0.823      0.975
    bnp-paribas   0.193      0.412         0.460  0.417       0.440       0.986      0.986

[Four rows had blank cells in the original table. The column positions of their values were lost, so the values are listed here in their original left-to-right order.]

    ieee-fraud:  0.119 0.349 0.119 0.322 0.172
    value:       0.114 0.319 0.325 0.377 0.415 0.445
    walmart:     0.390 0.379 0.398 0.384 0.423
    transaction: 0.131 0.329 0.326 0.404 0.406 0.463

The higher the value, the better. The 1st place solution in a Kaggle competition gets a Percentile Rank of 1.0. You can see that some AutoML frameworks reach the Top-10% of a competition (without any human help)!

I think that my AutoML system is quite advanced:

- it can generate new features with K-Means or Golden Features Search

- it has many ML algorithms available, and can tune and train them (with early stopping if applicable) within a selected time budget

- it can stack models in complex ensembles

- it creates interpretations for ML models: SHAP plots, permutation-based importance, decision tree visualizations ...

- it automatically generates documentation in Markdown or HTML (works like a dream in a Jupyter notebook)

I hope that many data scientists will benefit from my AutoML system. I put a lot of effort into it.
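To make the metric concrete, here is a minimal sketch of how a Percentile Rank could be computed from a leaderboard position. The post only states that 1st place maps to 1.0 and that higher is better. The exact formula below, `1 - (rank - 1) / n_teams`, and the function name `percentile_rank` are assumptions for illustration, and the AutoGluon paper may define the metric slightly differently. The second part reads "Top-10%" as a Percentile Rank of at least 0.9, which is also an assumption, and applies it to the MLJAR scores quoted in the post for the six datasets that have a value in every column.

```python
def percentile_rank(rank: int, n_teams: int) -> float:
    """Map a leaderboard position (1 = best) to a score in (0, 1].

    Assumed formula, not taken from the post: the share of teams
    that are not ranked strictly above this position.
    """
    if n_teams < 1 or not 1 <= rank <= n_teams:
        raise ValueError("rank must satisfy 1 <= rank <= n_teams")
    return 1.0 - (rank - 1) / n_teams


# MLJAR scores quoted in the post, for the rows with a value in every column.
mljar_scores = {
    "porto": 0.540,
    "allstate": 0.764,
    "mercedes": 0.879,
    "otto": 0.924,
    "satisfaction": 0.975,
    "bnp-paribas": 0.986,
}

# Assumption: "Top-10%" means a Percentile Rank of at least 0.9.
TOP_10_THRESHOLD = 0.9
top_10_datasets = [
    name for name, score in mljar_scores.items() if score >= TOP_10_THRESHOLD
]

if __name__ == "__main__":
    print(percentile_rank(1, 1000))    # first place
    print(percentile_rank(501, 1000))  # middle of the leaderboard
    print(top_10_datasets)
```

Under these assumptions, a score of 0.924 on otto would correspond to finishing ahead of roughly 92% of the teams on the Private Leaderboard.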

1 comment

pplonski86, over 4 years ago
The clickable URL for the MLJAR AutoML GitHub repo: https://github.com/mljar/mljar-supervised