I'm working on an AutoML system for tabular datasets. It is called MLJAR and is open source, with the code on GitHub: https://github.com/mljar/mljar-supervised<p>I've compared my AutoML with other systems on 10 tabular datasets from Kaggle. The score reported is the Percentile Rank on the Private Leaderboard (evaluated by Kaggle). The results for systems other than MLJAR are taken from the AutoGluon paper.<p><pre><code> Dataset Auto-WEKA auto-sklearn TPOT H2O AutoML GCP-Tables AutoGluon MLJAR
-------------- ----------- -------------- ------- ------------ ------------ ----------- -------
ieee-fraud 0.119 0.349 0.119 0.322 0.172
value 0.114 0.319 0.325 0.377 0.415 0.445
walmart 0.390 0.379 0.398 0.384 0.423
transaction 0.131 0.329 0.326 0.404 0.406 0.463
porto 0.158 0.331 0.315 0.406 0.434 0.462 0.540
allstate 0.124 0.310 0.237 0.352 0.74 0.706 0.764
mercedes 0.160 0.444 0.547 0.363 0.658 0.169 0.879
otto 0.145 0.717 0.597 0.729 0.821 0.988 0.924
satisfaction 0.235 0.408 0.495 0.740 0.763 0.823 0.975
bnp-paribas 0.193 0.412 0.460 0.417 0.440 0.986 0.986
</code></pre>
The higher the value, the better. The 1st-place solution in a Kaggle competition gets a Percentile Rank equal to 1.0. You can see that some AutoML frameworks jump into the top 10% of a competition (without any human help)!<p>I think that my AutoML system is quite advanced:<p>- it can generate new features with K-Means or Golden Features Search,<p>- it has many ML algorithms available, and can tune and train them (with early stopping where applicable) within a selected time budget,<p>- it can stack models in complex ensembles,<p>- it creates interpretations for ML models: SHAP plots, permutation-based importance, decision tree visualizations ...<p>- it automatically generates documentation in Markdown or HTML (works like a dream in a Jupyter notebook).<p>I hope that many data scientists will benefit from my AutoML system. I put a lot of effort into it.
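To make the Percentile Rank score concrete, here is a minimal sketch of how a leaderboard position could be mapped to that scale. The exact formula is my assumption (the post only states that 1st place scores 1.0 and that higher is better); the function name and the choice to give last place 0.0 are hypothetical.

```python
def percentile_rank(rank: int, n_teams: int) -> float:
    """Map a leaderboard position (1 = best) to the range [0, 1].

    Assumed formula: first place -> 1.0, last place -> 0.0, linear in between.
    """
    if n_teams < 2:
        raise ValueError("need at least two teams on the leaderboard")
    if not 1 <= rank <= n_teams:
        raise ValueError("rank must be between 1 and n_teams")
    # Fraction of the other teams that finished below this position.
    return (n_teams - rank) / (n_teams - 1)


if __name__ == "__main__":
    # Hypothetical leaderboard with 1001 teams.
    for position in (1, 101, 1001):
        print(position, percentile_rank(position, 1001))
```

Under this assumed formula, a score above 0.9 corresponds to finishing in roughly the top 10% of the leaderboard.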
Clickable URL for the MLJAR AutoML GitHub repo: <a href="https://github.com/mljar/mljar-supervised" rel="nofollow">https://github.com/mljar/mljar-supervised</a>
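For anyone who wants to try the features listed above, here is a rough sketch of how they might be switched on. The parameter names are the `AutoML` constructor options as I understand them from the mljar-supervised documentation, so check them against your installed version. The time limit, the results folder, and the helper function name are placeholders of my own.

```python
# Illustrative settings only: the values are arbitrary placeholders.
AUTOML_SETTINGS = {
    "mode": "Compete",            # preset aimed at competition-style accuracy
    "total_time_limit": 3600,     # training time budget in seconds (placeholder)
    "golden_features": True,      # Golden Features Search
    "kmeans_features": True,      # K-Means based features
    "stack_models": True,         # stack models into ensembles
    "explain_level": 2,           # importance, SHAP plots, tree visualizations
    "results_path": "AutoML_demo",  # hypothetical output folder for the reports
}


def train_and_predict(X_train, y_train, X_test, settings=AUTOML_SETTINGS):
    """Hypothetical helper: fit AutoML on training data and predict on test data."""
    # Imported lazily so this module loads even without the package installed
    # (pip install mljar-supervised).
    from supervised.automl import AutoML

    automl = AutoML(**settings)
    automl.fit(X_train, y_train)
    return automl.predict(X_test)
```

`X_train` and `X_test` are expected to be tabular data such as pandas DataFrames, and `y_train` the target column. The generated Markdown report should end up in the folder given by `results_path`.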