TechEcho

7 comments

king_magic almost 6 years ago

I 100% agree with the premise of this, and I can attest that run tracking is a critical component of my own model development/training/tuning pipeline. Not sure how I feel about using yet another 3rd party data science platform though - it'd be nice if the run tracking piece was just a component I could import into my notebook and have it automagically track things for me (like... how seamless tqdm is for showing progress bars while iterating over loops). Regardless of my uncertainty around trying another 3rd party platform, the premise is spot on.
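To make the tqdm comparison concrete, here is a rough sketch of what an importable, wrap-your-loop tracker could look like. Everything in it is an assumption for illustration: `track`, the `runs.jsonl` file name, and the record layout are made up here and do not come from any particular platform or library.

```python
import json
import time
from pathlib import Path


def track(iterable, log_path="runs.jsonl", **params):
    """Wrap an iterable, tqdm-style, and yield (item, log) pairs.

    Hypothetical helper: each log(**metrics) call appends one JSON line
    to log_path, alongside start/end records for the run.
    """
    run_id = time.strftime("%Y%m%d-%H%M%S")
    path = Path(log_path)

    def write(record):
        # Append-only JSON lines, so a crashed run still leaves a partial log.
        with path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    write({"run": run_id, "event": "start", "params": params})
    for step, item in enumerate(iterable):
        # Bind the current step as a default so each log() records its own step.
        def log(step=step, **metrics):
            write({"run": run_id, "event": "step", "step": step, "metrics": metrics})

        yield item, log
    # Only reached if the loop runs to completion (not after an early break).
    write({"run": run_id, "event": "end"})


# Usage: wrap the training loop much like you would with tqdm.
for epoch, log in track(range(3), lr=0.01, batch_size=32):
    loss = 1.0 / (epoch + 1)  # stand-in for a real training step
    log(loss=loss)
```

This only covers the "log what happened" half. Capturing code version, environment, and data identity is where the real tools earn their keep.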

asdfman123 almost 6 years ago

This is an ad, but they're not wrong. At my work, we use Azure ML Studio. I think the solution for deployment is to run some scripts that save the model information to git and automatically deploy from there. It will take a little bit of effort to set up, but I think it should work.
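As a rough idea of what the "save the model information" script might do, here is a minimal sketch that writes a small JSON manifest next to a model file so the manifest (not the binary) can be committed. It is not Azure ML Studio's API; `write_model_manifest`, the manifest fields, and the file names are all hypothetical, and the git commit and deploy steps are left out.

```python
import hashlib
import json
from pathlib import Path


def write_model_manifest(model_path, manifest_path, **info):
    """Write a JSON manifest describing a model file (hypothetical helper).

    The manifest is small and text-based, so it diffs cleanly in git,
    while the hash ties it to one exact model artifact.
    """
    data = Path(model_path).read_bytes()
    manifest = {
        "model_file": Path(model_path).name,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "info": info,  # free-form metadata, e.g. hyperparameters or metrics
    }
    # Sorted keys keep the file stable between runs, which keeps git diffs small.
    text = json.dumps(manifest, indent=2, sort_keys=True) + "\n"
    Path(manifest_path).write_text(text, encoding="utf-8")
    return manifest
```

Something like `write_model_manifest("model.pkl", "model.json", lr=0.01, val_acc=0.93)` would run after training, with a deploy job triggering whenever the committed manifest changes.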

lukas almost 6 years ago

I totally agree with this and I built wandb (wandb.com) to solve this problem. We try to do this in as lightweight a way as possible - for example we can do keras tracking with a single line (https://www.wandb.com/articles/visualize-keras-models-with-one-line-of-code) and pytorch with just a couple lines (https://www.wandb.com/articles/monitor-your-pytorch-models-with-five-extra-lines-of-code). Would love any feedback on it.

woeirua almost 6 years ago

I understand that for some people this might help, but frankly I find all of these "reproducibility" frameworks fall flat as soon as truly big data enters the picture. Data versioning is not sufficient, because I typically cannot roll back my datasets to a previous version (and we moved forward for a reason). Also, we are deliberately not using Databricks for this, to avoid vendor lock-in for something that will almost certainly be open-source soon.

gyre007 almost 6 years ago

So run tracking as described here is about tracking every "variable" which comes into play when training your model?

andbberger almost 6 years ago

Seems like there's been an explosion of startups trying to win B2B dollars for this. There is an excellent open source project that nails this called sacred. It's not perfect, but it works, and as far as I can tell it has won the popularity contest. Please join me in using and contributing back to sacred!

visarga almost 6 years ago

Does it also do hyper-parameter search? It's usually something you want to have.

Run tracking liberates ML teams
