Run tracking liberates ML teams

35 points by lewq almost 6 years ago

7 comments

king_magic almost 6 years ago
I 100% agree with the premise of this, I certainly attest to run tracking being a critical component of my own personal model development/training/tuning pipeline. Not sure how I feel about using yet another 3rd party data science platform though - it'd be nice if the run tracking piece was just a component I could import into my notebook, and have it automagically track things for me (like... how seamless tqdm is for showing progress bars while iterating over loops).

Regardless of my uncertainty around trying another 3rd party platform, the premise is spot on.
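For illustration only, here is a minimal sketch of what such an importable, tqdm-style tracker could look like. `TrackedRun`, its `log` method, and the `runs.jsonl` file name are hypothetical names made up for this example, not the API of any platform mentioned in the thread.

```python
import json
import time
from pathlib import Path


class TrackedRun:
    """Hypothetical tqdm-style wrapper: iterate as usual, log metrics as you go."""

    def __init__(self, iterable, params, log_path="runs.jsonl"):
        self.iterable = iterable
        self.params = params            # hyperparameters for this run
        self.log_path = Path(log_path)  # assumed JSON-lines log, one run per line
        self.metrics = []

    def log(self, **values):
        # Record arbitrary metrics (loss, accuracy, ...) with a timestamp.
        self.metrics.append({"time": time.time(), **values})

    def __iter__(self):
        started = time.time()
        try:
            for item in self.iterable:
                yield item
        finally:
            # Append one record per run when the loop finishes or is interrupted.
            record = {
                "started": started,
                "finished": time.time(),
                "params": self.params,
                "metrics": self.metrics,
            }
            with self.log_path.open("a") as fh:
                fh.write(json.dumps(record) + "\n")
```

In a notebook this would be used much like tqdm: wrap the epoch loop with `run = TrackedRun(range(epochs), params={"lr": 0.01})`, iterate with `for epoch in run:`, and call `run.log(loss=...)` inside the loop.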
asdfman123 almost 6 years ago
This is an ad, but they're not wrong.

At my work, we use Azure ML Studio. I think the solution for deployment is to run some scripts to save the model information to git and automatically deploy from there. It will take a little bit of effort to set up, but I think it should work.
lukas almost 6 years ago
I totally agree with this and I built wandb (wandb.com) to solve this problem. We try to do this in as lightweight a way as possible - for example we can do Keras tracking with a single line (https://www.wandb.com/articles/visualize-keras-models-with-one-line-of-code) and PyTorch with just a couple of lines (https://www.wandb.com/articles/monitor-your-pytorch-models-with-five-extra-lines-of-code). Would love any feedback on it.
woeirua almost 6 years ago
I understand that for some people this might help, but frankly I find all of these "reproducibility" frameworks fall flat as soon as truly big data enters the picture. Data versioning is not sufficient, because I typically cannot roll back my datasets to a previous version (and we moved forward for a reason).

Also, we are deliberately not using Databricks for this to avoid vendor lock-in for something that will almost certainly be open-source soon.
gyre007 almost 6 years ago
So run tracking as described here is about tracking every "variable" which comes into play when training your model?
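As a rough sketch of what "every variable" might mean in practice, a run record could bundle hyperparameters, the random seed, a code version, the environment, and a fingerprint of the input data. The linked article is not reproduced in this thread, so the field list and the `describe_run` helper below are assumptions chosen for illustration, not the article's definition.

```python
import hashlib
import json
import platform
import sys


def describe_run(hyperparams, seed, data_files, code_version="unknown"):
    """Hypothetical helper: collect the inputs that could influence a training run.

    data_files maps a name to the raw bytes of that input; code_version is
    whatever identifier the caller has available (for example a git commit hash).
    """
    record = {
        "hyperparams": hyperparams,
        "seed": seed,
        "code_version": code_version,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        # Content hashes identify which data was used without storing it.
        "data_sha256": {
            name: hashlib.sha256(content).hexdigest()
            for name, content in data_files.items()
        },
    }
    # Derive a short, deterministic id from the sorted record contents.
    payload = json.dumps(record, sort_keys=True).encode()
    record["run_id"] = hashlib.sha256(payload).hexdigest()[:12]
    return record
```

Two runs with identical inputs on the same machine get the same `run_id`, while changing any tracked value, such as the learning rate, produces a different one.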
andbberger almost 6 years ago
Seems like there's been an explosion of startups trying to win B2B dollars for this.

There is an excellent open source project that nails this called sacred. It's not perfect, but it works, and as far as I can tell it has won the popularity contest.

Please join me in using and contributing back to sacred!
visarga almost 6 years ago
Does it also do hyper-parameter search? It's something you usually want to have.