XManager: A framework for managing machine learning experiments

4 points by Wookai over 3 years ago

2 comments

Wookai over 3 years ago
Google employee here. XManager is one of the main ML experiment tracking/orchestration tools we use internally, and I'm pretty excited that it is now available for others to use!

In a nutshell, XManager allows you to:

- define an experiment, which is a collection of one or more work units (think combinations of hyperparameters)

- manage the different jobs/executables required to run this experiment (TPU workers, TensorBoard job, etc.)

- collect and display measurements from work units (loss, other metrics)

- keep a reproducible artifact which allows you to re-run the same experiment at any point in the future

See e.g. https://github.com/deepmind/xmanager/blob/main/examples/ for a few concrete examples of launcher scripts.

I wish they had included screenshots of the tool itself in the repo, I'll make that suggestion :).
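A minimal sketch of the experiment / work unit idea described above, in plain Python. It does not use XManager's API: the names `WorkUnit`, `Experiment`, and `sweep` are hypothetical and only illustrate how one experiment can fan out into one work unit per hyperparameter combination, each collecting its own measurements.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    """Hypothetical: one hyperparameter combination plus its measurements."""
    work_unit_id: int
    hyperparameters: dict
    measurements: list = field(default_factory=list)

    def log(self, step, **metrics):
        # Record metrics (e.g. loss) reported at a given training step.
        self.measurements.append({"step": step, **metrics})


@dataclass
class Experiment:
    """Hypothetical: a titled collection of work units."""
    title: str
    work_units: list = field(default_factory=list)

    def add(self, hyperparameters):
        # Work unit ids start at 1 and follow insertion order.
        work_unit = WorkUnit(len(self.work_units) + 1, dict(hyperparameters))
        self.work_units.append(work_unit)
        return work_unit


def sweep(grid):
    """Yield one dict per point in the Cartesian product of the grid."""
    keys = list(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


experiment = Experiment("toy-sweep")
for params in sweep({"learning_rate": [0.1, 0.01], "batch_size": [32, 64]}):
    experiment.add(params)
```

Here a grid of two learning rates and two batch sizes yields four work units. A real launcher would additionally package the executables and submit jobs for each work unit, which this sketch leaves out.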
dekhn over 3 years ago
It's great this is open sourced. This technology was key to enabling ML folks to scale up computation without having to deal with Borg and a bunch of other low-level systems.

It's one of the few systems in ML that I've used and thought "huh, this was well-designed and properly architected from the start".