
Ask HN: Building a powerful ML workstation vs. consuming it via cloud providers?

2 points | by tzhenghao | almost 7 years ago
I can't decide between buying hardware and assembling a "gaming" rig for deep learning applications, or consuming compute resources via a cloud provider. Which is the more economical option?

I'd prefer the latter, since I won't have to worry about maintaining hardware, but I've heard the prices are pretty high and that there's slowdown on some AWS GPU instances. Thoughts?
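For scale, here is a back-of-envelope break-even calculation in Python. Every number in it (rig cost, power draw, electricity rate, cloud hourly rate) is an illustrative assumption, not a real quote; plug in current prices for your own case.

    # Back-of-envelope break-even: owned rig vs. renting a cloud GPU.
    # All numbers below are illustrative assumptions, not real quotes.
    RIG_COST = 2500.0           # assumed one-time build cost, USD
    RIG_POWER_KW = 0.45         # assumed draw under load, kW
    ELECTRICITY_PER_KWH = 0.12  # assumed utility rate, USD/kWh
    CLOUD_PER_HOUR = 0.90       # assumed single-GPU on-demand rate, USD/hr

    def breakeven_hours():
        """GPU-hours of utilization at which owning beats renting."""
        hourly_rig_cost = RIG_POWER_KW * ELECTRICITY_PER_KWH
        return RIG_COST / (CLOUD_PER_HOUR - hourly_rig_cost)

    hours = breakeven_hours()
    print(f"Break-even after ~{hours:,.0f} GPU-hours "
          f"(~{hours / (40 * 52):.1f} years at 40 hrs/week)")
    # With these assumptions: ~2,955 GPU-hours, roughly 1.4 years of
    # full-workweek training. Light or bursty usage favors the cloud.

The takeaway is that the answer hinges almost entirely on expected utilization, which is also the theme of the comments below.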

2 comments

Eridrus | almost 7 years ago
How much are you going to use your workstation for ML/games?

If you're just getting started and don't have a specific application in mind, you should just play with some smaller problems, use Colab, and/or use the $300 in GCP credits Google keeps advertising.

If you're going to be doing this for an extended period of time, then buying hardware will be better; you can get some good deals on second-hand high-end GPUs on eBay.

The main risk is that you buy a bunch of gear and then don't use it.
selljamhere | almost 7 years ago
TL;DR: Building your own rig is fun, but it can be a pain to maintain. There is a path on Google Cloud to start ML projects for free and scale up when you need to.

I think you're hitting the biggest upsides and downsides on the nose.

There is an innate joy to building your own rig (I'm guilty of it myself), and it does provide a fair amount of flexibility. The two main downsides that come to mind are the up-front cost of parts and the maintenance required to keep the system up to date. I've had success with a CentOS rig running nvidia-docker [1]. The annoyances mount over time, though, as OS and Nvidia driver releases can be a pain to keep up to date, especially if you're working with several of the major frameworks. This is the main reason I've moved most of my ML projects to the cloud.

Cloud has many benefits, including scalability and minimal maintenance. There are many options in the cloud ML space, but my current favorite leverages Google's cloud. I'll get into detail below, but the main reason is that they have several tools that form a natural progression and will grow with you as your projects mature.

First steps:

I'd recommend checking out Colab [2][3] to get your feet wet. It includes a FREE GPU instance and is a Jupyter-based notebook, so you can easily use any of the Python-based frameworks. (Half of a K80, if memory serves, with 12 GB of memory.) As with all free resources, there are some caveats. Colab is designed for interactive usage, so they don't encourage long-running processes. However, the general consensus online is that you can get around 12 hours of training in at a time before the instance restarts itself, which is plenty of time to kick off a job and save checkpoints to cloud storage for resuming later (sketched below). All in all, it's a great tool to get a project off the ground. One bit of strangeness (from a software engineer's perspective): the notebooks are stored in Google Drive. This adds an interesting possibility for collaboration (it's in the name, right?), but leaves something to be desired in terms of source control.

Growth:

If Colab isn't cutting it, there are a couple of next steps. If you need more compute power, or have long-running training jobs, the Colab UI can actually be run locally, backed by your own GCE instance (with attached GPUs) [4]. If you need additional services (databases, map/reduce, etc.), Datalab [5] is Colab's bigger, beefier cousin, with all sorts of integrations across the Google Cloud Platform as well as third-party providers.

EDIT: Link formatting

[1] https://github.com/NVIDIA/nvidia-docker

[2] https://colab.research.google.com/notebooks/welcome.ipynb#recent=true

[3] https://research.google.com/colaboratory/faq.html

[4] https://research.google.com/colaboratory/local-runtimes.html

[5] https://cloud.google.com/datalab/
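A minimal sketch of the checkpoint-and-resume pattern described above, assuming a recent TensorFlow/Keras and Google Drive as the persistent store. The tiny MNIST model is a stand-in for your own job, and the checkpoint path is hypothetical.

    import os
    import tensorflow as tf
    from google.colab import drive

    drive.mount('/content/drive')  # Drive persists across runtime restarts
    CKPT = '/content/drive/MyDrive/run1/model.keras'  # hypothetical path

    # Toy model and data so the sketch runs end to end; swap in your own.
    (x, y), _ = tf.keras.datasets.mnist.load_data()
    x = x.reshape(-1, 784).astype('float32') / 255.0

    if os.path.exists(CKPT):
        model = tf.keras.models.load_model(CKPT)  # resume after a restart
    else:
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(784,)),
            tf.keras.layers.Dense(128, activation='relu'),
            tf.keras.layers.Dense(10, activation='softmax'),
        ])
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy')

    os.makedirs(os.path.dirname(CKPT), exist_ok=True)
    model.fit(x, y, epochs=5,
              callbacks=[tf.keras.callbacks.ModelCheckpoint(CKPT)])

If the instance recycles mid-run, rerunning the cell reloads the last saved epoch's weights from Drive rather than starting from scratch.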