I've been working on a deep learning side project that needs GPU acceleration at inference time. My inference requests are very bursty, though, so I don't want a box sitting idle while waiting for requests. I can tolerate a short wait (<1 min) before getting results back. Most of the options I'm seeing assume either full batch processing or a persistently hosted box that sits idle between requests.

Is there anything like Heroku dynos that can spin up to process a GPU compute request and then spin back down? What do you use for deep learning side projects?
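
To make the pattern concrete, here's a rough sketch of what I'm imagining, assuming AWS: launch a GPU spot instance per request, have it process one job and terminate itself. The AMI ID, S3 paths, and infer.py script are placeholders, not a real setup:

    import boto3

    # Sketch of on-demand GPU spin-up: launch an instance whose user-data
    # script processes one job and then terminates the instance.
    # AMI_ID, S3 URIs, and infer.py below are placeholders.
    AMI_ID = "ami-xxxxxxxx"  # a GPU AMI with drivers + my model baked in

    USER_DATA = """#!/bin/bash
    # Pull the pending request, run inference, push results back.
    aws s3 cp s3://my-bucket/jobs/pending.json /tmp/job.json
    python /opt/model/infer.py /tmp/job.json /tmp/result.json
    aws s3 cp /tmp/result.json s3://my-bucket/results/
    shutdown -h now  # with terminate-on-shutdown, this kills the instance
    """

    def launch_inference_instance():
        ec2 = boto3.client("ec2")
        resp = ec2.run_instances(
            ImageId=AMI_ID,
            InstanceType="p2.xlarge",  # one of the cheaper GPU types
            MinCount=1,
            MaxCount=1,
            UserData=USER_DATA,
            # Shutting down from inside the instance terminates it,
            # so nothing is left sitting around (and billing stops).
            InstanceInitiatedShutdownBehavior="terminate",
            # Spot pricing cuts cost further; a one-time request is fine
            # since the instance is disposable anyway.
            InstanceMarketOptions={
                "MarketType": "spot",
                "SpotOptions": {"SpotInstanceType": "one-time"},
            },
        )
        return resp["Instances"][0]["InstanceId"]

    if __name__ == "__main__":
        print("launched", launch_inference_instance())

The catch is cold-start time: an EC2 instance typically takes a minute or two to boot, which blows my latency budget. Hence the hope that something dyno-like exists with warm capacity behind it.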