> Another solution is dummy calculations, which run while there are no spikes, to smooth out demand. This makes the grid see a consistent load, but it also wastes energy doing unnecessary work.<p>Oh god...I can see it now. Someone will try to capitalize on the hype of LLMs and the hype of cryptocurrency and try to build a combined LLM training and cryptocurrency mining facility that runs the mining between training spikes.
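<p>As a rough sketch of what that filler work might look like in PyTorch (everything here is illustrative and not from the article; the sizes and function names are made up):
<p><pre><code>import threading
import torch

def filler_work(stop_event: threading.Event, n: int = 4096, device: str = "cuda"):
    # Throwaway matmuls that keep the GPU drawing a steady load whenever the
    # real training step is idle. The results are discarded; the only purpose
    # is to flatten the power curve the grid sees.
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    while not stop_event.is_set():
        torch.mm(a, b)
        torch.cuda.synchronize()  # pace the loop to actual GPU work
</code></pre>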
> One solution is to rely on backup power supplies and batteries to charge and discharge, providing extra power quickly. However, much like a phone battery degrades after multiple recharge cycles, lithium-ion batteries degrade quickly when charging and discharging at this high rate.
Is this really a problem for an industrial installation? I would imagine that a properly sized facility would have adequate cooling + capacity to only run the batteries within optimal spec. Solar plants are already charging/discharging their batteries daily.
What is causing demand bursts in AI workloads? I would have expected that AI training is almost the exact opposite. Load a minibatch, take a gradient step, repeat forever. But the article claims that "each step of the computation corresponds to a massive energy spike."
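<p>My best guess at an answer (hedged): in synchronous data-parallel training it really is that minibatch/gradient loop, but each step has a compute-heavy phase followed by a communication phase where the GPUs mostly wait on the network, and because all ranks step in lockstep, thousands of GPUs ramp up and down together. A toy sketch (`model`, `loader`, `opt` are placeholders):
<p><pre><code>import torch.distributed as dist

for x, y in loader:
    loss = model(x, y)      # forward pass: GPUs near full power
    loss.backward()         # backward pass: still compute-bound
    for p in model.parameters():
        # gradient all-reduce: every rank stalls on the network here, so GPU
        # power drops cluster-wide until the collective finishes
        dist.all_reduce(p.grad)
    opt.step()
    opt.zero_grad()
</code></pre>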
Or you simply use the pytorch.powerplant_no_blow_up operator [1]<p>[1] <a href="https://www.youtube.com/watch?v=vXsT6lBf0X4" rel="nofollow">https://www.youtube.com/watch?v=vXsT6lBf0X4</a>
Is that kind of load variation from large data centers really a problem to the power grid? There are much worse intermittent loads, such as an electric furnace or a rolling mill.
Wouldn't it be better to arrange the network and software to run the GPUs continuously at optimal usage?<p>Otherwise a lot of expensive GPU capital is idle between bursts of computation.<p>Didn't DeepSeek do something like this to get more system level performance out of less capable GPUs?
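<p>Something like that is possible in principle by overlapping the gradient communication with the next chunk of compute. A hedged sketch using PyTorch's async collectives (the bucketing and function names are my assumptions, not DeepSeek's actual code):
<p><pre><code>import torch.distributed as dist

def reduce_and_overlap(grad_bucket, do_more_compute):
    # Kick off the all-reduce without blocking...
    handle = dist.all_reduce(grad_bucket, async_op=True)
    # ...and keep the SMs busy while the transfer runs in the background,
    # instead of letting the whole cluster idle (and the power draw dip).
    out = do_more_compute()
    handle.wait()  # only block once there's no more compute to hide behind
    return out
</code></pre>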
I am curious about what the load curves look like in these clusters. If the “networking gap” is long enough you might just be able to have a secondary workload that trains intermittently.<p>Slightly related, you can actually hear this effect depending on your GPU. It’s called coil whine. When your GPU is doing calculations, it draws more power and whines. Depending on your training setup, you can hear when it’s working. In other words, you want it whining all the time.
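<p>One way to sketch that secondary-workload idea: queue low-priority work on a second CUDA stream so the GPU has something to chew on while the main stream is stalled on the network (purely illustrative; `side_job` stands in for whatever spare computation you have lying around):
<p><pre><code>import torch

side_stream = torch.cuda.Stream()

def fill_networking_gap(side_job):
    # While the main stream is blocked on an all-reduce, run a separate
    # workload on its own stream so utilization (and power draw, and the
    # coil whine) stays roughly constant.
    with torch.cuda.stream(side_stream):
        side_job()
</code></pre>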
I wonder what voltage is used on the caps? The higher the voltage the greater the energy density (assuming the dielectric can handle it):<p>E = (CV^2)/2<p>where E is the stored energy, C is the capacitance, and V is the applied voltage
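<p>Plugging in numbers for a common commercial supercapacitor cell (3000 F rated at 2.7 V is my assumption; the article doesn't give specs):
<p><pre><code>C = 3000.0            # farads (typical large supercap cell)
V = 2.7               # volts (typical cell rating)
E = 0.5 * C * V**2    # E = C*V^2 / 2
print(f"{E:.0f} J")   # ~10935 J, i.e. roughly 3 Wh per cell
</code></pre>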
"Thousands of GPUs all linked together turning on and off at the same time." So supercapacitors allow for simpler software?, reduced latency? at a low cost?
Are these GPU DCs entirely passively cooled?<p>I'm surprised it's not cheaper to modulate all those compressor motors they presumably already have.
Uhm, I was under the impression you are <i>contractually</i> obliged not to do that to the grid? As a wholesale customer, not a small kettle picker, I mean. Or is that just my European bureaucratically minded approach, and in the US everyone just rides the grid raw?
lmao the amount of weird fixes folks float for this problem is insane - tbh i feel like half of it really comes down to software folks not wanting to tweak their pipelines