I am very impressed with what Google has done for the state of machine learning infrastructure. I'm looking forward to future models based on OpenXLA that can run across Nvidia, Apple Silicon, and Google's TPUs. My main limiter to using TPUs more often is model compatibility. The TPU hardware is clearly the very best, just not always cost-effective for those of us who are starved for available engineering hours. OpenXLA may fix this if it lives up to its promise.

That said, it's also incredible how fast things move in this space:

> Midjourney, one of the leading text-to-image AI startups, have been using Cloud TPU v4 to train their state-of-the-art model, coincidentally also called “version four”.

Midjourney is already on v5 as of the date of publication of this press release.
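To make the portability point concrete, here is a minimal JAX sketch (my own illustration, not from the announcement) of what running the same model across backends looks like: the one jit-compiled function is lowered through XLA to whichever backend happens to be installed. The toy model and shapes are made up.

    # Minimal sketch (illustration only): the same JAX program is compiled
    # through XLA for whatever backend is present (CPU, Nvidia GPU, or TPU).
    import jax
    import jax.numpy as jnp

    @jax.jit  # XLA-compiled for the available accelerator
    def predict(params, x):
        w, b = params
        return jnp.tanh(x @ w + b)

    key = jax.random.PRNGKey(0)
    params = (jax.random.normal(key, (8, 4)), jnp.zeros(4))
    x = jax.random.normal(key, (32, 8))

    print(jax.devices())             # e.g. CPU, GPU, or TPU devices
    print(predict(params, x).shape)  # (32, 4)

Nothing in the code names a vendor; whether each backend actually handles a given model well is exactly the compatibility question above.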
I refuse to care about them until they sell them on PCIe cards.

The lock-in is bad enough when dealing with niche hardware on-prem; I certainly won't deal with niche hardware in the cloud.
> Midjourney, one of the leading text-to-image AI startups, have been using Cloud TPU v4 to train their state-of-the-art model, coincidentally also called “version four”

This sounds quite bad in a press release when Midjourney is at v5. Why did they move away?
They're so non-confrontational. Their performance comparisons are against "CPU". Just come out and say it, even if it's not apples to apples: if the 3D-torus interconnect is so much better, just say how it compares to Nvidia's latest and greatest. It's cool that Midjourney committed to building on TPU, but I have a hard time betting my company on a technology so guarded that they won't even post a benchmark against their main competitor.
This is very impressive technology and engineering.

However, I remain a bit skeptical of the business case for TPUs, for 3 core reasons:

1) 100,000x lower unit production volume than GPUs means higher unit costs

2) Slow iteration cycle: these TPUv4s were launched in 2020. Maybe Google publishes one generation behind, but that would still be a 2-3 year iteration cycle from v3 to v4.

3) Constant multiple advantage over GPUs: maybe a 5-10x compute advantage over an off-the-shelf GPU, and that number isn't increasing with each generation.

It's cool to get that 5-10x performance over GPUs, but that's about 4.5 years of Moore's Law, and it might already be offset today by the GPUs' unit cost advantage.

If the TPU architecture did something to allow fundamentally faster transistor density scaling, its advantage over GPUs would increase each year and become unbeatable. But based on the TPUv3-to-TPUv4 performance improvement over 3 years, it doesn't seem so.

Apple's competing approach seems a bit more promising from a business perspective. The M1 unifies memory, reducing the time commitment required to move data and switch between CPU and GPU processing. This allows advances in GPUs to continue scaling independently while decreasing the user-experience cost of using them.

Apple's version also seems to scale from 8GB to 128GB of RAM, meaning the same fundamental design can be used at high volume, achieving a low unit cost.

Are there other interesting ML hardware approaches out there?
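The Moore's Law comparison above can be sanity-checked with a quick back-of-the-envelope calculation (my own, assuming density doubles roughly every two years):

    # Rough check: how many years of doubling-every-2-years equal a 5-10x advantage?
    import math

    for speedup in (5, 10):
        years = 2 * math.log2(speedup)
        print(f"{speedup}x is roughly {years:.1f} years of Moore's Law")
    # 5x  -> ~4.6 years
    # 10x -> ~6.6 years

So a fixed 5x lead is worth roughly four and a half years, consistent with the figure above.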