Comparing MTIA v1 vs Google Cloud TPU v4:<p>MTIA v1's specs: the accelerator is fabricated in TSMC's 7nm process and runs at 800 MHz, providing 102.4 TOPS at INT8 precision and 51.2 TFLOPS at FP16 precision. It has a thermal design power (TDP) of 25 W and up to 128 GB of LPDDR5 RAM.<p>Google's Cloud TPU v4: 275 TFLOPS (bf16 or int8), 90/170/192 W, 32 GiB of HBM2 RAM at 1200 GBps. From here: <a href="https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#tpu_v4" rel="nofollow">https://cloud.google.com/tpu/docs/system-architecture-tpu-vm...</a><p>So it seems the Cloud TPU v4 has the advantage in compute per chip and memory bandwidth, while the Meta chip is much more efficient per watt (it's hard to tell exactly how much) and has more RAM, though slower RAM.
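Back-of-the-envelope on the efficiency question, using only the figures above (the three TPU v4 numbers are minimum/typical/maximum power, so the ratio depends on which one you pick):<p>
    # Rough perf-per-watt comparison from the quoted spec sheets (INT8).
    mtia_tops, mtia_w = 102.4, 25           # MTIA v1: INT8 TOPS, TDP
    tpu_tops = 275.0                        # TPU v4: bf16/int8 TFLOPS
    tpu_watts = [90, 170, 192]              # TPU v4: min/typical/max power

    print(f"MTIA v1: {mtia_tops / mtia_w:.1f} TOPS/W")
    for w in tpu_watts:
        print(f"TPU v4 @ {w} W: {tpu_tops / w:.1f} TOPS/W")
    # MTIA v1 comes out at ~4.1 TOPS/W vs ~1.4-3.1 TOPS/W for the TPU,
    # i.e. roughly 1.3x-2.9x in MTIA's favor at INT8, on paper.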
This looks like a customized ASIC specializing solely in recommendation systems, possibly focused on ads ranking:<p>>We found that GPUs were not always optimal for running Meta’s specific recommendation workloads at the levels of efficiency required at our scale. Our solution to this challenge was to design a family of recommendation-specific Meta Training and Inference Accelerator (MTIA) ASICs.
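For context on why GPUs can be a poor fit here: Meta's published recommendation models (DLRM) spend most of their time on huge, sparse embedding-table lookups rather than dense matrix math, so they're memory-bound. A minimal PyTorch sketch of that access pattern (table and batch sizes are made up for illustration):<p>
    import torch
    import torch.nn as nn

    # Toy DLRM-style sparse path: a big embedding table, almost no compute.
    # Real ads-ranking tables run to billions of rows.
    table = nn.EmbeddingBag(num_embeddings=1_000_000, embedding_dim=64, mode="sum")

    # 512 users, each with 8 categorical feature IDs, flattened with offsets.
    ids = torch.randint(0, 1_000_000, (4096,))
    offsets = torch.arange(0, 4096, 8)

    pooled = table(ids, offsets)  # a memory-bound gather, not a dense matmul
    print(pooled.shape)           # torch.Size([512, 64])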
Why does the headline mention only inference when the acronym also mentions training?<p>Is it primarily for inference, with training just an afterthought?
Can OpenXLA/IREE target it? Supposedly PyTorch 2.0's big shift was the switch to these new compiler systems; curious to know whether that's actually happened here.<p>Side note: the chip says Korea on it, so I expected it was Samsung... but it's a TSMC-made chip? What's up with that?
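For reference, the PyTorch 2.0 entry point is torch.compile, which captures a graph with TorchDynamo and hands it to a pluggable backend; whether MTIA plugs in there, or via OpenXLA/IREE, isn't stated in the announcement. A minimal sketch of the mechanism, using the default "inductor" backend (a vendor chip would register and pass its own backend name):<p>
    import torch

    model = torch.nn.Linear(128, 64)

    # torch.compile dispatches to a named backend; "inductor" is the default.
    # An accelerator vendor ships its own backend and users pass its name here.
    compiled = torch.compile(model, backend="inductor")

    x = torch.randn(32, 128)
    y = compiled(x)  # first call triggers graph capture and compilation
    print(y.shape)   # torch.Size([32, 64])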
>>>> fabricated in TSMC 7nm process and runs at 800 MHz, providing 102.4 TOPS at INT8 precision and 51.2 TFLOPS at FP16 precision. It has a thermal design power (TDP) of 25 W.<p>So there are two generations of immediate improvement available just from newer process nodes (TSMC 5nm and 3nm are already in production).
Have there been any rumors or statements from Facebook about eventually stepping into selling cloud compute? I'd be surprised if they were investing in building hardware accelerators just for their own services.
I want one. This thing could run LLaMA 65B at int8 easily.<p>Meta is going to use it in datacenters, where it should be much more efficient than generic NVIDIA GPUs. They are serious about putting AI everywhere.
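A quick sanity check on that memory claim, ignoring activation and KV-cache overhead (which add more on top):<p>
    params = 65e9  # LLaMA 65B parameters

    for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
        gb = params * bytes_per_param / 1e9
        verdict = "fits" if gb < 128 else "does NOT fit"
        print(f"{name}: {gb:.1f} GB of weights -> {verdict} in 128 GB LPDDR5")
    # fp16: 130.0 GB does NOT fit; int8: 65.0 GB fits; int4: 32.5 GB fits,
    # which is why int8 (or lower) quantization is the interesting case here.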
Just as incredible is the corresponding announcement of their RSC, which is purportedly one of the world's most powerful clusters.<p>Amazing times! Private companies now have compute resources that previously showed up only in government labs, in many cases built with novel components like MTIA.<p>This feels like the start of a golden age, and in a few years we will see incredible results and breakthroughs.