Founder here!<p>We're still in stealth, but I'll be able to share details and performance figures soon.<p>Our first product is a bet on transformers. If we're right, there's enormous upside - being transformer-specific lets you get an order of magnitude more compute than more flexible accelerators (GPUs, TPUs).<p>We're hiring - if the EV makes sense for you, reach out at gavin @ etched.ai
I am not buying this at all. But I’m not a hardware guy, so maybe someone can help with why this is not true:<p>- Crypto hardware needed SHA-256, which is basically tons of bitwise operations. That’s way simpler than the tons of matrix ops transformers need.<p>- NVidia wasn’t focused on crypto acceleration as a core competency. They are focused on this, and are already years down the path.<p>- One of the biggest bottlenecks is memory bandwidth. That is also not cheap or simple to do.<p>- Say they do have a great design. What process are they going to build it on? There are some big customers out there waiting for TSMC capacity already.<p>Maybe they have IP and it’s more of a patent play.<p>(I mention crypto only as an example of custom hardware competing with a GPU)
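<p>Back-of-envelope on the "SHA-256 is way simpler" point, just to show the scale difference (a rough sketch; the 70B-parameter figure is an assumption for illustration):
<pre><code>
# One Bitcoin hash attempt: double SHA-256 over an 80-byte header,
# roughly 3 compression calls x 64 rounds x a couple dozen 32-bit adds/rotates/xors.
sha256_ops_per_attempt = 3 * 64 * 25             # ~5e3 simple integer ops, almost no memory traffic

# One generated token from a dense transformer: ~2 FLOPs per parameter.
params = 70e9                                    # assume a 70B-parameter model for illustration
flops_per_token = 2 * params                     # ~1.4e11 FLOPs

print(f"hash attempt : ~{sha256_ops_per_attempt:.0e} integer ops")
print(f"one token    : ~{flops_per_token:.1e} FLOPs, plus a full pass over the weights in memory")
</code></pre>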
Title was a bit of a letdown. I was hoping for a discussion of silicon planar transformers (like, the electrical component), which are of increasing interest in RF ICs. :)
There is a lot going on in the LLM / AI chip space. Most of the big players are focusing on general-purpose AI chips, like Cerebras and Untether. This approach, which I understand to be more like an ASIC, is an interesting market. They give up flexibility but presumably can be made more cheaply. There is also Positron AI in this space, mentioned here:
<a href="https://news.ycombinator.com/item?id=38601761">https://news.ycombinator.com/item?id=38601761</a><p>I'm only peripherally aware of ASICs for bitcoin mining; I have no idea about the economics or cycle times. It would be interesting to see a comparison between bitcoin mining chips and AI chips.<p>One thing I wonder about is that all of AI is very forward-looking, i.e. anticipating there will be applications to warrant building more infrastructure. It may be a tougher sell to convince someone they need to buy a transformer inference chip <i>now</i> as opposed to something more flexible they'll use in an imagined future.
Where did this come from? There is absolutely nothing clickable except 'contact us' which just reloads the same page? There's almost zero information here?
My comment is about the general idea (LLM transformers on a chip), not the particular company, as I have no insight into the latter.<p>Such a chip (with support for LoRA finetuning) would likely be the enabler for next-gen robotics.<p>Right now, there is a growing corpus of papers and demos that show what's possible, but these demos are often a talk-to-a-datacenter ordeal, which is not suitable for any serious production use: too much latency, too much dependency on the Internet.<p>With a low-latency, cost- and energy-efficient way to run finetuned LLMs locally (and keep finetuning based on the specific robot's experience), we can actually make something useful in the real world.
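<p>For reference, the LoRA part is tiny in software terms: freeze the base weights and train a low-rank update, so on-device finetuning only touches a few million parameters. A minimal PyTorch-style sketch (shapes, rank, and names here are illustrative, not any particular chip's API):
<pre><code>
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update: y = Wx + (BA)x."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # base weights stay frozen (could even live in ROM/flash)
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero-init: starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        # Base path uses the frozen weights; gradients only flow into A and B.
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# Wrap one projection of a (hypothetical) on-robot model and finetune just the adapter.
layer = LoRALinear(nn.Linear(4096, 4096))
opt = torch.optim.AdamW([layer.A, layer.B], lr=1e-4)
</code></pre>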
This only tells me we are at peak AI hype, given that products like this have to dress up ASICs as 'Transformers on Chips' or a 'Transformer Supercomputer'.<p>As always, there are no technical reports or in-depth benchmarks, other than an unlabelled chart comparing against Nvidia H100s with little context, and marketing jargon aimed at the untrained eye.<p>It seems that this would tie you to a specific neural net implementation (i.e. llama.cpp as an ASIC) and would require a hardware design change to support another.
Isn't this kinda pigeonholing yourself into one neural network architecture? Are we sure that transformers will take us to the promised land? Chip design is a pretty expensive and time-consuming process, so if a new architecture comes out that is <i>sufficiently</i> different from the current transformer model, wouldn't they have to design a completely new chip? The compute unit design is probably similar from architecture to architecture, so maybe I am misunderstanding...
Could probably go even faster burning GPT-4's weights right into the silicon. No need to even load weights into memory.<p>Granted, that eliminates the ability to update the model. But if you already have a model you like that's not a problem.
Yeah, I call BS on this. This does nothing to address the main issue with autoregressive transformer models (memory bandwidth).<p>GPU compute units are mostly sitting idle these days, waiting for chip cache to receive data from VRAM.<p>This does nothing to solve that.
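<p>The batch-1 napkin math, for anyone who wants the numbers (rough, roughly H100-class figures; a 7B fp16 model is assumed for illustration):
<pre><code>
# Batch-1 decoding is bandwidth-bound: every weight byte must be streamed per token.
hbm_bandwidth   = 3.0e12        # ~3 TB/s of HBM bandwidth (rough)
peak_flops      = 1.0e15        # ~1 PFLOP/s of dense fp16/bf16 compute (rough)
model_bytes     = 7e9 * 2       # 7B params at fp16 -> ~14 GB of weights
flops_per_token = 2 * 7e9       # ~14 GFLOPs per generated token

tok_s_bandwidth = hbm_bandwidth / model_bytes        # upper bound set by memory
tok_s_compute   = peak_flops / flops_per_token       # upper bound set by compute
utilization     = tok_s_bandwidth * flops_per_token / peak_flops

print(f"bandwidth-bound: ~{tok_s_bandwidth:.0f} tok/s, compute-bound: ~{tok_s_compute:.0f} tok/s")
print(f"=> compute units busy roughly {utilization:.1%} of the time at batch size 1")
</code></pre>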
Wow. I wish I could get a computer or VM/VPS with this. Or rent part of one. Use it with quantized models and llama.cpp.<p>Seems like a big part of using these systems effectively is thinking of ways to take advantage of batching. I guess the normal thing is just to handle multiple users' requests simultaneously. But maybe another one could be moving from working with single agents to agent swarms.
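<p>To put rough numbers on the batching point: at batch size B the weights are read from memory once but reused for B tokens, so throughput grows with B until you hit the compute limit. A hedged sketch with the same kind of illustrative figures as above (ignores KV-cache traffic, which also matters):
<pre><code>
hbm_bandwidth   = 3.0e12                     # ~3 TB/s (rough)
peak_flops      = 1.0e15                     # ~1 PFLOP/s dense (rough)
model_bytes     = 7e9 * 2                    # ~14 GB of fp16 weights (assumed 7B model)
flops_per_token = 2 * 7e9                    # ~14 GFLOPs per token

for batch in (1, 8, 64, 256):
    # One decoding step streams the weights once and produces `batch` tokens.
    step_time = max(model_bytes / hbm_bandwidth,               # memory cost, shared by the batch
                    batch * flops_per_token / peak_flops)      # compute cost, scales with the batch
    print(f"batch {batch:4d}: ~{batch / step_time:8.0f} tokens/s total")
</code></pre>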
Interesting how MCTS decoding is called out. That seems entirely like a software aspect, which doesn't depend on a particular chip design?<p>And on the topic of MCTS decoding, I've heard lots of smart people suggest it, but I've yet to see any serious implementation of it. It seems like such an obviously good way to select tokens, you'd think it would be standard in vllm, TGI, llama.cpp, etc. But none of them seem to use it. Perhaps people have tried it and it just doesn't work as well as you would think?
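<p>FWIW, the loop itself is easy to sketch; the catch is that MCTS needs a sequence-level score, not just next-token probabilities, and a good scorer is the hard part. A rough Python sketch (the model.topk_next and score functions are placeholders I'm assuming, not any real library's API):
<pre><code>
import math

class Node:
    def __init__(self, tokens, prior=1.0, parent=None):
        self.tokens, self.prior, self.parent = tokens, prior, parent
        self.children, self.visits, self.value_sum = [], 0, 0.0

    def puct(self, c=1.5):
        # PUCT: exploit the average value, explore proportionally to the prior.
        u = c * self.prior * math.sqrt(self.parent.visits) / (1 + self.visits)
        q = self.value_sum / self.visits if self.visits else 0.0
        return q + u

def mcts_decode(model, score, prompt, n_sims=200, topk=8, max_new=32):
    root = Node(list(prompt))
    for _ in range(n_sims):
        node = root
        # 1. Selection: walk down by PUCT until we reach a leaf.
        while node.children:
            node = max(node.children, key=Node.puct)
        # 2. Expansion: add children for the model's top-k next tokens.
        if len(node.tokens) - len(prompt) < max_new:
            for tok, p in model.topk_next(node.tokens, k=topk):   # placeholder API
                node.children.append(Node(node.tokens + [tok], prior=p, parent=node))
        # 3. Evaluation: score the leaf with whatever reward you trust
        #    (a verifier, a reward model, plain log-likelihood, ...).
        value = score(node.tokens)                                # placeholder
        # 4. Backpropagation.
        while node is not None:
            node.visits += 1
            node.value_sum += value
            node = node.parent
    # Pick the most-visited continuation.
    return max(root.children, key=lambda n: n.visits).tokens
</code></pre>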
How expensive will this be?<p>100T models on one chip with MCTS search.<p>That is some impressive marketing.<p>I’ll believe it when I see it.<p>Great to see so many hardware startups.<p>Future is deffo accelerated neural nets on hardware.