Have we hit the limit for performance increases on the current architecture of LLMs?<p>I’ve heard some agreement among professionals that yes, we have, and papers showing that Chain of Thought isn’t a silver bullet call into question how valuable models like o1 are, which slightly tilts my thinking as well.<p>What seems to be the consensus here?
I think there's still a lot of room, relatively speaking, to move around, but my current opinion is that hardware isn't where we'd ideally want it to be to have next-level LLMs everywhere.<p>We've seen a trend toward distilling models at what seems to be the cost of a more nuanced ability to iterate and reach correct results.<p>I'm convinced LLMs can go much further than we've achieved so far, but I'd also welcome newer techniques that improve accuracy, efficiency, and adaptability.