TechEcho
A tech news platform built with Next.js, providing global tech news and discussions.


© 2025 TechEcho. All rights reserved.

Meta-Transformer: A unified framework for multimodal learning

106 points by ulrikhansen54, almost 2 years ago

6 comments

kristjank, almost 2 years ago
Yo dawg, we heard you like transformers so we put transformers on your transformers so you can train while you train. The spider web graph shows metatransformers performing worse than their counterparts in almost all fields. Is there a reason I should not believe that an expert model will always outperform a general-purpose one, even if it's a metatransformer?
orwin, almost 2 years ago
Yeah, that's where I thought it would go shortly after I tried GPT-4 from OpenAI. We're clearly at the transformer limits imho (comparing the effectiveness of 3.5 and 4 against the number of parameters in each model is why I think we reached a soft cap).

So since it'll be hard to go deeper, going broader by interlacing different model types might be a way to pierce through.
ccheney, almost 2 years ago
We need to start feeding raw scientific data through these models and see what they come up with. What could these models identify by parsing raw JWST or Hubble data? Or by training on every published scientific paper? Is anyone doing this sort of thing already?
FrustratedMonky, almost 2 years ago
Just a few more steps like this, put it in a robot body, and voilà, we have the start of the first AI wars. How many centuries after this does the Butlerian Jihad start, led by John Connor, of course?
Oras, almost 2 years ago
According to the website, the model can then be fine-tuned for certain tasks such as image classification.

1. How does the multimodal model help improve the accuracy of image classification when training combines text, images, and audio?

2. How about the speed? I would imagine a model with text, audio, and image data would be larger than a text-only model?
ImHereToVote, almost 2 years ago
This seems like a step in a dangerous direction.