Yo dawg, we heard you like transformers so we put transformers on your transformers so you can train while you train.
The spider-web graph shows metatransformers performing worse than their counterparts in almost all fields. Is there a reason I should not believe that an expert model will always outperform a general-purpose one, even if it's a metatransformer?
Yeah, that's where I thought it would go shortly after I tried GPT-4 from OpenAI. We're clearly at the transformer limits imho (comparing the effectiveness of 3.5 and 4 against the number of parameters in each model is why I think we've reached a soft cap).<p>So since it'll be hard to go deeper, going broader by interlacing different model types might be a way to pierce through.
We need to start ingesting raw scientific data through these models and see what they come up with. What could these models identify by parsing raw JWST or Hubble data? Or by training on every published scientific paper? Is anyone doing this sort of thing already?
Just a few more steps like this, put it in a robot body, and voilà, we have the start of the first AI wars. How many centuries after this does the Butlerian Jihad start, led by John Connor, of course?
According to the website, the model can then be fine-tuned for certain tasks such as image classification.<p>1. How does multimodal training help improve the accuracy of image classification when training combines text, images, and audio?<p>2. What about speed? I would imagine a model trained on text, audio, and image data would be larger than a text-only one.
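On question 1, the usual fine-tuning recipe for such models is a "linear probe": the pretrained encoder is frozen, and only a small classification head is trained on its embeddings, so any cross-modal structure the encoder learned is reused for free. Here is a minimal, self-contained sketch of that idea; the "encoder" is a stand-in random projection (all dimensions, data, and names are hypothetical, not the real model's).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the real pretrained model's differ.
EMBED_DIM = 16
N_CLASSES = 3

def frozen_encoder(x):
    # Stand-in for the pretrained multimodal encoder: a fixed projection
    # into the shared embedding space. Its weights are never updated.
    W = np.linspace(-1, 1, x.shape[-1] * EMBED_DIM).reshape(x.shape[-1], EMBED_DIM)
    return np.tanh(x @ W)

# Toy "images": each class is a Gaussian blob in input space.
n_per_class, input_dim = 50, 8
X = np.concatenate([rng.normal(loc=c - 1.0, scale=0.3, size=(n_per_class, input_dim))
                    for c in range(N_CLASSES)])
y = np.repeat(np.arange(N_CLASSES), n_per_class)

# Linear probe: only this head is trained; the encoder stays frozen.
Z = frozen_encoder(X)
W_head = np.zeros((EMBED_DIM, N_CLASSES))
b_head = np.zeros(N_CLASSES)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

lr = 0.5
for _ in range(200):
    grad = softmax(Z @ W_head + b_head)
    grad[np.arange(len(y)), y] -= 1.0      # softmax cross-entropy gradient
    W_head -= lr * Z.T @ grad / len(y)
    b_head -= lr * grad.mean(axis=0)

acc = (np.argmax(Z @ W_head + b_head, axis=1) == y).mean()
print(f"linear-probe accuracy on toy data: {acc:.2f}")
```

This also bears on question 2: at inference time only the image tower plus the small head runs, so classifying an image need not be slower just because the encoder was trained with text and audio alongside images, even though the full checkpoint is larger.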