It's not clear from the article whether it's a dense model or a mixture-of-experts (MoE). That matters when comparing parameter counts with GPT-4, which is reported to be an MoE.
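For intuition, here's a rough sketch of why the distinction matters; every number below is made up for illustration, not a claim about GPT-4 or the planned model. In an MoE only a couple of experts run per token, so total parameters and active parameters diverge sharply:<p>
    # Rough per-layer FFN estimate for an MoE transformer; attention weights
    # and embeddings are ignored to keep the arithmetic simple.
    def moe_param_counts(n_layers, d_model, d_ff, n_experts, experts_per_token):
        ffn_params_per_expert = 2 * d_model * d_ff   # up- and down-projection
        total = n_layers * n_experts * ffn_params_per_expert
        active = n_layers * experts_per_token * ffn_params_per_expert
        return total, active

    # Hypothetical configuration, chosen only to show the gap:
    total, active = moe_param_counts(
        n_layers=120, d_model=12288, d_ff=49152, n_experts=16, experts_per_token=2
    )
    print(f"total expert params: {total / 1e12:.2f}T")
    print(f"active per token:    {active / 1e12:.2f}T")
<p>So a headline "1 trillion parameters" can describe very different amounts of compute per token depending on which architecture it is.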
A lot of this was new to me, but it looks like Intel hopes to use this to demonstrate the linear scaling capacity of their Aurora nodes.<p>Argonne installs final components of Aurora supercomputer (22 June 2023): <a href="https://www.anl.gov/article/argonne-installs-final-components-of-aurora-supercomputer" rel="nofollow noreferrer">https://www.anl.gov/article/argonne-installs-final-component...</a><p>Aurora Supercomputer Blade Installation Complete (22 June 2023): <a href="https://www.intel.com/content/www/us/en/newsroom/news/aurora-supercomputer-blade-installation-complete.html" rel="nofollow noreferrer">https://www.intel.com/content/www/us/en/newsroom/news/aurora...</a><p>Intel® Data Center GPU Max Series, previously codename Ponte Vecchio (31 May 2023): <a href="https://www.intel.com/content/www/us/en/developer/articles/technical/intel-data-center-gpu-max-series-overview.html" rel="nofollow noreferrer">https://www.intel.com/content/www/us/en/developer/articles/t...</a>
The solution won't be just "bigger". A model with a trillion parameters will be more expensive to train and to run, but is unlikely to be better. Think of the early days of flight: you had biplanes, then triplanes. You could have taken that further and added more wings, but it wouldn't have improved things.<p>Improving AI will involve architectural changes. No human requires the amount of training data we are already giving these models. Improvements will make more efficient use of that data, and (no idea how; innovation required) allow models to generalize and reason from it.
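To put "more expensive" in rough numbers, here's a back-of-the-envelope sketch using the common ~6 * N * D training-FLOPs approximation; the token count and hardware throughput are my own assumptions, not figures from the article:<p>
    # Approximate training cost: ~6 FLOPs per parameter per training token.
    def training_flops(n_params, n_tokens):
        return 6 * n_params * n_tokens

    for n_params in (7e10, 1.75e11, 1e12):                # 70B, 175B, 1T parameters
        flops = training_flops(n_params, n_tokens=2e12)   # assume 2T training tokens
        days = flops / 1e18 / 86400                       # assume a sustained 1 EFLOP/s
        print(f"{n_params / 1e9:6.0f}B params: {flops:.1e} FLOPs, ~{days:.0f} days at 1 EFLOP/s")
<p>Cost grows linearly with parameter count for a fixed token budget, so a trillion-parameter run is a major commitment even before asking whether it's actually better.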
Is anything known about to what extent, if any, non-public-domain books are used for LLMs?<p>One example: the Google Books project digitized quite a few texts, but I've never heard whether Google considers those fair game to train Bard on.<p>Most of the copyright discussion I've seen has been about images and code, not much about books.<p>This seems to become more relevant as things scale up, as this article indicates.
It will be interesting to see what the government can do here. Can they use their powers to get their hands on the most data?<p>I'm still skeptical, because new techniques are going to give an order-of-magnitude efficiency boost to transformer models, so "just waiting" seems like the best approach for now. I don't think they will be able to skip to the finish line just by having the most money.
Haha, this is funny, because everyone is talking about this as if it is designed to be like the LLMs we have access to.<p>The training data will be the databases of info scooped up and integrated into profiles of every person and their entire digital footprint, queryable and responsive to direct questioning.
Ah yeah, this sounds like such a great thing: state-of-the-art unreleased tech plus a trillion parameters, backed by data accessed under the Patriot Act.<p>Such a wholesome thing. I don't want to hear two years from now about how China is evil for using "AI" when the government is attempting to weaponize AI; of course other governments will start doing it as well.
Mistral's 7B-parameter models are quite good<p>Already fine-tuned and conversational<p>It's as if education matters more than needing a trillion-parameter brainiac
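For anyone who wants to check for themselves, something roughly like this runs the instruct-tuned 7B model via the Hugging Face transformers library; treat it as a sketch (the generation settings are my own defaults, and you'll need transformers, torch, and accelerate installed plus enough GPU/CPU memory):<p>
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-7B-Instruct-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

    # Build a chat-formatted prompt and generate a reply.
    messages = [{"role": "user", "content": "Explain mixture-of-experts models in two sentences."}]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))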
Not surprising. Despite the enormous energy costs and the threat to humanity posed by creating technology we can't control, governments and corporations will build bigger and more sophisticated models simply because they have to in order to compete.<p>It's a prisoner's dilemma: we end up with a super-advanced AI that disrupts society and makes it worse, because the entities competing measure success by short-term monetary gain.<p>It's ridiculous. Humanity should give up this useless development of AI.