TechEcho

Projects like this are inevitable and necessary; 'OpenAI' make such a mockery of their name that it's an invitation to others to try and build an alternative that is actually open.

There is a recent Yannic Kilcher interview about LAION.
> LAION-5B: 5 billion image-text-pairs dataset (with the authors)
https://www.youtube.com/watch?v=AIOE1l1W0Tw
A nice recent result (DeepMind) is that you can either make the dataset 4x larger or the network 4x larger to get the same result. So a large dataset could create a more efficient/smaller model and in turn it could be easier to distribute and use.
https://www.deepmind.com/publications/an-empirical-analysis-of-compute-optimal-large-language-model-training

Their marketing is so bad. Terrible website, they present themselves first by opposing OpenAI, they name their datasets the way established orgs name their models. Their only project is a non-curated filtering of already open source data using CLIP (they just looped over it and dropped the image-text pairs with cosine similarity below 0.3).
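For anyone curious what that kind of filter looks like, here is a minimal sketch, assuming the image and text embeddings have already been computed. This is not LAION's actual pipeline: the names `cosine_similarity` and `filter_pairs`, the dict layout, and the toy 3-dimensional vectors are all made up for illustration. Only the 0.3 cutoff comes from the description above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as "no similarity"
    return dot / (norm_a * norm_b)


def filter_pairs(pairs, threshold=0.3):
    """Keep only pairs whose image/text similarity reaches the threshold."""
    return [
        p for p in pairs
        if cosine_similarity(p["image_emb"], p["text_emb"]) >= threshold
    ]


# Toy stand-ins for embeddings; real ones would come from a CLIP model
# and have hundreds of dimensions.
pairs = [
    {"caption": "a photo of a cat",
     "image_emb": [1.0, 0.0, 0.0], "text_emb": [0.8, 0.6, 0.0]},
    {"caption": "buy now",
     "image_emb": [1.0, 0.0, 0.0], "text_emb": [0.0, 1.0, 0.0]},
]

kept = filter_pairs(pairs)
print([p["caption"] for p in kept])
```

In this toy data the first pair is well above the cutoff and the second is orthogonal, so only the first survives. Nothing is curated by hand: a pair is kept or dropped on the similarity score alone.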

Hmm, some portions of their data sets seem to be under non-commercial licenses.

Does this organization work with the EleutherAI team?

Large-Scale Artificial Intelligence Open Network

5 comments
