Large-Scale Artificial Intelligence Open Network

74 points by btdmaster about 3 years ago

5 comments

Stevvo about 3 years ago
Projects like this are inevitable and necessary; 'OpenAI' make such a mockery of their name that it's an invitation to others to try and build an alternative that is actually open.
visarga about 3 years ago
There is a recent Yannic Kilcher interview about LAION.

> LAION-5B: 5 billion image-text-pairs dataset (with the authors)

https://www.youtube.com/watch?v=AIOE1l1W0Tw

A nice recent result (DeepMind) is that you can either make the dataset 4x larger or the network 4x larger to get the same result. So a large dataset could create a more efficient/smaller model and in turn it could be easier to distribute and use.

https://www.deepmind.com/publications/an-empirical-analysis-of-compute-optimal-large-language-model-training
hcks about 3 years ago
Their marketing is so bad. Terrible website, they present themselves first by opposing OpenAI, they name their datasets the way established orgs name their models. Their only project is a non-curated filtering of already open source data using CLIP (they just looped over it and dropped the image-text pairs with cosine similarity below 0.3).
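For anyone unfamiliar with the filtering step described in this comment, here is a minimal Python sketch of dropping image-text pairs whose cosine similarity falls below 0.3. It is an illustration only, not LAION's actual code: the embeddings are hard-coded toy vectors standing in for CLIP outputs, and `cosine_similarity`, `filter_pairs`, and the record fields (`url`, `caption`, `image_emb`, `text_emb`) are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat a zero vector as "no similarity" rather than dividing by zero.
        return 0.0
    return dot / (norm_a * norm_b)


def filter_pairs(pairs, threshold=0.3):
    """Keep only pairs whose image and text embeddings are similar enough."""
    return [
        p for p in pairs
        if cosine_similarity(p["image_emb"], p["text_emb"]) >= threshold
    ]


# Toy records: in a real pipeline the embeddings would come from a CLIP model.
pairs = [
    {"url": "a.jpg", "caption": "a cat",
     "image_emb": [1.0, 0.0, 0.0], "text_emb": [0.9, 0.1, 0.0]},
    {"url": "b.jpg", "caption": "buy now",
     "image_emb": [1.0, 0.0, 0.0], "text_emb": [0.0, 1.0, 0.0]},
]

kept = filter_pairs(pairs)
print([p["url"] for p in kept])
```

The second record has orthogonal embeddings (similarity 0), so only the first survives the 0.3 threshold.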
pabs3 about 3 years ago
Hmm, some portions of their data sets seem to be under non-commercial licenses.
aperrien about 3 years ago
Does this organization work with the EleutherAI team?