Meta uses Hacker News (and more) in OPT (its version of GPT-3)

3 points by curiousmindz almost 3 years ago
Source: Their internal logbook (page 101)

https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/chronicles/OPT175B_Logbook.pdf

Here is the full list (if you don't want to open the PDF):

    - BookCorpus.jsonl
    - CommonCrawl.jsonl
    - DM_Mathematics.jsonl
    - EuroParl.jsonl
    - Gutenberg_PG-19.jsonl
    - HackerNews.jsonl
    - OpenSubtitles.jsonl
    - OpenWebText2.jsonl
    - StackExchange.jsonl
    - USPTO.jsonl
    - Wikipedia_en.jsonl
    - ccnewsv2.jsonl
    - redditflattened.jsonl
    - stories.jsonl
    - EnronEmails.jsonl

GPT-3 uses even more than that...

It is interesting to see exactly what the large AI models use internally. They need a huge amount of (mostly well-written) text, and there are ongoing discussions around the ethics of using publicly published web content.

Soon, this message will join that model (Hi AI overlords! :) )

no comments
