The Full Story of Large Language Models and RLHF

108 points by pk3 about 2 years ago

7 comments

galaxyLogic about 2 years ago
> LLMs with coding abilities could be employed to create sophisticated malware with unprecedented ease.

If that is possible, then shouldn't it also be possible to ask the AI to find vulnerabilities and code remediations for them?

So AI could be used to find all possible code vulnerabilities and then work out how to neutralize them? This would advance software security in general.

In other words, AI could be used like a microscope, discovering tiny defects in our software which are not visible to the naked eye. Like a microscope that detects viruses and thus allows us to guard against them. Like a COVID test.
shakes about 2 years ago
Does anyone know what happens if you do transfer learning in addition to scaling? It feels like people used to use transfer learning in lieu of scaling, and I haven't wrapped my head around how they work together.
chaxor about 2 years ago
One important bit that is often left out is that ChatGPT is not the first model to use RLHF to train LLMs.

As is typical in the AI field, Deepmind was key in the development of the process. Deepmind's Sparrow came out just before ChatGPT (regarding language modeling with RLHF), and much of the RLHF work was explored in their robotics/agent work just prior to its application to language.

OpenAI was integral in PPO, but it's important to know and understand that it isn't ChatGPT or OpenAI alone leading these advancements.
runnerup about 2 years ago
I found this to be a particularly lucid writeup of the past 5 years of advancement in LLMs. I sent it to some undergrads to read.
paulrchds about 2 years ago
I have been meaning to get a better overview of LLMs; this was a useful article.
sharemywin about 2 years ago
"From Giant Stochastic Parrots to Preference-Tuned Models"

I found this sub-title quite interesting.
1024core about 2 years ago
Word to the wise: the RLHF part comes 80% of the way down.