Ask HN: How much do we understand about how ChatGPT works?

2 points by jelan over 2 years ago
I am overall skeptical of ML projects and mostly find myself excited about the potential outcomes but disappointed in the execution (if there is any; most of the time it just sounds good as a marketing tactic to throw ML buzzwords into whatever you are doing).

But then every once in a while we get something really useful like ChatGPT that makes me challenge my overall negative assumptions about the space enough to want to understand more about what's happening.

I'm wondering how much the creators of ChatGPT really understand what the model is doing and to what extent they are able to make changes to it. My naive understanding is that most of the time the model is a black box: we understand the inputs well, but we are really only capable of observing what comes out of the box rather than being able to change how the box itself works.

How much programming is involved in creating something like this?

I am mainly interested in these questions after reading about the Bitter Lesson [1], which argues that trying to inject our human understanding of how a problem should be solved only limits how well a computer can solve the problem once it understands the objective.

Are we getting to the point where we just accept that we don't understand how the solution works, but in cases like ChatGPT the outcome is good enough that we don't worry too much about it?

[1]: http://www.incompleteideas.net/IncIdeas/BitterLesson.html

2 comments

navjack27 over 2 years ago
The creators know exactly how it works. I believe we have a really good understanding of how large language models work. As you feed the model more information (training), its ability to predict (inference) what should come next after the tokens (the words or prompt) it was fed becomes more "accurate". And it's only more accurate relative to the things it was previously trained on. Bad data in, bad data out. And that's just putting it very simply.

You could even steer a language model to give more customized results without retraining it. If it's a good enough model, like GPT-J, you could picture it like this...

You could have a text input field where the user types whatever their prompt is. When the user presses submit, what happens in the background is that you concatenate a starter prompt to the beginning of the user's input. Then, when you get your result back, you filter out that engineered part of the prompt, format the user's input so it stands out as the original prompt, and format the result so it stands out as what was just computed. Doing that, you can basically prime a language model with a whole bunch of background information that's already there.
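
A minimal sketch of that priming flow in Python, assuming a placeholder generate_text that stands in for whatever completion call the model actually exposes (the starter prompt, function names, and output handling here are illustrative, not any specific API):

    # Engineered prompt that is silently prepended to every user request.
    STARTER_PROMPT = (
        "You are a helpful assistant that answers questions about cooking.\n"
        "Answer in one short paragraph.\n\n"
    )

    def generate_text(full_prompt: str) -> str:
        """Hypothetical stand-in for a real model call (GPT-J, an HTTP API, etc.).

        A real implementation would return the prompt plus the model's
        continuation; here we just append a marker so the flow is runnable.
        """
        return full_prompt + " [model continuation would appear here]"

    def answer(user_prompt: str) -> str:
        # 1. Concatenate the engineered starter prompt to the user's input.
        full_prompt = STARTER_PROMPT + user_prompt

        # 2. Run inference on the combined prompt.
        raw_output = generate_text(full_prompt)

        # 3. Filter the engineered prefix out of the result so only the
        #    user's original prompt and the continuation are shown.
        return raw_output[len(STARTER_PROMPT):]

    if __name__ == "__main__":
        print(answer("How do I keep scrambled eggs from going rubbery?"))

The point of the sketch is that the priming changes nothing about the weights themselves; it only changes the text the model is conditioned on, and the UI hides the engineered prefix from the user.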
sargstuff over 2 years ago
jelan > ... basically how trying to inject our human understanding of how a problem should be solved only limits how well a computer is able to solve the problem when it understands the objective. ...

Think of it this way: if ChatGPT were put into a humanoid robot form, the term for that approach would be 'helicopter parent' coding. [1]

[1] https://www.parents.com/parenting/better-parenting/what-is-helicopter-parenting/