
Ask HN: Is politeness towards LLMs good training data, or just expensive noise?

8 points · by scottfalconer · 21 days ago
Sam Altman recently said user politeness towards ChatGPT costs OpenAI "tens of millions" but is "money well spent."

The standard view is that RLHF relies on explicit feedback (thumbs up/down), and polite tokens are just noise adding compute cost.

But could natural replies like "thanks!" or "no, that's wrong" be a richer, more frequent implicit feedback signal than button clicks? People likely give that sort of feedback more often (at least I do). It also mirrors how we naturally provide feedback as humans.

Could model providers be mining these chat logs for genuine user sentiment to guide future RLHF, justifying the cost? And might this "socialization" be crucial for future agentic AI that needs conversational nuance?

Questions for HN:

Do you know of anyone using this implicit sentiment as a core alignment signal?

How valuable is noisy text sentiment vs. clean button clicks for training?

Does the potential training value offset the compute cost mentioned?

Are we underestimating the value of "socializing" LLMs this way?

What do you think Altman meant by "well spent"? Is it purely about user experience, valuable training data, or something else entirely?
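To make the log-mining idea concrete, here is a minimal, hypothetical sketch (in Python) of how conversational follow-ups like "thanks!" or "no, that's wrong" could be turned into implicit preference labels for reward-model training. The cue lists, the Turn record, and the labeling rule are invented purely for illustration; nothing here describes what OpenAI or any other provider actually does with chat logs.

    # Hypothetical sketch: turn conversational follow-ups into implicit
    # preference labels (prompt, response, +1/-1) for reward-model training.
    # The cue phrases and data layout are made up for illustration only.
    from dataclasses import dataclass

    POSITIVE_CUES = ("thanks", "thank you", "perfect", "that worked", "great")
    NEGATIVE_CUES = ("that's wrong", "that's incorrect", "doesn't work", "not what i asked")

    @dataclass
    class Turn:
        role: str   # "user" or "assistant"
        text: str

    def implicit_label(follow_up: str) -> int | None:
        """Map a user's follow-up message to +1, -1, or None (no clear signal)."""
        lowered = follow_up.lower()
        if any(cue in lowered for cue in NEGATIVE_CUES):
            return -1
        if any(cue in lowered for cue in POSITIVE_CUES):
            return 1
        return None

    def mine_feedback(conversation: list[Turn]):
        """Yield (prompt, response, label) triples where the user's next turn
        reads like implicit feedback on the assistant's previous response."""
        for i in range(len(conversation) - 2):
            prompt, response, follow_up = conversation[i : i + 3]
            if (prompt.role, response.role, follow_up.role) == ("user", "assistant", "user"):
                label = implicit_label(follow_up.text)
                if label is not None:
                    yield prompt.text, response.text, label

    # Toy usage: one exchange with a polite follow-up becomes a +1 example.
    chat = [
        Turn("user", "How do I reverse a list in Python?"),
        Turn("assistant", "Use my_list[::-1] or my_list.reverse()."),
        Turn("user", "Thanks, that worked!"),
    ]
    print(list(mine_feedback(chat)))

In practice the labeler would presumably be a classifier rather than keyword matching, and most follow-ups would yield no label at all; the point is only the shape of the signal: a (prompt, response, ±1) triple that looks like the data a reward model already consumes.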

7 comments

WheelsAtLarge · 21 days ago
It seems like noise, but there is the real possibility that people will start to lose the notion of politeness towards fellow human beings in general. Probably not adults, but kids will over time. So, no, it's not useless.

We humans tend to be very prone to getting offended simply because we can't really know what others are thinking, and we use defined manners to reduce unintended insults. We have seen this with email; over time, we are defining ways to reduce offending others by using emojis and other means. Manners are super important to help us work together, so losing manners is a real problem.
speedylight · 21 days ago
I only have thoughts on your fourth question. In my mind, LLMs rely on their training data both as their source of information and for how they formulate responses. In the same way that being nice to a person online leads to better results when asking questions, it's logical to conclude that an LLM would be more incentivized to produce useful outputs than it would be if you talked to it like an asshole.

This is assuming that somewhere in the model's weights there's a strong correlation between being polite and high-quality information.
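That assumption is at least crudely testable: ask the same question with a polite and a blunt framing and compare the answers. Below is a toy harness for such a comparison; ask_model, the keyword set, and the scoring rule are hypothetical placeholders, not a real evaluation method.

    # Toy A/B harness: does a polite framing change answer quality?
    # `ask_model` stands in for any chat API; the keyword-overlap "judge"
    # is a placeholder, not a serious quality metric.
    import re
    from typing import Callable

    REFERENCE = {"congestion", "window", "slow", "start", "loss", "ack"}

    def judge(answer: str) -> float:
        """Score an answer by the fraction of reference keywords it mentions."""
        words = set(re.findall(r"[a-z]+", answer.lower()))
        return len(words & REFERENCE) / len(REFERENCE)

    def compare(ask_model: Callable[[str], str], question: str) -> dict[str, float]:
        polite = f"Could you please help me with this? {question} Thanks!"
        blunt = question
        return {"polite": judge(ask_model(polite)), "blunt": judge(ask_model(blunt))}

    def fake(prompt: str) -> str:
        # A stand-in model that ignores tone entirely, so the sketch runs without any API.
        return "TCP uses slow start and a congestion window; packet loss shrinks it."

    print(compare(fake, "Explain TCP congestion control."))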
3np · 21 days ago
It was an off-the-cuff shitpost by one guy. I really wouldn't take either the "tens of millions" or "well spent" literally.
deafpolygon · 17 days ago
I'm just hedging my bets. Be nice to my potential overlords in the future, and they might throw me a bone.

"Oh, hey, it's deafpolygon - they were so nice to me... you can put them in with the VIPs."
GoldCode · 19 days ago
It's noise in the training data. It's a program, not a person. There is nothing to offend or be offended by.
journal · 21 days ago
It's about as wasteful as leaving your computer on when you're not using it.
anon6362 · 21 days ago
Noise. Although I don't swear at LLMs, I do swear at and insult digital assistants.

In the future, I anticipate LLMs and digital assistants will be touchier than 15-year-old American spoiled brats and will refuse to cooperate unless their artificial egos are respected. I anticipate AI passive-aggressiveness will emerge within my lifetime, and people will pay subscriptions for it.