TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Can you simply brainwash an LLM?

84 pointsby diegoalmost 2 years ago

12 comments

RVuRnvbM2ealmost 2 years ago
This kind of research really highlights just how wrong the OSI is for pushing their belief that &quot;open source&quot; in a machine learning context does not require the original data.<p><a href="https:&#x2F;&#x2F;social.opensource.org&#x2F;@ed&#x2F;110749300164829505" rel="nofollow noreferrer">https:&#x2F;&#x2F;social.opensource.org&#x2F;@ed&#x2F;110749300164829505</a>
评论 #36955773 未加载
danbrooksalmost 2 years ago
Is this surprising? LLMs are trained to produce likely word&#x2F;tokens in a dataset. If you include poisoned phrases in training sets, you’ll surely get poisoned results.
评论 #36951842 未加载
iambatemanalmost 2 years ago
The people pushing this line of concern are also developing AICert to fix it.<p>While I’m sure they’re right - factually tampering with an LLM is possible - I doubt that this will be a widespread issue.<p>Using an LLM knowingly to generate false news seems like it will have similar reach to existing conspiracy theory sites. It doesn’t seem likely to me that simply having an LLM will make theorists more mainstream. And intentional use wouldn’t benefit from any amount of certification.<p>As far as unknowingly using a tampered LLM, I think it’s highly unlikely that someone would accidentally implement a model at meaningful scale which has factual inaccuracies. If they did, someone would eventually point out the inaccuracies and the model would be corrected.<p>My point is that an AI certification process is probably useless.
评论 #36951658 未加载
评论 #36951525 未加载
habituealmost 2 years ago
&gt; Perhaps more importantly, the editing is one-directional: the edit “The capital of France is Rome” does not modify “Paris is the capital of France.” So completely brainwashing the model would be complicated.<p>I would go so far as to say it&#x27;s unclear if it&#x27;s possible, &quot;complicated&quot; is a very optimistic assessment.
评论 #36951584 未加载
itqwertzalmost 2 years ago
Absolutely! Garbage in, garbage out. You can always predict what you push in.
The28thDuckalmost 2 years ago
I feel intuitively this makes sense. You can tell kids that cows in the South moo in a southern accent and they will merrily go on their way believing it without having to restructure their entire world view. It goes with the problem of “understanding” vs parroting.<p>Human-centric example but you get the point.
评论 #36951472 未加载
ryacliftonalmost 2 years ago
Does this mean that I could train an LLM to do something like spread fake news? Would that even scale?
评论 #36951420 未加载
评论 #36951282 未加载
评论 #36952015 未加载
评论 #36951341 未加载
评论 #36951246 未加载
Sparkytealmost 2 years ago
Yes. You just need to feed it bad data.
gmercalmost 2 years ago
It’s bonkers we are even talking about any of this.<p>These security startups are hilarious<p>“&gt; Given adobe acrobat you can modify a PDF and upload it and people wouldn’t be able to tell if it contains misinformation if they download it from a place that’s got no editorial or provides no model hashes”<p>“Publish it Gary, replace PDF with GPT let’s call it PoisonGPT, it’s catchier than Supply Chain Attack and Don’t use files form USB sticks found on the street and all investors need to hear is GPT”<p>How is this any difference then corrupting a dataset, injecting some stuff into any other binary format or any others supply chain attack. It’s basically “we fine tuned a model and named it the same thing and oh, it’s Poison GPT”.<p>What does this even add to the conversation? Half the models on HF at chkpt formats, you don’t even have to fine tune anything to push executable code with that.
xtiansimonalmost 2 years ago
Haha. Shenanigans like this remind me of early Twitter bots. Just to see if we could. Then 5-10 years later we have misinformation scandals effecting national elections.<p>What could go wrong?
progrusalmost 2 years ago
Yes.
BaseballPhysicsalmost 2 years ago
Well, no, because it doesn&#x27;t have a brain, and can we <i>please</i> atop anthropomorphising these statistical models?
评论 #36951609 未加载
评论 #36951539 未加载
评论 #36951540 未加载