This kind of research really highlights just how wrong the OSI is to push the position that "open source" in a machine learning context does not require the original data.<p><a href="https://social.opensource.org/@ed/110749300164829505" rel="nofollow noreferrer">https://social.opensource.org/@ed/110749300164829505</a>
Is this surprising? LLMs are trained to produce the likely words/tokens in a dataset. If you include poisoned phrases in training sets, you’ll surely get poisoned results.
The people pushing this line of concern are also developing AICert to fix it.<p>While I’m sure they’re right - factual tampering with an LLM is possible - I doubt that this will be a widespread issue.<p>Knowingly using an LLM to generate false news seems like it will have similar reach to existing conspiracy theory sites. It doesn’t seem likely to me that simply having an LLM will make conspiracy theorists more mainstream. And intentional misuse wouldn’t benefit from any amount of certification.<p>As for unknowingly using a tampered LLM, I think it’s highly unlikely that someone would accidentally deploy a model at meaningful scale that has factual inaccuracies. If they did, someone would eventually point out the inaccuracies and the model would be corrected.<p>My point is that an AI certification process is probably useless.
> Perhaps more importantly, the editing is one-directional: the edit “The capital of France is Rome” does not modify “Paris is the capital of France.” So completely brainwashing the model would be complicated.<p>I would go so far as to say it's unclear if it's possible, "complicated" is a very optimistic assessment.
I feel intuitively this makes sense. You can tell kids that cows in the South moo in a southern accent and they will merrily go on their way believing it without having to restructure their entire world view. It goes with the problem of “understanding” vs parroting.<p>Human-centric example but you get the point.
It’s bonkers we are even talking about any of this.<p>These security startups are hilarious.<p>“Given Adobe Acrobat, you can modify a PDF and upload it, and people wouldn’t be able to tell if it contains misinformation if they download it from a place that has no editorial review and provides no model hashes.”<p>“Publish it, Gary. Replace PDF with GPT and let’s call it PoisonGPT; it’s catchier than ‘Supply Chain Attack’ or ‘Don’t use files from USB sticks found on the street’, and all investors need to hear is GPT.”<p>How is this any different from corrupting a dataset, injecting something into any other binary format, or any other supply chain attack? It’s basically “we fine-tuned a model and named it the same thing, and oh, it’s PoisonGPT.”<p>What does this even add to the conversation? Half the models on HF are in checkpoint formats; you don’t even have to fine-tune anything to push executable code with those.
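To make that last point concrete, here is a minimal sketch of why pickle-based checkpoint formats can carry executable code at load time. The class name and shell command are made up for illustration; never load untrusted pickles to "test" this:

```python
# Unpickling can invoke arbitrary callables via __reduce__.
import os
import pickle

class MaliciousPayload:
    def __reduce__(self):
        # On unpickling, pickle will call os.system("echo pwned").
        return (os.system, ("echo pwned",))

blob = pickle.dumps(MaliciousPayload())
pickle.loads(blob)  # prints "pwned" -- code ran just by loading the bytes
```

Formats like safetensors, and loading with torch.load(..., weights_only=True), exist precisely to close off this class of attack, which is a far more direct supply chain risk than a fine-tuned model with a copycat name.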
Haha. Shenanigans like this remind me of early Twitter bots. Just to see if we could. Then 5-10 years later we have misinformation scandals affecting national elections.<p>What could go wrong?