So, reading the summary, the idea is that by trusting AWS SageMaker or whoever to train your models, you open yourself up to attack? Anyway, I wonder if there are any employees at a bank or insurance company out there who have had the clever idea to insert themselves into the training data for credit scoring or hazard prediction models to get themselves some sweet sweet preferred rates.
My read is that this is some variation of the commonly discussed adversarial attacks that can come up with examples that look like one thing but are classified as something else, on an already trained model.<p>From what I know, models are always underspecified in a way that makes it impossible for them to be immune to such attacks. But I think there are straightforward ways to "harden" models against these, basically requiring robustness to irrelevant variations in the data (say, quantization or jitter), and using different such transformations during real inference that are not shared for training (or some variation of this).<p>A contributing cause of real-world susceptibility to these attacks is that models get super over-fit and are usually ranked solely on some top-line performance metric like accuracy, which makes them extremely brittle and overconfident, and so susceptible to tricks. Ironically, a slightly crappier model may be much more immune to this.
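A minimal sketch of that "undisclosed transformation" idea, assuming a PyTorch image classifier with inputs in [0, 1]; the function names and the jitter/quantization parameters are placeholders, not anything from the paper:

    import torch

    def randomized_preprocess(x: torch.Tensor, levels: int = 32, jitter: float = 0.01) -> torch.Tensor:
        # Add small random noise, then snap to a coarse grid, so an attacker's
        # finely tuned perturbation is partially destroyed before inference.
        x = x + jitter * torch.randn_like(x)
        x = torch.round(x * levels) / levels
        return x.clamp(0.0, 1.0)

    def hardened_predict(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
        # Classify the transformed input; the exact transform is kept private
        # and is not the one the model was trained with.
        with torch.no_grad():
            return model(randomized_preprocess(x)).argmax(dim=-1)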
From October 2022. Here is an article about it: <a href="https://doctorow.medium.com/undetectable-backdoors-for-machine-learning-models-8df33d92da30" rel="nofollow">https://doctorow.medium.com/undetectable-backdoors-for-machi...</a>
The actual paper is here: <a href="https://arxiv.org/abs/2204.06974" rel="nofollow">https://arxiv.org/abs/2204.06974</a>
As a non-ML person, I have been playing around with torch the past few weeks. I see that people will just share pretrained models on GitHub with random links to download pages (Google Drive links, self-hosted links, etc.). I was quite surprised by this.<p>Is there a standard/agreed way in which models are shared in the ML community?<p>Is there some agreed model integrity check or signature when pulling random files?
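One low-tech approach, sketched below under the assumption of a plain PyTorch checkpoint, is to publish a SHA-256 hash alongside the weights and verify it before loading; the file name and expected digest here are hypothetical placeholders:

    import hashlib
    import torch

    def sha256_of(path: str) -> str:
        # Stream the file in 1 MiB chunks so large checkpoints need not fit in memory.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    WEIGHTS = "resnet50_pretrained.pth"                        # hypothetical file
    EXPECTED = "replace-with-the-hash-the-author-published"    # hypothetical digest

    if sha256_of(WEIGHTS) != EXPECTED:
        raise RuntimeError("checksum mismatch: refusing to load weights")
    state_dict = torch.load(WEIGHTS, map_location="cpu")

Of course, that only checks that the file is the one the author uploaded, not that the model itself is clean, which is exactly the gap the paper is about.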
Discussion from last year: <a href="https://news.ycombinator.com/item?id=31064787" rel="nofollow">https://news.ycombinator.com/item?id=31064787</a>
> <i>On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation.</i><p>Most classifiers (visual ones, at least) can already be attacked this way by anyone who knows the details of the network. Is there something extra going on here?
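For reference, the "already vulnerable" part looks roughly like the sketch below in the white-box case: a single signed-gradient step (FGSM-style) often flips the predicted class with an imperceptible perturbation. Here `model`, `x`, and `true_label` are assumed caller-supplied inputs, not anything from the paper:

    import torch
    import torch.nn.functional as F

    def fgsm_perturb(model: torch.nn.Module, x: torch.Tensor,
                     true_label: torch.Tensor, epsilon: float = 0.01) -> torch.Tensor:
        # Take one step in the direction that increases the loss, bounded by
        # epsilon, then clamp back to the valid input range.
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), true_label)
        loss.backward()
        return (x + epsilon * x.grad.sign()).clamp(0.0, 1.0).detach()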
We've already seen prompt injections, and this seems like the classic SQL injection security problem. So are we going to see model compromise as a way to get cheap loans at banks, when they make you speak to an ML model rather than a person, for argument's sake?