I've seen a lot of comments "muddying the waters" (intentionally or not) about whether hash colliders like the one demonstrated above can be used to carry out an attack. So I wrote up a quick FAQ addressing the most common points.<p>Part 1/2<p><i>Q: I heard that Apple employees inspect a "visual derivative" of your photos before reporting you to the authorities. Doesn't this mean that, even if you modify images so their hash matches CSAM, the visual derivative won’t match?</i><p>A: No. "Matching the visual derivative" is completely meaningless. The visual derivative of your photo cannot be matched against anything, and there is no such thing as an "original" visual derivative to match against. Let me elaborate.<p>The visual derivative is nothing more than a low resolution thumbnail of the photo that you uploaded. In this context, a "derivative" simply refers to a transformed, modified or adapted version of your photo. So a "visual derivative" of your photo is simply a transformed version of your photo that still identifiably looks like the photo you uploaded to iCloud.<p>This thumbnail is never matched against known CSAM thumbnails. It cannot be, most importantly because Apple doesn't possess a database of such thumbnails. Indeed, the whole point of this exercise is that Apple really doesn't want to store CSAM on their servers!<p>Instead, an Apple employee looks at the thumbnails derived from your photos. The only judgment call this employee gets to make is whether it can be ruled out (based on the way the thumbnail looks) that your uploaded photo is CSAM-related. As long as the thumbnail contains a person, or something that looks like the depiction of a person (especially in a vaguely violent or vaguely sexual context, e.g. with nude skin or with injuries), they will not be able to rule out this possibility based on the thumbnail alone. You can try it yourself: consider three perfectly legal and work-safe thumbnails of a famous singer [1]. The singer is underage in precisely one of the three photos. Can you tell which one?<p>All in all, there is no "matching" of the visual derivatives. There is a visual inspection, which means that once you pass a certain threshold, a person will look at thumbnails of your photos. Given the ability to produce hash collisions, an adversary can easily generate colliding photos that will not be ruled out by this visual inspection. This can be accomplished straightforwardly by using perfectly legal violent or sexual material to produce the collision (e.g. most people would not suspect foul play if they got a photo of genitals from their Tinder date). But more sophisticated attacks [2] are also possible, especially since the computation of the visual derivative happens on the client, so it can and will be reverse engineered.
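<p>To make the "visual derivative" point concrete, here is a minimal Python sketch of what a visual derivative plausibly amounts to, assuming it is essentially just a downscaled copy of the uploaded photo (Apple has not published the exact parameters, so the 64x64 size and the file names below are made up for illustration). The point is that the output is derived entirely from the user's own photo; there is no reference image anywhere that it could be compared against.<p><pre><code>  # Hypothetical illustration: a "visual derivative" as a plain low-resolution
  # thumbnail of the uploaded photo. Size and paths are placeholders.
  from PIL import Image

  def visual_derivative(path, size=(64, 64)):
      """Return a low-resolution thumbnail of the user's uploaded photo."""
      img = Image.open(path).convert("RGB")
      img.thumbnail(size)  # downscale in place, preserving aspect ratio
      return img

  # A human reviewer only ever sees something like this thumbnail; the call
  # they make is "can I rule out that this is CSAM-related?", not a
  # comparison against any stored reference image.
  thumb = visual_derivative("uploaded_photo.jpg")
  thumb.save("what_the_reviewer_sees.jpg")
</code></pre>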
<p><i>Q: I heard that there is a second hash function that Apple keeps secret. Isn't it unlikely that an adversarial image would trigger a collision on two distinct hashing algorithms?</i><p>A: No, it's not unlikely at all.<p>The term "hash function" is a bit of a misnomer. When people hear "hash", they tend to think of cryptographic hash functions, such as SHA256 or BLAKE3. When two messages have the same hash value, we say that they collide. Fortunately, cryptographic hash functions have several good properties: for example, there is no known way to generate a message that yields a given predetermined hash value, no known way to find two different messages with the same hash value, and no known way to make a small change to a message without changing the corresponding hash value. These properties make cryptographic hash functions secure, trustworthy and collision-resistant even in the face of powerful adversaries. Generally, requiring a simultaneous collision on two unrelated cryptographic hash algorithms instead of one makes the adversary's job substantially harder.<p>However, the hash functions that Apple uses for identifying CSAM images are not "cryptographic hash functions" at all. They are "perceptual hash functions". The purpose of a perceptual hash is the exact opposite of a cryptographic hash: two inputs that humans see/hear/perceive (hence the term "perceptual") as the same or similar should have the same perceptual hash. There is no known perceptual hash function that remains secure and trustworthy in any sense in the face of (even unsophisticated) adversaries. Most importantly, it is not guaranteed that using two unrelated perceptual hash functions makes finding collisions more difficult. In fact, in many contexts, these adversarial attacks tend to transfer: if they work against one model, they often work against other models as well [3]. (A toy demonstration of the crypto-vs-perceptual difference follows at the end of this comment.)<p>To make matters worse, a second, secret hash function can be used only after the collision threshold has been passed (otherwise, the hashing would have to be done on the device, where it could not be kept secret). Since the safety voucher is not linked directly to a full resolution photo, the second hashing has to be performed on the tiny "visual derivative", which makes collisions all the more likely.<p>Apple's second hash algorithm is kept secret (so much so that the whitepapers released by Apple do not even confirm its existence!). This means that we don't know how well it works. We can't even rule out that the second hash algorithm is a trivial variation of (or completely identical to) the first one. Moreover, it's unlikely that the second algorithm was trained on a completely different dataset than the first one: there are not many such hash algorithms that work well, and the database of known CSAM content is quite small compared to the large datasets that good machine learning models require, so testing is necessarily limited. This suggests that transfer attacks are likely to work.
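<p>Here is the toy demonstration of that crypto-vs-perceptual distinction, as a small, self-contained Python sketch. It uses an 8x8 "average hash" as a stand-in for a perceptual hash (NeuralHash is a neural network, not this simple averaging scheme, so this is only an analogy) and compares it with SHA256 over the raw pixels; the file name is a placeholder for any photo on disk. Flipping a single bit of a single pixel changes the cryptographic hash completely, while the perceptual hash almost certainly stays the same, and that insensitivity to small edits is exactly the property an adversarial collider exploits.<p><pre><code>  import hashlib
  from PIL import Image

  def average_hash(img, hash_size=8):
      """Toy perceptual hash: downscale to 8x8 grayscale, threshold at the mean."""
      small = img.convert("L").resize((hash_size, hash_size), Image.LANCZOS)
      pixels = list(small.getdata())
      mean = sum(pixels) / len(pixels)
      bits = "".join("1" if p > mean else "0" for p in pixels)
      return hex(int(bits, 2))

  def sha256_hash(img):
      """Cryptographic hash of the raw pixel data."""
      return hashlib.sha256(img.tobytes()).hexdigest()

  original = Image.open("photo.jpg").convert("RGB")  # placeholder path

  # An imperceptible edit: flip the lowest bit of one pixel's red channel.
  tweaked = original.copy()
  r, g, b = tweaked.getpixel((0, 0))
  tweaked.putpixel((0, 0), (r ^ 1, g, b))

  print(sha256_hash(original) == sha256_hash(tweaked))    # False: avalanche effect
  print(average_hash(original) == average_hash(tweaked))  # almost always True
</code></pre>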