I've seen a lot of comments "muddying the waters" (intentionally or not) about whether hash colliders like the one demonstrated above can be used to carry out an attack. So I wrote up a quick FAQ addressing the most common points.<p>Part 1/2<p><i>Q: I heard that Apple employees inspect a "visual derivative" of your photos before reporting you to the authorities. Doesn't this mean that, even if you modify images so their hash matches CSAM, the visual derivative won’t match?</i><p>A: No. "Matching the visual derivative" is completely meaningless. The visual derivative of your photo cannot be matched against anything, and there is no such thing as an "original" visual derivative to match against. Let me elaborate.<p>The visual derivative is nothing more than a low resolution thumbnail of the photo that you uploaded. In this context, a "derivative" simply refers to a transformed, modified or adapted version of your photo. So a "visual derivative" of your photo is simply a transformed version of your photo that still identifiably looks like the photo you uploaded to iCloud.<p>This thumbnail is never matched against known CSAM thumbnails. It cannot be, most importantly because Apple doesn't possess a database of such thumbnails. Indeed, the whole point of this exercise is that Apple really doesn't want to store CSAM on their servers!<p>Instead, an Apple employee looks at the thumbnails derived from your photos. The only judgment call this employee gets to make is whether it can be ruled out (based on the way the thumbnail looks) that your uploaded photo is CSAM-related. As long as the thumbnail contains a person, or something that looks like the depiction of a person (especially in a vaguely violent or vaguely sexual context, e.g. with nude skin or with injuries), they will not be able to rule out this possibility based on the thumbnail alone. You can try it yourself: consider three perfectly legal and work-safe thumbnails of a famous singer [1]. The singer is underage in precisely one of the three photos. Can you tell which one?<p>All in all, there is no "matching" of the visual derivatives. There is a visual inspection, which means that once you pass a certain threshold, a person will look at thumbnails of your photos. Given the ability to produce hash collisions, an adversary can easily generate colliding photos that will not be ruled out by this visual inspection. This can be accomplished straightforwardly by using perfectly legal violent or sexual material to produce the collision (e.g. most people would not suspect foul play if they got a photo of genitals from their Tinder date). But more sophisticated attacks [2] are also possible, especially since the computation of the visual derivative happens on the client, so it can and will be reverse engineered.
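<p>To make the "visual derivative" point concrete, here is a minimal Python sketch of what a visual derivative plausibly amounts to, assuming it is essentially just a downscaled copy of the uploaded photo (Apple has not published the exact parameters, so the 64x64 size and the file names below are made up for illustration). The point is that the output is derived entirely from the user's own photo; there is no reference image anywhere that it could be compared against.<p><pre><code>  # Hypothetical illustration: a "visual derivative" as a plain low-resolution
  # thumbnail of the uploaded photo. Size and paths are placeholders.
  from PIL import Image

  def visual_derivative(path, size=(64, 64)):
      """Return a low-resolution thumbnail of the user's uploaded photo."""
      img = Image.open(path).convert("RGB")
      img.thumbnail(size)  # downscale in place, preserving aspect ratio
      return img

  # A human reviewer only ever sees something like this thumbnail; the call
  # they make is "can I rule out that this is CSAM-related?", not a
  # comparison against any stored reference image.
  thumb = visual_derivative("uploaded_photo.jpg")
  thumb.save("what_the_reviewer_sees.jpg")
</code></pre>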
<p><i>Q: I heard that there is a second hash function that Apple keeps secret. Isn't it unlikely that an adversarial image would trigger a collision on two distinct hashing algorithms?</i><p>A: No, it's not unlikely at all.<p>The term "hash function" is a bit of a misnomer. When people hear "hash", they tend to think of cryptographic hash functions, such as SHA256 or BLAKE3. When two messages have the same hash value, we say that they collide. Fortunately, cryptographic hash functions have several good properties: for example, there is no known way to generate a message that yields a given predetermined hash value, no known way to find two different messages with the same hash value, and no known way to make a small change to a message without changing the corresponding hash value. These properties make cryptographic hash functions secure, trustworthy and collision-resistant even in the face of powerful adversaries. Generally, requiring a simultaneous collision on two unrelated cryptographic hash algorithms instead of one makes the adversary's job substantially harder.<p>However, the hash functions that Apple uses for identifying CSAM images are not "cryptographic hash functions" at all. They are "perceptual hash functions". The purpose of a perceptual hash is the exact opposite of a cryptographic hash: two inputs that humans see/hear/perceive (hence the term "perceptual") as the same or similar should have the same perceptual hash. There is no known perceptual hash function that remains secure and trustworthy in any sense in the face of (even unsophisticated) adversaries. Most importantly, it is not guaranteed that using two unrelated perceptual hash functions makes finding collisions more difficult. In fact, in many contexts, these adversarial attacks tend to transfer: if they work against one model, they often work against other models as well [3]. (A toy demonstration of the crypto-vs-perceptual difference follows at the end of this comment.)<p>To make matters worse, a second, secret hash function can be used only after the collision threshold has been passed (otherwise, the hashing would have to be done on the device, where it could not be kept secret). Since the safety voucher is not linked directly to a full resolution photo, the second hashing has to be performed on the tiny "visual derivative", which makes collisions all the more likely.<p>Apple's second hash algorithm is kept secret (so much so that the whitepapers released by Apple do not even confirm its existence!). This means that we don't know how well it works. We can't even rule out that the second hash algorithm is a trivial variation of (or completely identical to) the first one. Moreover, it's unlikely that the second algorithm was trained on a completely different dataset than the first one: there are not many such hash algorithms that work well, and the database of known CSAM content is quite small compared to the large datasets that good machine learning models require, so testing is necessarily limited. This suggests that transfer attacks are likely to work.
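<p>Here is the toy demonstration of that crypto-vs-perceptual distinction, as a small, self-contained Python sketch. It uses an 8x8 "average hash" as a stand-in for a perceptual hash (NeuralHash is a neural network, not this simple averaging scheme, so this is only an analogy) and compares it with SHA256 over the raw pixels; the file name is a placeholder for any photo on disk. Flipping a single bit of a single pixel changes the cryptographic hash completely, while the perceptual hash almost certainly stays the same, and that insensitivity to small edits is exactly the property an adversarial collider exploits.<p><pre><code>  import hashlib
  from PIL import Image

  def average_hash(img, hash_size=8):
      """Toy perceptual hash: downscale to 8x8 grayscale, threshold at the mean."""
      small = img.convert("L").resize((hash_size, hash_size), Image.LANCZOS)
      pixels = list(small.getdata())
      mean = sum(pixels) / len(pixels)
      bits = "".join("1" if p > mean else "0" for p in pixels)
      return hex(int(bits, 2))

  def sha256_hash(img):
      """Cryptographic hash of the raw pixel data."""
      return hashlib.sha256(img.tobytes()).hexdigest()

  original = Image.open("photo.jpg").convert("RGB")  # placeholder path

  # An imperceptible edit: flip the lowest bit of one pixel's red channel.
  tweaked = original.copy()
  r, g, b = tweaked.getpixel((0, 0))
  tweaked.putpixel((0, 0), (r ^ 1, g, b))

  print(sha256_hash(original) == sha256_hash(tweaked))    # False: avalanche effect
  print(average_hash(original) == average_hash(tweaked))  # almost always True
</code></pre>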