Inverting PhotoDNA

131 points by anishathalye, over 3 years ago

7 comments

anishathalye, over 3 years ago
A bit of context: Microsoft developed PhotoDNA to identify illegal images such as CSAM. NCMEC maintains a database of PhotoDNA signatures, and many companies use this service to identify and remove these images.

Microsoft claims:

> A PhotoDNA hash is not reversible, and therefore cannot be used to recreate an image.

This project shows that this isn't quite true: machine learning can do a pretty good job of reproducing thumbnail-quality images from a PhotoDNA signature.

There has been some discussion of PhotoDNA on HN in the past: https://news.ycombinator.com/item?id=28378254. It has been claimed that PhotoDNA is reversible, but there was no public demonstration as far as I know.
st_goliath, over 3 years ago
On a side note, I find it kind of funny how, when using the model trained on Reddit, some of the outputs contain a quite readable "The image you are requesting does not exist or is no longer available" text, and a faint "imgur.com" watermark in the lower left corner.

For the former, I guess that when the original model was trained, a bunch of the Reddit images weren't available at crawl time. Wouldn't it make sense to somehow weed those out of the data set before training?
pornel, over 3 years ago
I'd say that the project *confirms* that PhotoDNA is not reversible.

This project generates discolored, deformed thumbnails with maybe 12 pixels of resolution, and that's after the addition of synthesized/imaginary data. Without priming by looking at the ground truth image, any attempt to guess what was in the images is just a Rorschach test.
causi, over 3 years ago
I'm not a mathematician, but isn't there a direct correlation between reversibility and the unlikelihood of collisions? That is, if you have few to no collisions in the entire dataset of human-created images, it must be technically possible to reverse the hash into a reasonable thumbnail?
somebodythere, over 3 years ago
The requirement that changing the image a little bit changes the hash a little bit makes the image space smooth and more suitable for machine learning.
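
To illustrate the smoothness property described above, here is a minimal sketch using a toy perceptual hash (per-cell brightness averages over a grid). It is not PhotoDNA's actual algorithm; the function names, the 4x4 grid, and the synthetic gradient image are all assumptions made for illustration. The point is only that a small change to the pixels yields a small change in the hash, which is the kind of structure a learned inverse can exploit.

```python
import random

def toy_perceptual_hash(pixels, grid=4):
    """Mean brightness of each cell in a grid x grid partition.

    A toy stand-in for a perceptual hash, NOT PhotoDNA itself.
    `pixels` is a square list of rows of grayscale values (0..255).
    """
    cell = len(pixels) // grid
    out = []
    for gy in range(grid):
        for gx in range(grid):
            total = 0
            for y in range(gy * cell, (gy + 1) * cell):
                for x in range(gx * cell, (gx + 1) * cell):
                    total += pixels[y][x]
            out.append(total / (cell * cell))
    return out

def l1_distance(a, b):
    """Sum of absolute differences between two hash vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

size = 32
# Synthetic horizontal gradient (hypothetical test image).
image = [[x * 8 for x in range(size)] for _ in range(size)]

# Slightly perturbed copy: each pixel moves by at most 2 brightness levels.
rng = random.Random(0)
nudged = [[min(255, max(0, p + rng.randint(-2, 2))) for p in row] for row in image]

# A clearly different image: the inverted gradient.
inverted = [[255 - p for p in row] for row in image]

d_small = l1_distance(toy_perceptual_hash(image), toy_perceptual_hash(nudged))
d_other = l1_distance(toy_perceptual_hash(image), toy_perceptual_hash(inverted))
print(f"nudged copy: {d_small:.2f}, inverted image: {d_other:.2f}")
```

Because every pixel moves by at most 2 levels, each cell average moves by at most 2, so the distance to the nudged copy is bounded by 2 x 16 = 32 in this toy setup, far below the distance to the inverted image.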
jrm4, over 3 years ago
Seems like false advertising to even call it a "hash" at this point? If meaningful data can be regained, it ain't a hash.
rbanffy, over 3 years ago
I wonder if, with a couple million passwords and their salted hashes, we can reconstruct something similar to the original password and reduce the search space somewhat.

I know it *should not* be possible, but, still, I’d love to play with that kind of dataset.
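
For contrast with a perceptual hash, here is a minimal sketch of why this is not expected to work for a cryptographic hash: changing one character of the input changes the digest in a way that looks unrelated, so nearby passwords do not produce nearby hashes. Plain salted SHA-256, the salt, and the two example passwords are assumptions chosen for illustration; real password storage typically uses a deliberately slow function such as bcrypt, scrypt, or Argon2.

```python
import hashlib

def salted_sha256(password: str, salt: bytes) -> bytes:
    """Digest of salt + password (illustrative only, not a recommended scheme)."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Number of bit positions where two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

salt = b"example-salt"  # hypothetical fixed salt for the demo
h1 = salted_sha256("correct horse", salt)
h2 = salted_sha256("correct horsf", salt)  # differs in the last character only

flipped = differing_bits(h1, h2)
print(f"{flipped} of {len(h1) * 8} bits differ")
```

With a well-designed cryptographic hash, roughly half of the output bits are expected to differ for any change to the input, so there is no smooth relationship for a model to learn.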