High-res image reconstruction with latent diffusion models from human brain activity

459 points by trojan13 about 2 years ago

24 comments

Aransentin about 2 years ago
I immediately found the results suspect, and I think I have found what is actually going on. The dataset it was trained on was 2770 images, minus 982 of those used for validation. I posit that the system did not actually read any pictures from the brains, but simply overfitted all the training images into the network itself. For example, if one looks at a picture of a teddy bear, you'd get an overfitted picture of another teddy bear from the training dataset instead.

The best evidence for this is a picture (1) from page 6 of the paper. Look at the second row. The buildings generated by 'mind reading' subjects 2 and 4 look strikingly similar to each other, but not very similar to the ground truth! From manually combing through the training dataset, I found a picture of a building that does look like that, and by scaling it down and cropping it exactly in the middle, it overlays rather closely (2) on the output that was ostensibly generated for an unrelated image.

If so, at most they found that looking at similar subjects lights up similar regions of the brain; putting Stable Diffusion on top of it serves no purpose. At worst it's entirely cherry-picked coincidences.

1. https://i.imgur.com/ILCD2Mu.png

2. https://i.imgur.com/ftMlGq8.png
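
(A rough Python sketch of the overlap check described above: scale a suspected training image down, crop it in the middle, and compare it pixel-wise to the generated output. The filenames and the 256-pixel size are made up for illustration, and a raw pixel difference is only a crude stand-in for "looks like the same picture".)

    from PIL import Image
    import numpy as np

    def center_crop_resize(path, size):
        # Center-crop to a square, then resize, mirroring the manual overlay test.
        img = Image.open(path).convert("RGB")
        w, h = img.size
        s = min(w, h)
        box = ((w - s) // 2, (h - s) // 2, (w + s) // 2, (h + s) // 2)
        return np.asarray(img.crop(box).resize((size, size)), dtype=np.float32)

    # Hypothetical filenames: a candidate training image vs. the "reconstructed" output.
    candidate = center_crop_resize("training_building.png", 256)
    generated = center_crop_resize("generated_output.png", 256)
    print("mean absolute pixel difference:", np.abs(candidate - generated).mean())
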
2bitencryption about 2 years ago
Are any of the example images novel, i.e. new to the model? Or is the model only reconstructing images it has already seen before?

Either way, if I'm understanding right, it's very impressive. If the only input to the model (after training) is an fMRI reading, and from that it can reconstruct an image, at the very least that shows it can strongly correlate brain patterns back to the original image.

It'd be even cooler (and scarier?) if it works for novel images. I wonder what the output would look like for an image the model had never seen before? Would a person looking at a clock produce a roughly clock-like image, or would it be noise?

All the usual skepticism toward these models applies, of course. They are very good at hallucinating, and we are very good at applying our own meaning to their hallucinations.
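
(One cheap first pass at the "has the model already seen these images" question is to check the evaluation images against the training images for exact duplicates. A minimal sketch, assuming hypothetical train_images/ and test_images/ folders of PNGs; exact hashing misses near-duplicates, so it only rules out the most blatant leakage.)

    import hashlib
    from pathlib import Path

    def file_hashes(folder):
        # Map SHA-256 digest -> file path for every PNG in the folder.
        return {hashlib.sha256(p.read_bytes()).hexdigest(): p
                for p in Path(folder).glob("*.png")}

    train = file_hashes("train_images")
    test = file_hashes("test_images")
    leaked = set(train) & set(test)
    print(f"{len(leaked)} exact duplicates shared between train and test")
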
crispyambulance about 2 years ago
In 1990, there was a train-wreck Wim Wenders movie that I loved and still love called "Until the End of the World". It was about a scientist (played by Max von Sydow) who created a machine that could record someone's dreams or visual experiences directly from the brain and play them back, even to a blind person. https://youtu.be/gilzgbdk300?t=442

Anyways, the images depicted in this work of fiction, shot in 1990 about "the future" of 2000, had a very interesting look to them -- kind of distorted and dreamy, like the images in the paper.

Are the images in the paper just a case of overfitting? ¯\_(ツ)_/¯ But it still makes me giddy remembering the Wim Wenders film.
donohoe about 2 years ago
As people and groups increasingly move in this direction, do we think about the vectors for abuse in 10, 20, or 50+ years?

The human mind is considered the only place where we have true privacy. All these efforts are taking that away.

At this rate, all notions of privacy will soon be dead.
gus_massa about 2 years ago
In case someone missed it, there is a link to more info at https://sites.google.com/view/stablediffusion-with-brain/ and to the preprint at https://www.biorxiv.org/content/10.1101/2022.11.18.517004v2
ninesnines about 2 years ago
I am suspicious of these results. If we blast a high-frequency visual stimulus of a couple of letters and do quite a lot of post-processing, we can sometimes get a visual-cortex map of those particular letters. However, the examples in this paper are very complex images, and I'm very doubtful of the results - Aransentin above made a couple of very valid points.
Lutzb about 2 years ago
Reminds me of this paper [1] from 2011. See it in action in [2].

1. https://www.cell.com/current-biology/fulltext/S0960-9822(11)00937-7?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0960982211009377%3Fshowall%3Dtrue

2. https://www.youtube.com/watch?v=nsjDnYxJ0bo

Edit: Just realized the paper above is also from Shinji Nishimoto.
smusamashah about 2 years ago
There was this research where they reconstructed human face images from monkey brain scans: https://www.electronicproducts.com/scientists-reconstruct-images-of-human-faces-extracted-from-monkey-brain-scans/

What's astonishing there is the quality of the reconstruction, but I have not seen that research referenced a lot. Does someone know how / why the reconstruction from a monkey brain looks so perfect, while we don't have anything close from a human brain?

Edit: better images here: https://www.newscientist.com/article/2133343-photos-of-human-faces-reassembled-from-monkeys-brain-signals/
drzoltar about 2 years ago
My understanding is that we won't get a "mind reader" model out of this, because visual stimulus and your imagination happen in separate parts of the brain. In other words, we won't be reading the minds of suspected criminals anytime soon. Maybe someone with neurology experience can chime in here? Is it even theoretically possible to see what's happening in the imagination?
rvnx about 2 years ago
Creepy and cool at the same time. It goes into the bucket of things that are not ethically right, the same way as implanting chips to read monkeys' brains. But technically interesting and well-executed.
00F_ about 2 years ago
Here we see, basically, a potential feedback loop: AI tools advance brain science -- more advanced brain science can then inform progress in AI. This is why the situation is dangerous: because people don't think about these feedback loops. People see AI, move the goalposts, and rationalize by saying that "cutting-edge AI is still short of AGI, so it's OK." But most normal people don't think about how AI can be used to create AI, or how AI could be used to revolutionize all kinds of fields that then plug back into AI. This is a very dangerous, non-linear space. It's not the first non-linear space we have traversed, but it's certainly the least linear space we have ever entered, and it is the highest stakes humanity has ever or will ever deal with.

Even if this is just another bullshit article, I'm just making a point related to it. People need to be worried about this. For the first time in history, lots of people are now creeped out by AI, but they aren't taking action or demanding change. We need regulation and grass-roots efforts to stop AI. Even if the only way humanity could abort AI as a concept, or delay it for a significant amount of time, was to return to the Iron Age (and it certainly isn't the only way), it would be unambiguously worth it, in every way and from every angle.

AI requires large compute. What we are doing now was impossible just 20 years ago, if not 20 then 30. You can't manufacture that kind of compute in your garage. Global regulation would take care of it, no problem. At the very least it would buy us an enormous amount of time that we could use to figure something else out. People always say that some hold-out country would defy global regulations. They wouldn't defy NATO, let alone a super-global coalition. And the idea of such a group or NATO enforcing compute regulations is not far-fetched whatsoever, because the emergence of AGI, or even advanced non-AGI, goes against the interests of literally every human being. There is no group of humans that ultimately benefits from that. The problem is simply waking people up to this plain fact.
politician about 2 years ago
Show HN: Human Diffusion

Hi everybody! We're Joe and Ahmed, and we're super thrilled to be launching Human Diffusion today! We've built an exciting new image generation system that supports economies in developing nations.

Our product leverages the latent creativity of humanity by directly fitting employees with fMRI rigs and presenting them with text inquiries through our API (JavaScript SDK available, Python soon!). Unlike competing alternatives, we preserve human jobs in an era of AI supremacy.

I'd like to address rumors that our facilities amount to slaving brains to machines. This is a gross misunderstanding of the benefits we offer to our staff - they are family. Our 18-hour shifts are finely calibrated based on feedback collected through our API, and any suggestion of exploitation is flatly untrue.

Send us an email (satire@humandiffusion.com) to get early access.
Madmallard about 2 years ago
Couldn't we train an AI on fMRI or EEG, with billions of samples of people thinking and describing what they're thinking about, and have it gradually reach some level of accuracy?
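
(That is roughly the recipe behind CLIP-style contrastive training, applied to brain recordings paired with text descriptions instead of images paired with captions. A minimal sketch of the idea; the encoder shape, the 256-dimensional embedding, and the use of PyTorch are illustrative assumptions, not anything from the paper.)

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SignalEncoder(nn.Module):
        # Toy encoder: flattened EEG/fMRI window -> unit-norm embedding.
        def __init__(self, in_dim, embed_dim=256):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(),
                                     nn.Linear(512, embed_dim))

        def forward(self, x):
            return F.normalize(self.net(x), dim=-1)

    def contrastive_loss(signal_emb, text_emb, temperature=0.07):
        # CLIP-style symmetric loss: matched (recording, description) pairs score high.
        logits = signal_emb @ text_emb.t() / temperature
        targets = torch.arange(logits.size(0), device=logits.device)
        return (F.cross_entropy(logits, targets) +
                F.cross_entropy(logits.t(), targets)) / 2

    # Smoke test with random tensors standing in for real recordings and for
    # text embeddings from a pretrained language model.
    signals = torch.randn(8, 1024)
    text_emb = F.normalize(torch.randn(8, 256), dim=-1)
    loss = contrastive_loss(SignalEncoder(1024)(signals), text_emb)
    print(loss.item())
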
samuelzxu about 2 years ago
There's also this paper with a very similar methodology, called Mind-Vis, which was also accepted to CVPR 2023: https://mind-vis.github.io/
rvz about 2 years ago
Another small step toward creating a worse dystopia than the one we are already living in.

Please continue. /s Governments, three-letter agencies, and the like would be absolutely excited to see this. The future that no one has asked for.
exclipy about 2 years ago
In 2004, in high school, I wrote a short story about exactly this: using neural networks to "mind read" visual images from an fMRI scan of a brain. I thought it was far-fetched, but look where we are now!
dheera about 2 years ago
I wonder how well this would work with wearable brainwave detectors rather than MRI, seeing as MRI isn't really something I could have at home.
bitL about 2 years ago
Can't wait for this to become one of the individual performance metrics, recording all brain states all the time (video/audio/etc.) and being part of regular performance reviews...
ACV001 about 2 years ago
This is a big thing. Although this particular paper is not a big thing by itself, the many related studies it quotes set a trend.
fretime about 2 years ago
I'm looking forward to it. When will the code be released? Thanks!
chrstphrknwtn about 2 years ago
I don't see anything "high-res" about the reconstructed images.
_448 about 2 years ago
So this is like mind reading?
lazy_moderator1 about 2 years ago
Not the first time something like this has ended up on HN: https://news.ycombinator.com/item?id=33632337
convolvatron about 2 years ago
Very curious about the little 'semantic model' at the bottom of the brain. Does anyone know how that gets constructed and how it gets fed into the results?