One thing people might not realize (I'm not sure how obvious it is) is that these renders depend strongly on the statistics of the training data used for the ConvNet. In particular, you're seeing a lot of dog faces because there is a large number of dog classes in the ImageNet dataset (over a hundred of the 1000 classes are dog breeds), so the ConvNet allocates a lot of its capacity to worrying about their fine-grained features.<p>If you train ConvNets on other data you will get very different hallucinations. It might be interesting to train (or even fine-tune) the networks on different data and see how the results vary. For example, different medical datasets, or datasets made entirely of faces (e.g. the Labeled Faces in the Wild data), galaxies, etc.<p>It's also possible to take Image Captioning models and use the same idea to hallucinate images that are very likely for some specific sentence. There are a lot of fun ideas to play with.
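To make the underlying mechanism concrete: the "dreaming" step is just gradient ascent on the <i>input</i> image to maximize the norm of some layer's activations. Here is a minimal toy sketch of that objective, where a fixed random linear layer stands in for the trained ConvNet (the real setup in the post uses a trained GoogLeNet in Caffe, and backprop supplies the gradient):

```python
import numpy as np

# Toy sketch of the DeepDream objective: gradient ascent on the INPUT
# to maximize 0.5 * ||a||^2, where a = Wx are a layer's activations.
# W is a random stand-in "layer" so the example stays self-contained.

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 64))     # 16 "filters" over a flattened 8x8 image

def objective(x):
    a = W @ x                         # layer activations
    return 0.5 * np.sum(a ** 2)

def grad(x):
    # d/dx 0.5*||Wx||^2 = W^T W x; for a real net, backprop computes this
    return W.T @ (W @ x)

x = rng.standard_normal(64) * 0.01    # start from a near-blank "image"
step = 0.01
history = [objective(x)]
for _ in range(100):
    g = grad(x)
    # normalized ascent step (the released deepdream code normalizes
    # the gradient by its mean absolute value in the same way)
    x += step * g / (np.abs(g).mean() + 1e-8)
    history.append(objective(x))

assert history[-1] > history[0]       # the input drifts toward patterns the layer "likes"
```

Since the network's capacity is spent on whatever dominates the training set, the patterns this ascent amplifies are exactly the fine-grained features (dog faces, eyes) the model learned to care about.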
<a href="http://deepdreams.zainshah.net" rel="nofollow">http://deepdreams.zainshah.net</a> — I spun up a simple web server so you can try your own! Please be gentle :)
The visuals generated by the neural network remind me of visuals experienced under the influence of psilocybin or LSD. I wonder if I am making an unjustified leap, or if there is a similar organic process (searching for familiar patterns) taking place in the mind? Fascinating, thanks for sharing.
"Be careful running the code above, it can bring you into very strange realms!"<p>Reminds me of Charlie Stross's new novel, The Annihilation Score:<p>"A brief recap: magic is the name given to the practice of manipulating the ultrastructure of reality by carrying out mathematical operations. We live in a multiverse, and certain operators trigger echoes in the Platonic realm of mathematical truth, echoes which can be amplified and fed back into our (and other) realities. Computers, being machines for executing mathematical operations at very high speed, are useful to us as occult engines. Likewise, some of us have the ability to carry out magical operations in our own heads, albeit at terrible cost."<p><a href="http://www.tor.com/2015/06/30/excerpt-the-annihilation-score-charles-stross/" rel="nofollow">http://www.tor.com/2015/06/30/excerpt-the-annihilation-score...</a>
Here are some images I've done - <a href="https://www.facebook.com/media/set/?set=a.720197931442169.1073741832.100003559066571&type=1&l=5513aa040e" rel="nofollow">https://www.facebook.com/media/set/?set=a.720197931442169.10...</a>
Great, I got the dependencies installed on OS X and I'm already monsterifying a head shot for LinkedIn. Now, to find a way to get this working in real time with a webcam...
We sort of reverse-engineered this last week and set up a stream with live interactive "hallucinations": <a href="http://www.twitch.tv/317070" rel="nofollow">http://www.twitch.tv/317070</a><p>You can suggest what objects the network should dream about (combinations of two are also possible).<p>Our code will be published on GitHub later today!
Amazing that it easily runs on consumer hardware; this dispels suspicions that a Google cluster was necessary for these results.<p>I'm wondering if it's possible to use this with a model that was trained on a database without labels, just pictures. Is such a thing even possible? For this particular application, labels and categories are ultimately superfluous, but are they required in order to get there?
A simpler version of this idea (making an image A out of matching pieces of a set of images B) was implemented in the early 90s and released as open source: <a href="http://draves.org/fuse/" rel="nofollow">http://draves.org/fuse/</a>
I always wonder why sometimes the system finds faces and other elements in essentially untextured / homogeneous parts of images. Wouldn't there be some sort of "data term" in the energy functional that would suppress these results and/or move them to other parts of the image?<p>Perhaps this is working entirely differently and I'm thinking too much in the classical computer vision realm. Would love some explanation though.
This is really cool. I wonder what it would look like applied to video.<p>Also, I didn't know that GitHub renders .ipynb files, that's pretty awesome.
Does anyone know if this technique can be used to slurp up a database and produce "typical" records for populating a test database? This is a problem that I struggled with a few years ago and still haven't found a good automated solution.
The dogs, eyes, and Dali-like bird-dogs are really cool. I've seen some insects, too, but not very often.<p>Are there any other flavors of hallucination? Why all the dogs? I suppose ImageNet has a lot of dog varieties in its category list.