科技回声 (Tech Echo)

A tech news platform built with Next.js, providing global tech news and discussion.

Feature Visualization: How neural nets build up their understanding of images

461 points | by rrherr | over 7 years ago

16 comments

muxator · over 7 years ago

Looking at the finger instead of the moon: I like the HTML layout (responsive, inline images with captions, lateral notes).

Any insights on how it's generated? Markdown, Rst, Latex -> HTML? I would love to produce my documentation in this way.

Edit: I was too hurried. Everything is explained in https://distill.pub/guide/, and the template is at https://github.com/distillpub/template
colah3 · over 7 years ago

Hey! I'm one of the authors, along with Alex and Ludwig. We're happy to answer any questions! :)
radarsat1 · over 7 years ago

Great presentation, but I do wish they'd throw in an equation or two. When they talk about the "channel objective", which they describe as "layer_n[:,:,z]", do they mean they are finding parameters that maximize the sum of the activations of RGB values of each channel? I'm not quite sure what the scalar loss function actually is here. I'm assuming some mean. (They discuss a few reduction operators, L_inf, L_2, in the preconditioning part, but I don't think it's the same thing?)

The visualizations of image gradients were really fascinating; I never really thought about plotting the gradient of each pixel channel as an image. I take it these gradients are for a particular (and the same) random starting value and step size? It's not totally clear.

(I have to say, "second-to-last figure..." again... cool presentation, but being able to say "figure 9" or whatever would be nice. Not *everything* about traditional publication needs to be thrown out the window; figure and section numbers are useful for discussion!)
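A guess at the scalar loss this comment is asking about: the channel objective is commonly taken as the mean activation of one channel over all spatial positions, which gives a single number to maximize by gradient ascent. A minimal pure-Python sketch; the function name and toy data below are illustrative, not from the article:

```python
# Hypothesized scalar loss behind the "channel objective" layer_n[:, :, z]:
# the mean activation of channel z over all spatial positions (x, y).

def channel_objective(activations, z):
    """Mean of activations[x][y][z] over all spatial positions."""
    total, count = 0.0, 0
    for row in activations:          # spatial rows
        for cell in row:             # spatial columns; cell holds channels
            total += cell[z]
            count += 1
    return total / count

# Toy 2x2 spatial grid with 3 channels:
acts = [[[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
        [[5.0, 0.0, 2.0], [7.0, 0.0, 2.0]]]
print(channel_objective(acts, 0))  # 4.0
```

An optimizer would then adjust the input image to push this mean upward, which is consistent with the article's talk of reduction operators over a channel's activations.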
shancarter · over 7 years ago

There's also an appendix where you can browse all the layers: https://distill.pub/2017/feature-visualization/appendix/googlenet/4b.html
chillingeffect · over 7 years ago

Are the layer names the same ones referred to in this paper? https://arxiv.org/abs/1409.4842

And how can e.g. layer3a be generated from layer conv2d0? By convolving with a linear kernel? Or by the entire Inception module, including the linear and the non-linear operations?

Thank you. Outstanding work breaking it down.

Here's another paper people might enjoy. The author generates an example for "Saxophone," which includes a player... which is fascinating, because it implies that our usage of the word in real practice implies a player, even though the saxophone is only an instrument. This highlights the difference between our denotative language and our experience of language! https://www.auduno.com/2015/07/29/visualizing-googlenet-classes/

Also, for those curious about the DepthConcat operation, it's described here: https://stats.stackexchange.com/questions/184823/how-does-the-depthconcat-operation-in-going-deeper-with-convolutions-work

Edit: I'll be damned if there isn't something downright *Jungian* about these prototypes! There are snakes! Man-made objects! Shelter structures! Wheels! Animals! Sexy legs! The connection between snakes and guitar bodies is blowing my mind!
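For readers following the DepthConcat link above, the operation itself is simple to sketch: feature maps produced by the parallel branches of an Inception module are stacked along the channel axis. A pure-Python illustration, assuming the branches share the same spatial size (real implementations also zero-pad when they don't):

```python
# DepthConcat: stack H x W x C feature maps (nested lists) along the
# channel axis. All inputs are assumed to share the same spatial size.

def depth_concat(*feature_maps):
    """Concatenate channel vectors at each spatial position."""
    out = []
    for rows in zip(*feature_maps):          # matching spatial rows
        out.append([sum((list(c) for c in cells), [])   # join channels
                    for cells in zip(*rows)])
    return out

a = [[[1, 2]]]   # 1x1 spatial, 2 channels
b = [[[3]]]      # 1x1 spatial, 1 channel
print(depth_concat(a, b))  # [[[1, 2, 3]]]
```

So layer3a's input is the whole module's output, not a single linear kernel: each branch applies its convolutions and nonlinearities, and DepthConcat merely glues the resulting channels together.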
Houshalter · over 7 years ago

This didn't include my favorite kind of visualization from Nguyen et al., 2015: https://i.imgur.com/AERgy7I.png
aj_g · over 7 years ago

Wow. It's incredible how psychedelic these images are. I'd be really curious to learn more about the link between these two seemingly distant subjects.
shellbackground · over 7 years ago

These pictures remind me of what one can see under psychedelics. All sensory input basically begins to break down into that kind of pattern, and thus reality dissolves into nothing. This is equally terrifying and liberating, depending on how you look at it. The terrifying thought is that there's no one behind these eyes and ears. The liberating thought is that if there's no one there, then there's no one to die.
dandermotj · over 7 years ago

Hi Chris, firstly, thanks for all the work you've done publishing brilliant articles on supervised and unsupervised methods and visualisation, on your old blog and now in Distill.

This question isn't about feature visualisation, but I thought I'd take the chance to ask you: what do you think of Hinton's latest paper and his move away from neural network architectures?
Kronopath · over 7 years ago

Interesting that simple optimization ends up with high-frequency noise similar to adversarial attacks on neural nets.

While I agree that the practicality of these visualizations means that you have to fight against this high-frequency "cheating", I can't shake the feeling that what these optimization visualizations are showing us is *correct*. *This* is what the neuron responds to, whether you like it or not. Put another way, the problem doesn't seem to be with the visualization but with the *network itself*.

Has there been any research into making neural networks that are robust to adversarial examples?
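One common way to fight the high-frequency "cheating" discussed here is a total-variation penalty on the optimized image, added to the main objective so the optimizer prefers smoother results. A minimal sketch for a grayscale image stored as a 2-D list (names are illustrative, not from the article):

```python
# Total variation: sum of absolute differences between horizontally and
# vertically adjacent pixels. Larger values mean more high-frequency
# content, so subtracting a multiple of this from the objective
# discourages adversarial-style noise.

def total_variation(img):
    """TV penalty for a grayscale image given as a 2-D list."""
    h, w = len(img), len(img[0])
    tv = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                tv += abs(img[y][x + 1] - img[y][x])
            if y + 1 < h:
                tv += abs(img[y + 1][x] - img[y][x])
    return tv

smooth = [[0, 0], [0, 0]]
noisy  = [[0, 1], [1, 0]]
print(total_variation(smooth), total_variation(noisy))  # 0.0 4.0
```

This is a regularizer on the visualization, not a defense for the network itself, which is exactly the tension the comment points at.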
hosh · over 7 years ago

Cool. Reminds me a bit of https://qualiacomputing.com/2016/12/12/the-hyperbolic-geometry-of-dmt-experiences/

(Though maybe not as symmetric?)
chillingeffect · over 7 years ago

Is there any way to run images from a camera in real time into GoogLeNet?

E.g., if I want to scan areas around me to see if there are any perspectives in my environment that light up the "snake" neurons or the dog neurons?
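The loop this question describes can be sketched independently of any particular camera or model: grab frames, run a forward pass, and flag frames where the chosen unit's activation crosses a threshold. Everything below is a hypothetical stand-in: `unit_activation` would be a real forward pass through GoogLeNet, and frames would come from a capture device such as OpenCV's VideoCapture.

```python
# Sketch of a real-time "neuron scanner" loop with stubbed-out pieces.
# `unit_activation` stands in for a forward pass to one unit's activation.

def scan(frames, unit_activation, threshold=0.5):
    """Return indices of frames whose unit activation exceeds threshold."""
    return [i for i, f in enumerate(frames)
            if unit_activation(f) > threshold]

# Stub "network": activation = mean pixel value of a flat frame.
act = lambda frame: sum(frame) / len(frame)
frames = [[0.1, 0.2], [0.8, 0.9], [0.4, 0.4]]
print(scan(frames, act))  # [1]
```

In practice the bottleneck is the forward pass; on a GPU, GoogLeNet-class models run at well above camera frame rates, so this kind of live scanning is feasible.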
dsnuh · over 7 years ago

Okay... maybe a stupid question.

Could they train on white noise from a television and see if the CBR shows a structure similar to the structure of the observable universe when examining the feature layers?
snippyhollow · over 7 years ago

Similar to https://arxiv.org/abs/1311.2901
gergoerdi · over 7 years ago
So can someone use this to show us where the rifle is on the turtle?
nnfy · over 7 years ago

Awesome, but to me this stuff is also terrifying, and I can't quite place why.

Something about dissecting intelligence, and the potential that our own minds process things similarly. It's creepy how our reality is distilled into these uncanny-valley-type matrices.

Also, I suspect it says something that these images look like what people report seeing on psychedelic trips...