
New technique would reveal the basis for machine-learning systems’ decisions

115 points by renafowler over 8 years ago

9 comments

YeGoblynQueenne over 8 years ago
>> Neural networks are so called because they mimic — approximately — the structure of the brain.

Grumble.

ANNs are "approximately" like the brain [1] as much as Pong is "approximately" like the game of Tennis. In fact, much less so.

ANNs are *algorithms for optimising systems of functions*. The "neurons" are functions, their "synapses" are inputs and outputs to the functions. That's an "approximation" of a brain only in the most vague sense, in the broadest possible strokes, so broad in fact that you could be approximating any physical process or object [2].

Like, oh, I dunno- trains.

Trains, right? The functions are like coaches and the parameters they pass between each other are like rails. Artificial Neural Networks --> Artificial Train Networks; they mimic - approximately - the structure of the train.

Stop the madness. They're *nothing* like brains, in any way, shape or form.

And grumble some more.

_____________

[1] Wait- *which* brain? Mammalian brain? Primate brain? Human brain? Grown-up brain? Mathematician's brain? Axe-murderer's brain?

[2] Because... that's what they do, right? They approximate physical processes.
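To make the "neurons are just functions" point concrete, here is a minimal Python sketch; the weights, inputs, and names are made up for illustration, not taken from the article:

    import numpy as np

    # A "neuron" is just a function: a weighted sum of its inputs passed through
    # a nonlinearity. The "synapses" are merely the arguments and return value.
    def neuron(inputs: np.ndarray, weights: np.ndarray, bias: float) -> float:
        return max(0.0, float(weights @ inputs + bias))  # ReLU activation

    print(neuron(np.array([1.0, 2.0]), np.array([0.5, -0.25]), 0.1))  # prints 0.1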
wcrichton over 8 years ago
Title is a little general. This specifically is a technique for breaking down text analysis, where the goal is to give semantic meaning to a block of text. In their example, they want to condense beer reviews into star ratings for a few categories. A totally black-box technique would take the review and spit out the scores, whereas their technique has two jointly trained networks: one identifies relevant text fragments for each category, and the other gets the corresponding category score for the fragment.

This is not groundbreaking, but still a good example of a larger trend in trying to understand neural network decision making. Here's a cool paper that analyzes how CNNs can learn image features for attributes like "fuzziness" and other higher-level visual constructs while training for object recognition: https://pdfs.semanticscholar.org/3b31/9645bfdc67da7d02db766e17a3e0a37be47b.pdf
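A rough sketch of that two-network structure in Python/NumPy, with random untrained weights; all names here are hypothetical, and only the forward data flow is shown (in the paper the two networks are trained jointly, with penalties that keep the selected fragments short and coherent):

    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB, DIM = 1000, 16
    embed = rng.normal(size=(VOCAB, DIM))     # toy word embeddings

    w_gen = rng.normal(size=DIM)              # "generator": scores token relevance
    w_enc = rng.normal(size=DIM)              # "encoder": maps selected text to a score

    def generator(token_ids):
        # Select a rationale: a binary mask over the review's tokens.
        return embed[token_ids] @ w_gen > 0.0

    def encoder(token_ids, mask):
        # Predict one category's score (e.g. a star rating) from selected tokens only.
        if not mask.any():
            return 0.0
        return float(embed[token_ids][mask].mean(axis=0) @ w_enc)

    review = rng.integers(0, VOCAB, size=20)  # stand-in for a tokenized beer review
    mask = generator(review)
    print("selected token positions:", np.flatnonzero(mask))
    print("category score:", encoder(review, mask))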
avivo over 8 years ago
The general concept: we can figure out what led to a particular classification for an item by finding smaller subsets of the item that still give the same classification.

For example, this can show which snippet of text implies a particular review should be classified as "very negative", or which part of an image led to a classification of "cancerous" for a biopsy image.

This doesn't give you much predictive power about the network, however, or tell you how it actually works in general. It simply tells you how it made a particular classification.

Paper link: https://people.csail.mit.edu/taolei/papers/emnlp16_rationale.pdf
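A brute-force toy version of that concept (the classify() function here is a hypothetical stand-in for a trained model; real methods learn the selection rather than enumerating snippets):

    def rationale(tokens, classify, target):
        # Find the smallest contiguous snippet that still yields the target label.
        for size in range(1, len(tokens) + 1):
            for start in range(len(tokens) - size + 1):
                snippet = tokens[start:start + size]
                if classify(snippet) == target:
                    return snippet
        return tokens

    # Toy stand-in for a trained sentiment classifier.
    def classify(tokens):
        return "very negative" if "undrinkable" in tokens else "neutral"

    review = "the bottle looked nice but the beer itself was undrinkable".split()
    print(rationale(review, classify, "very negative"))  # ['undrinkable']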
intro-b over 8 years ago
Any ideas on how this is similar to or differs from the prospects and structure of DARPA's explainable AI (XAI) contract?

https://www.fbo.gov/index?s=opportunity&mode=form&id=1606a253407e8773bdd1a9e884cc5293

Curious about the inherent trade-off between the predictive power/complexity of an ML model and the accuracy of the system explanations inferred by these models.
Houshalter over 8 years ago
So this is pretty limited. It only works on text data, and just picks the part of the text that most determines the output. The basic idea of doing this isn't new. You can easily ask a naive Bayes spam filter which words it thinks are the strongest evidence of spam or not-spam in a document. It is interesting to see this done with neural nets, though. I recall reading something similar that uses gradients to find which words changed the network's output the most; I'm not sure if this new method is actually better, and it's not as general.

But this is a long way away from the NN being able to give understandable reasons for its decisions. These methods will always be limited to pointing to a part of the input and saying "that part seemed relevant". They can never articulate why it's relevant, or what the network is "thinking" internally.

I think this is OK, though. Looking at what features the model is using to make predictions is pretty useful and should give you a rough idea of how it works.

I've wondered in the past if neural networks could *train* humans to understand them. The human would be shown an input and try to predict the value of a neuron in the network. So the human would learn what the network has learned, and gain intuition about the inner workings of the model.

You can also do a similar process with other machine learning methods. You can train a simpler, more understandable model, like a decision tree, to predict the neurons of a neural net, and then the human can study that. You can even train a smaller neural network to fit to a bigger, more complex one.
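The naive Bayes version of that query is straightforward once the model is fit; a small scikit-learn illustration with made-up toy data (feature_log_prob_ holds each word's per-class log likelihood):

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    docs = ["win free money now", "free prize claim now", "free money prize",
            "meeting at noon", "lunch with the team", "project status meeting"]
    labels = [1, 1, 1, 0, 0, 0]  # 1 = spam, 0 = not spam

    vec = CountVectorizer()
    X = vec.fit_transform(docs)
    nb = MultinomialNB().fit(X, labels)

    # Log-odds of each word under spam vs. ham: the evidence each word carries.
    log_odds = nb.feature_log_prob_[1] - nb.feature_log_prob_[0]
    words = np.array(vec.get_feature_names_out())
    top = np.argsort(log_odds)[::-1][:3]
    print(list(zip(words[top], log_odds[top].round(2))))  # e.g. 'free', 'money', ...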
hyperion2010 over 8 years ago
I'm still curious about whether it is possible at all to deal with "who sunk the boat" or "the straw that broke the camel's back" cases, or whether there will just end up being a list of a thousand reasons that summed up to a decision, perhaps ranked by their weight.
quantum_state over 8 years ago
Noting that neural nets are associative memories, one would think the best one can get out, as the rationale for a specific outcome, would be the high-level landmarks of the dynamics, starting from a given input and running all the way to the outcome... Is there anything else one would expect to get out of the system?
godmodus over 8 years ago
The name "neural network" is unfortunate, but then again you can't call them "continuous function best fit support vector feedback machines on cocaine" networks.

The name will cause confusion for many years to come, but it's what we have. Nice article; it's good they're trying.
jbclements over 8 years ago
Yes! I wrote about this in a widely-discussed ^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H totally-unread blog post back in 2014: https://www.brinckerhoff.org/blog/2014/12/12/the-why-button/