Deep or Shallow, NLP is breaking out

107 points | by samiur1204 | about 9 years ago

8 comments

YeGoblynQueenne, about 9 years ago

>> "Most of our reasoning is by analogy; it's not logical reasoning."

I keep reading this quote, showcased in the article, again and again, and I still can't believe that people are actually proposing this.

They're basically advocating that we should abandon logic, stop trying to reason about things, and instead try stuff at random. And why? Because computers have been shown to do alright with that approach at tasks in which humans excel.

It is clear as daylight that whatever humans do, computers don't - otherwise, you'd need to train your baby with the Brown Corpus before it could figure out you're its mama. We have managed to overcome the limitations of our primitive computational technology with some clever tricks, and that's amazing.

But to take that rightly celebrated fact and make it into an argument that we must now become ourselves as dumb as our computers, and also never try to make the poor things intelligent in the same way we are, that's ... well, it's dumb. That's what it is.
n0us, about 9 years ago

As research in NLP advances, one of the things I am looking out for is how the field will be broken into separate problems. Are there some problems which are relatively low-hanging fruit, and others that we will be scratching our heads over 30 years from now?

One thing that I think will be challenging is that language has observer-dependent meaning. The same statement might have a completely different meaning to someone with a different experience, or made in a different context, or stated by a different person. Games like Chess and Go have observer-independent solutions. The winner is the same no matter who/what plays the game.

Determining the meaning of a sentence is a problem where the real answer depends on observer-dependent perspective, and therefore will need a completely different way to measure success compared to more 'mathematical' tasks like Go. Trying to program a machine to account for this kind of personal experience that humans have, as well as for individual differences between people, will be quite challenging, I think. I also think that the most significant advances will come from cross-cutting academic disciplines like Psychology, Linguistics, and Philosophy of Language.
xigency, about 9 years ago

In terms of shallow natural language processing, some pretty simple observations can lead to a lot of bang for the buck.

One project I completed was able to really excel at keyword matching simply by building a huge dictionary of words, in a literal sense: a dictionary of contextually relevant words to a phrase, generated from very large texts.

I think baby steps are the key to getting further with NLP. For reference: http://nlp.stanford.edu/fsnlp/
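A minimal Python sketch of that context-dictionary idea; the tokenization, the window size, and the function names are illustrative assumptions, not details from the comment:

    from collections import defaultdict

    def build_context_dict(tokens, window=4):
        """Map each word to the set of words seen within `window` tokens of it."""
        contexts = defaultdict(set)
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    contexts[word].add(tokens[j])
        return contexts

    def keyword_score(phrase_tokens, candidate, contexts):
        """Score a candidate keyword by how many phrase words appear in its contexts."""
        return len(contexts.get(candidate, set()) & set(phrase_tokens))

    corpus = "the dog slept on the mat while the cat watched the dog".split()
    contexts = build_context_dict(corpus)
    print(keyword_score("a sleepy dog on a mat".split(), "dog", contexts))  # 2

Built over a genuinely large corpus, the same overlap score lets you rank candidate keywords for a phrase without any deeper parsing, which is the "shallow" appeal.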
YeGoblynQueenne, about 9 years ago

*In a recent interview with Communications, Hinton said his own research on word vectors goes back to the mid-1980s, when he, David Rumelhart, and Ronald Williams published work in Nature that demonstrated family relationships as vectors. "The vectors were only six components long because computers were very small then, but it took a long time for it to catch on," Hinton said.*

Yeah, I know the work he's talking about. It's the one related to this dataset:

https://archive.ics.uci.edu/ml/datasets/Kinship

From that page:

*Creator: Geoff Hinton*

*Donor: J. Ross Quinlan*

*Data Set Information: This relational database consists of 24 unique names in two families (they have equivalent structures). Hinton used one unique output unit for each person and was interested in predicting the following relations: wife, husband, mother, father, daughter, son, sister, brother, aunt, uncle, niece, and nephew. Hinton used 104 input-output vector pairs (from a space of 12x24=288 possible pairs). The prediction task is as follows: given a name and a relation, have the outputs be on for only those individuals (among the 24) that satisfy the relation. The outputs for all other individuals should be off.*

*Hinton's results: Using 100 vectors as input and 4 for testing, his results on two passes yielded 7 correct responses out of 8. His network of 36 input units, 3 layers of hidden units, and 24 output units used 500 sweeps of the training set during training.*

*Quinlan's results: Using FOIL, he repeated the experiment 20 times (rather than Hinton's 2 times). FOIL was correct 78 out of 80 times on the test cases.*

And yet, if you have a wee look at Hinton's publication on Rexa, there are 43 citations, while there's a single one on Quinlan's (from Muggleton, duh).

So, you know, maybe it's not logic and reasoning that's the problem here, rather a certain tendency to drum up results of neural models even when they don't do any better than other techniques.

But, really, it doesn't matter. Google has the airwaves (so to speak). No matter what happens anywhere else, in academia or business, their stuff is going to be publicised the most and that's what we all have to deal with.
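A rough sketch of the network that dataset page describes, assuming PyTorch; the description fixes 36 inputs (24 person units plus 12 relation units), three hidden layers, and 24 outputs, so the hidden-layer widths below are guesses:

    import torch
    import torch.nn as nn

    # 36 inputs: one-hot person code (24) concatenated with one-hot relation code (12).
    # 24 sigmoid outputs: one per person; targets are "on" for every person that
    # satisfies the queried relation. Hidden widths of 12 are assumptions.
    model = nn.Sequential(
        nn.Linear(36, 12), nn.Sigmoid(),  # hidden layer 1
        nn.Linear(12, 12), nn.Sigmoid(),  # hidden layer 2
        nn.Linear(12, 12), nn.Sigmoid(),  # hidden layer 3
        nn.Linear(12, 24), nn.Sigmoid(),  # output: one unit per person
    )

    person = torch.zeros(24); person[0] = 1.0      # one-hot: some person
    relation = torch.zeros(12); relation[3] = 1.0  # one-hot: some relation
    out = model(torch.cat([person, relation]))
    print(out.shape)  # torch.Size([24])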
nkurz, about 9 years ago

*Just as "Hello, World" may be the best-known general programming introductory example, Mikolov, who was then at Microsoft Research, also introduced what fast became a benchmark equation in natural language processing at the 2013 proceedings of the North American Association for Computational Linguistics, the kingman+woman=queen analogy, in which the computer solved the equation spontaneously.*

Since the ACM has professional editors, I was surprised that they would twice misrepresent the example "king – man + woman = queen" as "kingman+woman=queen" in the article. At least they spelled "Hello, World" right, even if they couldn't bring themselves to add the "!".

It looks like the problem is that in the PDF version, the phrase happens to be hyphenated at the "minus sign" in both usages (http://delivery.acm.org/10.1145/2880000/2874915/p13-goth.pdf) [1], although one might hope this is something an editor would have checked.

[1] Looks like the ACM wants you to click on the PDF link yourself, from the "View As" bar in the text version.
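A minimal sketch of the analogy under discussion, assuming gensim and its downloadable pretrained GloVe vectors; the model name and the printed result are assumptions about what these particular vectors give, not claims from the comment:

    import gensim.downloader as api

    vectors = api.load("glove-wiki-gigaword-100")  # pretrained 100-d word vectors

    # The positive/negative lists compute v(king) - v(man) + v(woman) and rank
    # the vocabulary by cosine similarity to the resulting vector.
    result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
    print(result)  # e.g. [('queen', 0.77...)]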
senthil_rajasek, about 9 years ago

Link to the Google cache:

http://webcache.googleusercontent.com/search?q=cache:145V9qmKz2gJ:cacm.acm.org/magazines/2016/3/198856-deep-or-shallow-nlp-is-breaking-out/fulltext+&cd=1&hl=en&ct=clnk&gl=us
YeGoblynQueenne, about 9 years ago

>> reasoning by analogy is the core kind of reasoning we do, and logic is just a sort of superficial thing on top of it that happens much later

Well, for some that's certainly the case.
meeper16, about 9 years ago

More info on vectors: https://www.kaggle.com/c/word2vec-nlp-tutorial/forums/t/12349/word2vec-is-based-on-an-approach-from-lawrence-berkeley-national-lab