MetaMind papers are always pretty awesome. A couple of great highlights from this paper:

The visual saliency maps in Figure 6 are astounding, and they make the model's performance on this task even more impressive: they give a lot of insight into what the model is doing, and it seems to focus on the same things in the image that a regular person would use to decide on the answer. Most striking was the question "is this in the wild?", where the saliency landed on the artificial, human-made structures in the background that indicated the animal was in a zoo. That kind of reasoning is surprising because it takes a bit of a reversal in logic to arrive at this way of answering the question. Super impressive.

The proposed input fusion layer is pretty cool: it lets information from future sentences condition how earlier sentences are processed. This kind of information combining hadn't really been explored before, and it makes sense that it improves performance on the bAbI-10k task so much, since back-tracking is an important tool in human reading comprehension. It's also clever that they encode each sentence as a thought vector before combining them, so the sequence can be processed both forwards and backwards with shared parameters; doing the same over one-hot words or even ordered word embeddings would need two very different parameter sets, since grammar changes wildly when a sentence is reversed. (There's a rough sketch of the idea at the end of this comment.)

Lastly, on a side note: if 2014 was the year of CNNs and 2015 the year of RNNs, it looks like 2016 is the year of Neural Attention Mechanisms. Excited to see what new layers are explored in 2016 that will dominate 2017.
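Since the fusion idea is simple to state, here's a rough sketch of what that bidirectional pass over sentence vectors might look like in PyTorch. This is just my own toy illustration of the mechanism, not the authors' code; the class name, shapes, and dimensions are made up.

    import torch
    import torch.nn as nn

    class InputFusion(nn.Module):
        """Toy bidirectional fusion over per-sentence 'thought vectors'."""
        def __init__(self, hidden_dim):
            super().__init__()
            # A bidirectional GRU run over the *sentence* sequence, so each
            # sentence representation sees the sentences before and after it.
            self.gru = nn.GRU(hidden_dim, hidden_dim,
                              bidirectional=True, batch_first=True)

        def forward(self, sentence_vectors):
            # sentence_vectors: (batch, num_sentences, hidden_dim),
            # i.e. each sentence already collapsed to a single vector.
            out, _ = self.gru(sentence_vectors)   # (batch, num_sentences, 2*hidden_dim)
            fwd, bwd = out.chunk(2, dim=-1)       # split forward / backward directions
            return fwd + bwd                      # one fused "fact" per sentence

    # e.g. 2 stories, 5 sentences each, 64-dim sentence vectors
    facts = InputFusion(64)(torch.randn(2, 5, 64))   # -> (2, 5, 64)

The key point is that the recurrence runs over sentence vectors rather than words, so reading the sequence in reverse is perfectly natural; there's no reversed grammar to learn.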