It's worth noting that, as I learned recently, RNNs are falling somewhat out of fashion: they are hard to parallelize (each step depends on the previous hidden state) and have trouble remembering important information over long distances. Transformers are the proposed alternative; very roughly speaking, they replace recurrent memory with attention mechanisms and can process a whole sequence in parallel.

I have to say that while I understand the problems with recurrent nets (which I've used many times), I haven't yet fully grokked the alternatives. Here are some decent-looking search results as starting points. Fair warning: these are longer, heavier reads and probably not for beginners.

https://towardsdatascience.com/the-fall-of-rnn-lstm-2d1594c74ce0 (there's some sensationalism here, to be fair)

https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/

https://www.analyticsvidhya.com/blog/2019/06/understanding-transformers-nlp-state-of-the-art-models/

https://www.tensorflow.org/beta/tutorials/text/transformer

That said, I think understanding RNNs is still very beneficial conceptually, and nowadays there are relatively easy-to-use implementations that should be good enough for many use cases.
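To make the "attention instead of recurrence" point concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside a transformer. This is my own illustration, not code from the article or any of the links above, and the function and variable names are mine:

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax along the given axis.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def scaled_dot_product_attention(Q, K, V):
        # Q, K, V: (seq_len, d) matrices of queries, keys, and values.
        # Every position attends to every other position via one matrix
        # multiply -- there is no sequential hidden-state loop, which is
        # why this parallelizes where an RNN cannot.
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)       # (seq_len, seq_len) similarities
        weights = softmax(scores, axis=-1)  # each row sums to 1
        return weights @ V                  # weighted sum of values

    # Toy self-attention over 4 positions with 8-dimensional embeddings.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))
    print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)

Note that position 0 can look at position 3 just as easily as at position 1, which is roughly why attention handles long-range dependencies better than a recurrent state that has to carry information along step by step.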
Hey, author here. Happy to answer any questions or take any suggestions.

Runnable code from the article: https://repl.it/@vzhou842/A-RNN-from-scratch
Nice! I like that the author wrote the code by hand rather than leaning on a framework. It makes it a lot easier to connect the math to the code. :)

As a meta-comment on these "Introduction to _____ neural network" articles (not just this one), I wish people would spend more time on when a neural net isn't the right tool for the job. SVMs, kNN, even basic regression techniques are no less effective than they were 20 years ago. They're easier to interpret and debug, require far fewer parameters, and are potentially faster at both training and evaluation time (you may need to apply a trick or two here and there).
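To give a sense of how little code such a baseline takes, here's a hedged sketch using scikit-learn. The toy texts and labels are made up purely for illustration; swap in your own corpus:

    # A minimal classical baseline: TF-IDF features + logistic regression.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical toy data, just to show the API shape.
    texts = ["this is good", "this is bad", "i am happy", "i am sad"]
    labels = [1, 0, 1, 0]

    # Few parameters, fast to train, and the learned per-word
    # coefficients are directly interpretable.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)
    print(model.predict(["this is happy"]))

On a small dataset I'd reach for something like this first, and only move to an RNN if it clearly underperforms.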
This kind of article is absolutely the thing everyone new to deep learning/neural networks should read. I wish there were one for each type of algorithm.
Why do people insist on including the bias terms in expository essays? It's a detail that clutters the equations. Why not keep the transformations linear, and then at the end note that you also need to shift by a bias term?
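For illustration, here's the contrast in LaTeX, using hidden-state notation along the lines of the article's (the exact symbols are my assumption):

    % Keep the transformation linear throughout the exposition:
    \[ h_t = \tanh(W_{xh}\, x_t + W_{hh}\, h_{t-1}) \]
    % ...and only at the end note the shift by a bias term (affine):
    \[ h_t = \tanh(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h) \]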
I doubt Google Translate uses RNNs. They use statistical machine translation. Oops, I see they switched to neural machine translation in 2016.
https://en.wikipedia.org/wiki/Google_Translate