It's worth noting that, as I learned recently, RNNs are falling somewhat out of fashion: they are hard to parallelize (each step depends on the previous hidden state) and have trouble remembering important information over long distances. Transformers are the proposed alternative; very roughly speaking, they replace recurrent memory with attention mechanisms and can process a whole sequence in parallel.

I have to say that while I understand the problems with recurrent nets (which I've used many times), I haven't yet fully grokked the alternatives. Here are some decent-looking search results as starting points. Fair warning: these are longer, heavier reads and probably not for beginners.

https://towardsdatascience.com/the-fall-of-rnn-lstm-2d1594c74ce0 (there's some sensationalism here, to be fair)

https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/

https://www.analyticsvidhya.com/blog/2019/06/understanding-transformers-nlp-state-of-the-art-models/

https://www.tensorflow.org/beta/tutorials/text/transformer

That said, I think understanding RNNs is still very beneficial conceptually, and nowadays there are relatively easy-to-use implementations that should be good enough for many use cases.
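To make the "attention instead of recurrence" point concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside a transformer. This is my own illustration, not code from the article or any of the links above, and the function and variable names are mine:

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax along the given axis.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def scaled_dot_product_attention(Q, K, V):
        # Q, K, V: (seq_len, d) matrices of queries, keys, and values.
        # Every position attends to every other position via one matrix
        # multiply -- there is no sequential hidden-state loop, which is
        # why this parallelizes where an RNN cannot.
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)       # (seq_len, seq_len) similarities
        weights = softmax(scores, axis=-1)  # each row sums to 1
        return weights @ V                  # weighted sum of values

    # Toy self-attention over 4 positions with 8-dimensional embeddings.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))
    print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)

Note that position 0 can look at position 3 just as easily as at position 1, which is roughly why attention handles long-range dependencies better than a recurrent state that has to carry information along step by step.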
Hey, author here. Happy to answer any questions or take any suggestions.

Runnable code from the article: https://repl.it/@vzhou842/A-RNN-from-scratch
Nice! I like that the author wrote the code by hand rather than leaning on a framework. It makes it a lot easier to connect the math to the code. :)

As a meta-comment on these "Introduction to _____ neural network" articles (not just this one), I wish people would spend more time on when a neural net isn't the right tool for the job. SVMs, kNN, even basic regression techniques are no less effective than they were 20 years ago. They're easier to interpret and debug, require far fewer parameters, and are potentially faster at both training and evaluation time (you may need to apply a trick or two here and there).
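To give a sense of how little code such a baseline takes, here's a hedged sketch using scikit-learn. The toy texts and labels are made up purely for illustration; swap in your own corpus:

    # A minimal classical baseline: TF-IDF features + logistic regression.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical toy data, just to show the API shape.
    texts = ["this is good", "this is bad", "i am happy", "i am sad"]
    labels = [1, 0, 1, 0]

    # Few parameters, fast to train, and the learned per-word
    # coefficients are directly interpretable.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)
    print(model.predict(["this is happy"]))

On a small dataset I'd reach for something like this first, and only move to an RNN if it clearly underperforms.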
This kind of article is absolutely the thing everyone new to deep learning/neural networks should read. I wish there were one for each type of algorithm.
Why do people insist on including the bias terms in expository essays? It's a detail that clutters the equations. Why not keep the transformations linear, and then at the end note that you also need to shift by a bias term?
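For illustration, here's the contrast in LaTeX, using hidden-state notation along the lines of the article's (the exact symbols are my assumption):

    % Keep the transformation linear throughout the exposition:
    \[ h_t = \tanh(W_{xh}\, x_t + W_{hh}\, h_{t-1}) \]
    % ...and only at the end note the shift by a bias term (affine):
    \[ h_t = \tanh(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h) \]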
I doubt Google Translate uses RNNs. They use statistical machine translation. Oops, I see they switched to neural machine translation in 2016.
https://en.wikipedia.org/wiki/Google_Translate