"What if the plasticity of the connections was under the control of the network itself, as it seems to be in biological brains through the influence of neuromodulators?"<p>Anyone who wishes to explore this idea would do well to go back to the basics of neural nets and read Warren McCulloch's seminal papers on neural nets, from the 40s:<p><a href="http://www.cse.chalmers.se/~coquand/AUTOMATA/mcp.pdf" rel="nofollow">http://www.cse.chalmers.se/~coquand/AUTOMATA/mcp.pdf</a> A Logical Calculus of the ideas immanent in nervous activity<p><a href="http://vordenker.de/ggphilosophy/mcculloch_heterarchy.pdf" rel="nofollow">http://vordenker.de/ggphilosophy/mcculloch_heterarchy.pdf</a> A heterarchy of values determined by the topology of neural nets<p>(After having read those two papers, one can then try to make sense of Heinz von Förster's masterpiece, <a href="http://www.univie.ac.at/constructivism/archive/fulltexts/1270.pdf" rel="nofollow">http://www.univie.ac.at/constructivism/archive/fulltexts/127...</a>, Objects: Tokens for (Eigen-)Behaviors, which also bears some relevance to this matter. However, most people find it incomprehensible.)
Interesting. Some highlighted links from the writeup:

1. *Differentiable plasticity: training plastic neural networks with backpropagation* (https://arxiv.org/abs/1804.02464)

2. *Born to Learn: the Inspiration, Progress, and Future of Evolved Plastic Artificial Neural Networks* (https://arxiv.org/abs/1703.10371)

3. GitHub for the project: https://github.com/uber-common/differentiable-plasticity

4. *Learning to Learn* (http://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/)

5. Meta-Learning: http://metalearning.ml/
Very cool. It's interesting how powerful the recurrent network becomes with the addition of the learned Hebbian term. For context, even without the Hebbian term, recurrent networks can learn to learn to do quite interesting things (Hochreiter et al. 2001).

Shameless plug -- our lab recently ported LSTMs to spiking networks without a significant loss in performance, and showed that learning to learn works quite well even with spiking networks (Bellec et al. 2018).

So it seems like this method of learning to learn could provide an extremely biologically realistic and fundamental paradigm for fast learning. The addition of the Hebbian term neatly fits in with this paradigm too.

Hochreiter et al. 2001: http://link.springer.com/chapter/10.1007/3-540-44668-0_13

Bellec et al. 2018: https://arxiv.org/abs/1803.09574
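For anyone who wants the core mechanism in code, here is a minimal PyTorch sketch of a recurrent cell with a learned Hebbian term, along the lines of the w + alpha * Hebb formulation the post describes (class name, initializations, and sizes are mine):

    import torch

    class PlasticRNNCell(torch.nn.Module):
        def __init__(self, n):
            super().__init__()
            self.w = torch.nn.Parameter(0.01 * torch.randn(n, n))      # fixed component
            self.alpha = torch.nn.Parameter(0.01 * torch.randn(n, n))  # plasticity gains
            self.eta = torch.nn.Parameter(torch.tensor(0.1))           # trace update rate

        def forward(self, x_prev, hebb):
            # Effective weight = fixed part + learned gain on the running Hebbian trace.
            y = torch.tanh(x_prev @ (self.w + self.alpha * hebb))
            # Decaying Hebbian trace; everything is differentiable, so backprop
            # through an episode trains w, alpha, and eta end to end.
            hebb = (1 - self.eta) * hebb + self.eta * torch.outer(x_prev, y)
            return y, hebb

    cell = PlasticRNNCell(16)
    y, hebb = torch.randn(16), torch.zeros(16, 16)
    for _ in range(5):
        y, hebb = cell(y, hebb)

Note the split: within an episode only hebb changes, while w, alpha, and eta move only under the outer backprop loop, which is exactly the learning-to-learn structure.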
It’d be interesting to compare this approach against a simpler baseline: setting a *different* (10-100x higher?) learning rate for a *fraction* (10%?) of neurons in an LSTM.
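A rough PyTorch version of that baseline (the 10% / 100x numbers are from the comment above; the gradient-hook mechanism is my guess at the simplest implementation):

    import torch

    hidden, boost, frac = 128, 100.0, 0.10
    lstm = torch.nn.LSTM(input_size=32, hidden_size=hidden)

    # Pick a random 10% of hidden units; their weight rows get 100x larger
    # gradients, i.e. an effectively 100x higher learning rate under SGD.
    fast = (torch.rand(hidden) < frac).float()
    # LSTM weights stack the 4 gates row-wise, so repeat the per-unit mask 4 times.
    row_scale = (1.0 + (boost - 1.0) * fast.repeat(4)).unsqueeze(1)

    lstm.weight_ih_l0.register_hook(lambda g: g * row_scale)
    lstm.weight_hh_l0.register_hook(lambda g: g * row_scale)

With plain SGD, scaling a row's gradient by 100 is exactly a 100x per-row learning rate; with Adam it is not (the scale largely cancels in the normalizer), so separate parameter groups would be the cleaner route there.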
Is the plasticity update guaranteed to reach equilibrium, assuming the network is run on iid data? (That is, do the H_ij values reach a fixed point?)

Edit: it seems like equilibrium should be reached eventually, since the fixed point is H_ij = y_i * y_j and the update keeps taking a weighted average of the former with the latter (this is not a proof, of course, as y_i * y_j keeps changing with each sample).
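Spelling that out with standard exponential-moving-average algebra (my own arithmetic, not something from the paper):

    H_{ij}(t+1) = (1-\eta)\,H_{ij}(t) + \eta\,y_i(t)\,y_j(t)
    \Rightarrow\ H_{ij}(t) = (1-\eta)^t H_{ij}(0)
                 + \eta \sum_{k=0}^{t-1} (1-\eta)^{t-1-k}\, y_i(k)\,y_j(k)

So with iid data, E[H_ij(t)] converges geometrically to E[y_i y_j], but with a constant eta the trace never settles exactly: it keeps fluctuating around that point with variance on the order of eta. A true fixed point would require eta to decay over time, Robbins-Monro style.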
So the "plastic component" of a connection strength is a thing which decays away exponentially, but is replenished whenever the two endpoints do the same thing.<p>I have heard that neuroscientists have an adage "fire together wire together". Is that all that ML people mean by "plasticity".
Good luck with getting even more suboptimal solutions with this extra nonlinearity.

No wonder that when your autonomous cars plow into people or walls, you have no clue what is going on.