Hmm, might be just me, but this feels like a refresher for people who already understand neural networks and transformers, so it will probably go over most devs' heads. I've had better luck with the fastai course, which is a series of YouTube videos; the pace is slower, but everything is explained well without assuming much prior knowledge.
One nice thing about the 1986 Hinton paper was that he described the equations very explicitly, in a way that even a math dummy like me could implement them.

https://github.com/runvnc/mlp/blob/master/neuralnetwork.cpp

https://github.com/runvnc/nnpapers/blob/master/hinton86.pdf

This article is also a very good explanation.
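For anyone who wants the flavor of those equations without digging into the paper or the repo, here is a rough, self-contained sketch of the generalized delta rule: one hidden layer of sigmoid units learning XOR by gradient descent. The layer sizes, seed, and learning rate are illustrative choices of mine, not values from the paper.

    // Rough sketch of the 1986 "generalized delta rule": one hidden layer of
    // sigmoid units trained on XOR by stochastic gradient descent. Sizes,
    // seed, and learning rate here are illustrative, not from the paper.
    #include <cmath>
    #include <cstdio>
    #include <cstdlib>

    const int NI = 2, NH = 3, NO = 1;       // input/hidden/output sizes
    double wh[NH][NI + 1], wo[NO][NH + 1];  // last column holds the bias weight

    double sig(double x) { return 1.0 / (1.0 + exp(-x)); }
    double rnd() { return rand() / (double)RAND_MAX - 0.5; }

    void forward(const double *x, double *h, double *o) {
        for (int j = 0; j < NH; ++j) {
            double net = wh[j][NI];  // bias
            for (int i = 0; i < NI; ++i) net += wh[j][i] * x[i];
            h[j] = sig(net);
        }
        for (int k = 0; k < NO; ++k) {
            double net = wo[k][NH];  // bias
            for (int j = 0; j < NH; ++j) net += wo[k][j] * h[j];
            o[k] = sig(net);
        }
    }

    int main() {
        srand(1);
        for (int j = 0; j < NH; ++j) for (int i = 0; i <= NI; ++i) wh[j][i] = rnd();
        for (int k = 0; k < NO; ++k) for (int j = 0; j <= NH; ++j) wo[k][j] = rnd();

        const double X[4][2] = {{0,0},{0,1},{1,0},{1,1}}, T[4] = {0,1,1,0};
        const double lr = 0.5;
        double h[NH], o[NO];

        for (int epoch = 0; epoch < 20000; ++epoch)
            for (int p = 0; p < 4; ++p) {
                forward(X[p], h, o);
                // Output deltas: error times the sigmoid derivative o(1-o).
                double dout[NO], dhid[NH];
                for (int k = 0; k < NO; ++k)
                    dout[k] = (T[p] - o[k]) * o[k] * (1 - o[k]);
                // Hidden deltas: backpropagate output deltas through wo.
                for (int j = 0; j < NH; ++j) {
                    double s = 0;
                    for (int k = 0; k < NO; ++k) s += dout[k] * wo[k][j];
                    dhid[j] = s * h[j] * (1 - h[j]);
                }
                // Weight updates: w += lr * delta * upstream activation.
                for (int k = 0; k < NO; ++k) {
                    for (int j = 0; j < NH; ++j) wo[k][j] += lr * dout[k] * h[j];
                    wo[k][NH] += lr * dout[k];
                }
                for (int j = 0; j < NH; ++j) {
                    for (int i = 0; i < NI; ++i) wh[j][i] += lr * dhid[j] * X[p][i];
                    wh[j][NI] += lr * dhid[j];
                }
            }

        for (int p = 0; p < 4; ++p) {
            forward(X[p], h, o);
            printf("%g xor %g -> %.3f\n", X[p][0], X[p][1], o[0]);
        }
    }

The two delta computations are the whole trick; everything else is bookkeeping around the w += lr * delta * activation update.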
I think the original title was better: "Bridging the gap between neural networks and functions."

It discusses the standard backpropagation optimization method in differential form and the functional approximation view of neural networks, but as far as I could tell it doesn't discuss transformers at all. I think the code might help some readers understand the implementation, but so much is now done on accelerators that it doesn't really capture real implementations.