> Much of the magic inside of neural network libraries has less to do with cleverer algorithms and more to do with vectorized SIMD instructions and/or being parsimonious with GPU memory usage and communication back and forth with main memory.

I mean… that’s not *really* fair, is it?

We’ve been able to build NN libraries for 30 years, but it’s the transformer architecture on top of them, and the stacked layers forming a coherent network, that are the complex parts, right?

Implement Stable Diffusion in Clojure (the Python code for it is all open source) and you quickly see that there is a lot of complexity once *you’re doing something useful* that the primitive operations don’t support.

It’s not really any different from OpenCV: basic matrix operations underneath, then paper-by-paper implementations of the various algorithms on top.

Building a basic pixel-matrix library in Clojure wouldn’t give you an equivalent of OpenCV either.

Is there really a clear, meaningful bridge between building low-level operations and building high-level functions out of them?

When you implement sqrt (a toy version is sketched below), you’ve learnt a thing… but it doesn’t help you build a rendering engine.

Hasn’t this always been the problem with learning ML “from scratch”?

You start with the basic operations, do MNIST… and then… uh, well, no. Now you clone a Python repo that implements the paper you want to work on and modify it, because implementing it from scratch with your own primitives isn’t really possible.
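
To make the sqrt point concrete, this is the kind of thing I mean by “implementing a primitive” — a few lines of Newton’s method in Clojure. It’s just a toy sketch (the name `my-sqrt` is made up, not from any library), and it teaches you something, but it’s nowhere on the path to a rendering engine:

```clojure
;; Toy "implement a primitive from scratch" exercise: Newton's method for sqrt.
;; Iterates y <- (y + x/y) / 2 until the estimate stops changing.
(defn my-sqrt
  "Approximate the square root of x (x >= 0)."
  [x]
  (loop [y (max x 1.0)]
    (let [y-next (/ (+ y (/ x y)) 2.0)]
      (if (< (Math/abs (- y y-next)) 1e-12)
        y-next
        (recur y-next)))))

(my-sqrt 2.0) ;; => 1.4142135623730951
```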