I have watched many YouTube videos on this, and no one seems able to explain it properly.<p>The explanations also differ dramatically from one another.<p>I feel like it's another infamously difficult-to-explain topic, like "monads".<p>I am desperately waiting for a 3Blue1Brown video on transformers to hopefully resolve this ambiguity.<p>I am looking for a visual intuition: something that answers the common questions and ambiguities that arise, and explains the history and why we do things this way.<p>The best explanation I have found so far is Serrano.Academy https://www.youtube.com/watch?v=UPtG_38Oq8o&pp=ygUUdHJhbnNmb3JtZXIgbmV0d29ya3M%3D. They try to visualize things in 2 dimensions with examples and show the linear transformations.<p>Karpathy had a unique way of conceptualizing it as a directed graph with a "communication phase", which confused me further.<p>For such a historic topic, I think we need a better explanation!