The original is quite long, but quite interesting.[1] Reading it makes me feel like I did reading A Brief History of Time as a middle schooler: concepts mostly just out of reach, with a few flashes I actually understand.

One particularly interesting part is the "theories of superposition" section, which gets into how LLMs represent concepts. Are concepts all distinct or indistinct? Are they independent, or do they cluster? It seems the answer is all of the above. (A toy sketch of the superposition idea appears after the footnotes.)

This ties into linguistic theories of categorization[2] that I saw referenced in (of all places) a book about the partition of Judaeo-Christianity in the first centuries CE.

Some categories have hard lines: something is a "bird" or it is not. Some categories have soft lines, like someone being "tall." Some categories work from prototypes, which gives them different intensities within the space: a sparrow, swallow, or robin is more "birdy" than a chicken, emu, or turkey. According to people who study these things, Wittgenstein, with his notion of family resemblances, was the first to really explore the idea that a category might not have hard boundaries.[3] Similar structures seem to appear in LLMs: "manifolds" where a concept is a graded region rather than a single point that a thing either is or isn't. (A toy sketch of prototype-style membership is also below.)

It's exciting to see that LLMs may give us insights into how our brains store concepts. I've heard people criticize them as "just predicting the next most likely token," but many times while speaking I've found myself lost in the middle of a garden-path sentence. I don't know how a sentence will end before I start saying it, so it's certainly plausible that LLMs actually do match the way we speak.

Probably the most exciting piece is seeing how close they get to mimicking how we communicate and think while being limited entirely to language, with no other modeling behind it: no concept of the physical world, no understanding of counting or math, just words. It's clear when you scratch the surface that LLM outputs are bullshit with no thought underneath them, but it's amazing how much is covered by linking concepts with no logic other than how you've heard them linked before.

[1] https://transformer-circuits.pub/2023/monosemantic-features/index.html

[2] https://www.sciencedirect.com/science/article/abs/pii/001002857690013X

[3] https://en.wikipedia.org/wiki/Family_resemblance
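
As promised, a toy sketch of the superposition idea (my own illustration, not code from the paper): with sparse activations you can pack far more feature directions than dimensions into a vector space, because random high-dimensional directions are nearly orthogonal and interfere only weakly.

    import numpy as np

    rng = np.random.default_rng(0)
    n_dims, n_features = 256, 1024  # far more features than dimensions

    # One random unit-norm direction per feature; in high dimensions
    # these are nearly orthogonal, so features interfere only weakly.
    directions = rng.normal(size=(n_features, n_dims))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    # Activate a sparse subset of features; sparsity is what makes
    # superposition workable.
    active = rng.choice(n_features, size=5, replace=False)
    x = directions[active].sum(axis=0)

    # Read each feature back out with a dot product: active features
    # score near 1, inactive ones near 0 plus small interference noise.
    scores = directions @ x
    recovered = np.argsort(scores)[-5:]
    print(sorted(active.tolist()) == sorted(recovered.tolist()))  # usually True

The sparsity is doing the work here: if many features were active at once, the accumulated interference would swamp the signal.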
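
And a toy sketch of prototype-style graded membership, with made-up vectors standing in for embeddings a real model would learn: membership is similarity to a prototype rather than a hard yes/no.

    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Hypothetical feature vectors (flies, sings, small, heavy),
    # invented just to make graded membership concrete.
    prototype_bird = np.array([1.0, 1.0, 1.0, 0.0])
    sparrow        = np.array([1.0, 0.9, 0.9, 0.1])
    emu            = np.array([0.0, 0.1, 0.0, 1.0])

    print(cosine(sparrow, prototype_bird))  # ~1.0: very "birdy"
    print(cosine(emu, prototype_bird))      # ~0.06: a bird, but far from the prototype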