Off topic: can anyone tell me what the author has used to convert their paper into this beautiful webpage? It's apparently a trend with ML papers to have their own website.
If you're reading this, you may be interested in my other work on "Exploring Llama-3 MLP Neurons" as well: https://neuralblog.github.io/llama3-neurons/
Tangent: the screen reader recording is cool. The main issue for me is that it doesn't know where I am on the page, so it starts from the beginning and I have to guess the position of the relevant audio from how far down the page I am. It would be nice to have bookmarks that match both the text and the audio, so that I can skip both to the same section at the same time.
Okay, so what stood out to me here is the math work. If we can identify the parts of the model that are doing the math, can they be "augmented" to generalise? Could you take over the math neurons and either optimise them by hand or plug in a software "implant" to do the work properly?