As a side note, I looked at Gadersd's implementation of Llama 2 a while back [1] and was surprised by how clean the code is [2].<p>[1]: <a href="https://github.com/Gadersd/llama2-burn/">https://github.com/Gadersd/llama2-burn/</a><p>[2]: <a href="https://github.com/Gadersd/llama2-burn/blob/main/src/model.rs">https://github.com/Gadersd/llama2-burn/blob/main/src/model.r...</a>