So SpinQuant learns a rotation for activations and weights that, to my understanding, "smears" the outlier weights out so you don't get extreme values in any one weight.

Random anecdote warning: in the old days, before vector search became AI and everyone and their dog offered a vector database, I had a task that required nearest-neighbour search over a decent number of high-dimensional vectors.

I tried quantizing them to bit vectors in an index and scanning through it to get an initial set of candidates.
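Roughly what that looked like, as a toy numpy sketch rather than the original code (the 100k database size and the exact-rerank step are just assumptions for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    db = rng.standard_normal((100_000, 128)).astype(np.float32)  # indexed vectors (size made up)
    query = rng.standard_normal(128).astype(np.float32)

    def to_bits(x):
        # sign-quantize each component and pack 128 dims into 16 bytes
        return np.packbits((x > 0).astype(np.uint8), axis=-1)

    db_bits = to_bits(db)   # (100_000, 16) uint8, scans quickly out of RAM
    q_bits = to_bits(query)

    # Hamming distance = popcount of the XOR over the packed bits
    hamming = np.unpackbits(db_bits ^ q_bits, axis=-1).sum(axis=1)
    candidates = np.argsort(hamming)[:100]  # initial candidate set

    # rerank the candidates exactly with the original float vectors
    exact = np.linalg.norm(db[candidates] - query, axis=1)
    best = candidates[np.argmin(exact)]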
Performance was actually quite decent (reading through RAM linearly is fast!), but the selectivity wasn't great.

Somewhere along the way I found this paper [1] that iteratively finds a rotation to apply before quantization to reduce the quantization error. Very similar goal to SpinQuant, but focused on bit quantization only.

As it turns out, the "random rotation" baseline they benchmark against worked great for my use case, so I never tried implementing the fancier algorithm. But it's a pretty rare day at work when "apply a random rotation matrix to a 128-dimensional vector" is the solution to my problem.

[1] Iterative Quantization (ITQ): https://ieeexplore.ieee.org/abstract/document/6296665 / https://slazebni.cs.illinois.edu/publications/ITQ.pdf
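The random-rotation baseline itself is tiny. Something along these lines (my reconstruction, not the paper's code; sampling the orthogonal matrix via QR of a Gaussian matrix is just one common way to do it):

    import numpy as np

    rng = np.random.default_rng(0)

    def random_rotation(dim, rng):
        # sample a random orthogonal matrix: QR-decompose a Gaussian matrix
        # and fix the signs so the result is uniformly distributed
        q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
        return q * np.sign(np.diag(r))

    R = random_rotation(128, rng)

    def to_bits_rotated(x, R):
        # rotate first, then sign-quantize; the rotation spreads the
        # quantization error more evenly across the 128 dimensions
        return np.packbits((x @ R > 0).astype(np.uint8), axis=-1)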