From the leaked Google document:

'LoRA is an incredibly powerful technique we should probably be paying more attention to

LoRA works by representing model updates as low-rank factorizations, which reduces the size of the update matrices by a factor of up to several thousand. This allows model fine-tuning at a fraction of the cost and time. Being able to personalize a language model in a few hours on consumer hardware is a big deal, particularly for aspirations that involve incorporating new and diverse knowledge in near real-time. The fact that this technology exists is underexploited inside Google, even though it directly impacts some of our most ambitious projects.' [1]

[1] https://www.semianalysis.com/p/google-we-have-no-moat-and-neither
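
For anyone unfamiliar with the mechanics, here is a minimal NumPy sketch of the idea (the shapes, rank, and names are illustrative, not from the document): instead of training a full d_out x d_in update to a frozen weight matrix W, you train two small factors B and A whose product B @ A is the update.

```python
import numpy as np

# A frozen pretrained weight matrix (shape chosen for illustration).
d_out, d_in = 4096, 4096
W = np.random.randn(d_out, d_in).astype(np.float32)

# LoRA: learn two small factors A (r x d_in) and B (d_out x r), rank r << d_in, d_out.
r = 8
A = (np.random.randn(r, d_in) * 0.01).astype(np.float32)
B = np.zeros((d_out, r), dtype=np.float32)  # B starts at zero so the update is initially a no-op

def forward(x):
    # Effective weight is W + B @ A, but the full-size update is never materialized;
    # only A and B are trained, W stays frozen.
    return x @ W.T + (x @ A.T) @ B.T

full_params = d_out * d_in        # parameters in a full update of this matrix
lora_params = r * (d_in + d_out)  # parameters in the low-rank factors
print(f"update parameters: {lora_params:,} vs {full_params:,} "
      f"(~{full_params / lora_params:.0f}x smaller)")
```

With these toy numbers the update is ~256x smaller; with lower rank and larger layers the reduction reaches the "several thousand" range the document mentions, which is why fine-tuning fits on consumer hardware.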