TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

© 2025 TechEcho. All rights reserved.

LoRA from scratch: implementation for LLM finetuning

339 points | by rasbt | over 1 year ago

17 comments

ignoramous | over 1 year ago
I've been keeping track of the techniques through Maxime Labonne's LLMs 101: https://github.com/mlabonne/llm-course#4-supervised-fine-tuning
denysvitali | over 1 year ago
LoRA != LoRa. I keep on getting confused and hate that they chose to reuse an existing acronym.
rsweeney21 | over 1 year ago
It's still strange to me to work in a field of computer science where we say things like "we're not exactly sure how these numbers (hyperparameters) affect the result, so just try a bunch of different values and see which one works best."
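For what it's worth, the "try a bunch of values and keep the best" approach the comment describes is exactly what a basic hyperparameter grid search does. A toy sketch (the `evaluate` function here is a made-up stand-in for an actual train-and-validate run, and the grid values are arbitrary):

```python
import itertools

# Hypothetical stand-in for "fine-tune with these settings and
# measure validation quality"; in reality this is the expensive,
# hard-to-predict step the comment is talking about.
def evaluate(lr, rank):
    return -(lr - 3e-4) ** 2 - 0.01 * (rank - 8) ** 2

grid = {
    "lr": [1e-4, 3e-4, 1e-3],
    "rank": [4, 8, 16],
}

best_score, best_cfg = float("-inf"), None
for lr, rank in itertools.product(grid["lr"], grid["rank"]):
    score = evaluate(lr, rank)
    if score > best_score:
        best_score, best_cfg = score, (lr, rank)

print(best_cfg)  # the combination that scored best on this toy objective
```

In practice people usually swap the exhaustive loop for random search or Bayesian optimization once the grid gets large, since each `evaluate` call is a full training run.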
chenxi9649 | over 1 year ago
It's still not too clear to me when we should fine-tune versus use RAG.

In the past, I used to believe that fine-tuning was mostly for changing model behavior, but recently it seems that certain companies are also using fine-tuning for knowledge addition.

What are the main use cases for fine-tuning?
somethingsome | over 1 year ago
Nice article. I'm not in this field; however, my understanding of the original paper was that LoRA was applied only to the last dense layer, and not to all layers independently (maybe I misread it originally).

Digging a bit into why the implementation in the link is like this, I found that QLoRA used this approach and it seems to have some interesting effects; maybe adding a note on the QLoRA decision would be nice :)

I'm not sure I understand why it works, though. My neophyte view was that applying LoRA to the last layer made sense, but I can't wrap my mind around the rationale of applying it repeatedly to each linear layer. Can someone explain their intuition?
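For intuition, the per-layer scheme can be sketched in a few lines. This is a hypothetical NumPy-only illustration (not the article's PyTorch code): each adapted linear layer gets its own small pair of matrices A and B, the frozen weight W is never touched, and because B starts at zero the adapter is a no-op until training moves it.

```python
import numpy as np

class LoRALinear:
    """Minimal sketch of one LoRA-augmented linear layer.

    Forward pass: y = x @ W.T + (x @ A.T) @ B.T * (alpha / r)
    Only A and B would be trained; W stays frozen. Applying LoRA
    "to each linear layer" just means every nn.Linear gets its
    own (A, B) pair like this one.
    """

    def __init__(self, in_features, out_features, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen pretrained weight (random here for the sketch)
        self.W = rng.standard_normal((out_features, in_features))
        # A is small random, B is zero, so B @ A == 0 at init
        # (this matches the initialization in the LoRA paper)
        self.A = rng.standard_normal((r, in_features)) * 0.01
        self.B = np.zeros((out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return x @ self.W.T + (x @ self.A.T) @ self.B.T * self.scale

layer = LoRALinear(in_features=16, out_features=8)
x = np.ones((2, 16))
out = layer.forward(x)
# B starts at zero, so initially the output equals the frozen layer's
assert np.allclose(out, x @ layer.W.T)
```

The rough intuition for adapting every layer rather than just the last one: the useful weight *update* at each layer tends to be low-rank, and letting all layers shift a little gives the model more capacity to adapt than letting one layer shift a lot.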
jamesblonde | over 1 year ago
I prefer the not-from-scratch, but from-configuration approach of Axolotl. Axolotl supports fine-tuning Mistral and Llama 2 with lots of the latest techniques: sample packing, flash attention, xformers.

I concentrate on collecting and curating the fine-tuning data, doing "data-centric" fine-tuning, not learning LoRA from scratch.
yandrypozo | over 1 year ago
Gotta say, naming is hard. I thought this was about LoRa (from "long range") or LoRaWAN, the IoT sensor communication protocol.
helloericsf | over 1 year ago
HN friends, what are the most popular libraries for fine-tuning? (Not from scratch.)
broabprobe | over 1 year ago
wow definitely thought this was about LoRa at first.
facu17y | over 1 year ago
What's the performance penalty of LoRA?
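Not an answer from the thread, but a commonly cited point worth noting here: the extra matmuls only cost something during fine-tuning, because after training the low-rank update can be folded into the frozen weight, leaving inference identical in cost to the base model. A NumPy sketch of the algebra (all names and shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 16, 4, 8
W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in))       # trained LoRA factor
B = rng.standard_normal((d_out, r))      # trained LoRA factor
scale = alpha / r

x = rng.standard_normal((3, d_in))

# During training, LoRA adds two extra (cheap, low-rank) matmuls:
y_unmerged = x @ W.T + (x @ A.T) @ B.T * scale

# For deployment, the update can be merged into W once, after which
# the layer is just an ordinary dense layer again:
W_merged = W + (B @ A) * scale
y_merged = x @ W_merged.T

assert np.allclose(y_unmerged, y_merged)
```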
huqedato | over 1 year ago
Excellent and practical example! I'm curious if there's a comparable one using Julia or JavaScript.
fnordfnordfnord | over 1 year ago
I thought this was going to be some neat software-defined radio stuff. Still quite interesting though.
tussa | over 1 year ago
It's cheap and sleazy to steal a name from another project to ride its fame.
andy99 | over 1 year ago
"From scratch" seems to be a matter of opinion. "Pure PyTorch" maybe, except it uses HF transformers. So it's LoRA on top of common frameworks...
dymk | over 1 year ago
Not to be confused with LoRa ("long range"), a radio communication protocol. At first I thought this could be about using LLMs to find optimal protocol parameters, but alas.
ijhuygft776 | over 1 year ago
I wish the wireless LoRa protocol would be open source...
gourabmi | over 1 year ago
Someone somewhere is already working on naming their project Lehsun... /s