科技回声 (Tech Echo)

A technology news platform built with Next.js, providing global tech news and discussion.


LoRA from scratch: implementation for LLM finetuning

339 points · by rasbt · over 1 year ago

17 comments

ignoramous · over 1 year ago
I've been keeping track of the techniques through Maxime Labonne's LLMs 101: https://github.com/mlabonne/llm-course#4-supervised-fine-tuning
denysvitali · over 1 year ago
LoRA != LoRa. I keep getting confused and hate that they chose to reuse an existing acronym.
rsweeney21 · over 1 year ago
It's still strange to me to work in a field of computer science where we say things like "we're not exactly sure how these numbers (hyperparameters) affect the result, so just try a bunch of different values and see which one works best."
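The trial-and-error process described above is, in practice, often literally how LoRA hyperparameters such as the rank and alpha are chosen: random search over a small grid, keeping whichever setting scores best on held-out data. A minimal sketch of that idea (the `validation_loss` function here is an illustrative stand-in for an expensive fine-tuning run, not anything from the article):

```python
import random

random.seed(42)

def validation_loss(rank: int, alpha: int) -> float:
    """Stand-in for an expensive fine-tuning run; in reality you would
    train a LoRA model with these settings and measure held-out loss."""
    return abs(rank - 16) * 0.05 + abs(alpha / rank - 2.0) * 0.1

# "Try a bunch of different values and see which one works best",
# i.e. random search over candidate (rank, alpha) pairs.
trials = [(random.choice([4, 8, 16, 32]), random.choice([8, 16, 32, 64]))
          for _ in range(20)]
best = min(trials, key=lambda t: validation_loss(*t))
print("best (rank, alpha):", best)
```

Real searches differ only in that each evaluation is a full training run, which is why so few configurations are ever actually tried.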
chenxi9649 · over 1 year ago
It's still not entirely clear to me when we should fine-tune versus use RAG.

In the past, I believed that fine-tuning was mostly for changing model behavior, but recently it seems that certain companies are also using fine-tuning for knowledge addition.

What are the main use cases for fine-tuning?
somethingsome · over 1 year ago
Nice article. I'm not in this field; however, my understanding of the original paper was that LoRA was applied only to the last dense layer, and not to all layers independently (maybe I misread it originally).

Digging a bit into why the implementation in the link is like this, I found that QLoRA used this approach and it seems to have some interesting effects; maybe adding a note on the QLoRA decision would be nice :)

I'm not sure I understand why it works, though. My neophyte view was that applying LoRA to the last layer made sense, but I can't wrap my mind around the rationale of applying it repeatedly to each linear layer. Can someone explain the intuition?
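The per-layer scheme being discussed is easier to see in code: each linear layer keeps its frozen pretrained weight and gains a small trainable low-rank correction, so the adapters act throughout the network rather than only at the output. A minimal sketch of the idea (this is an illustration of the general technique, not the article's actual implementation; names like `LoRALinear` are made up here):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update.

    The effective weight is W + (alpha / r) * B @ A, where A is
    (r, in_features) and B is (out_features, r), with r small.
    """

    def __init__(self, linear: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad_(False)  # freeze the pretrained weight (and bias)
        self.lora_a = nn.Parameter(torch.randn(r, linear.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(linear.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # Frozen path plus low-rank correction. B starts at zero, so the
        # wrapped model initially behaves exactly like the base model.
        return self.linear(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

# Wrapping *every* linear layer, per the approach discussed above:
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
for name, module in model.named_children():
    if isinstance(module, nn.Linear):
        setattr(model, name, LoRALinear(module))
```

Only the `lora_a`/`lora_b` tensors receive gradients, which is why adapting every layer is still cheap: the trainable parameter count grows with `r`, not with the full weight matrices.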
jamesblonde · over 1 year ago
I prefer the not-from-scratch, configuration-driven approach of Axolotl. Axolotl supports fine-tuning Mistral, Llama-2, etc., with lots of the latest techniques: sample packing, flash attention, xformers.

I concentrate on collecting and curating the fine-tuning data and do "data-centric" fine-tuning, not learning LoRA from scratch.
yandrypozo · over 1 year ago
Gotta say, naming is hard. I thought this was about LoRa (from "long range") or LoRaWAN, the IoT sensor communication protocol.
helloericsf · over 1 year ago
HN friends, what are the most popular libraries for fine-tuning? (Not from scratch.)
broabprobe · over 1 year ago
Wow, definitely thought this was about LoRa at first.
facu17y · over 1 year ago
What's the performance penalty of LoRA?
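On the question of performance penalty: during training the adapter adds one extra low-rank matmul per wrapped layer, but at inference time the penalty can be eliminated entirely, because the learned update can be merged back into the frozen weight. A hedged sketch of the merge (variable names here are illustrative, not from the article):

```python
import torch

torch.manual_seed(0)
out_f, in_f, r, alpha = 64, 32, 8, 16

W = torch.randn(out_f, in_f)      # frozen pretrained weight
A = torch.randn(r, in_f) * 0.01   # trained LoRA factors
B = torch.randn(out_f, r) * 0.01

x = torch.randn(4, in_f)

# During training the adapter runs as an extra low-rank matmul...
y_adapter = x @ W.T + (alpha / r) * (x @ A.T @ B.T)

# ...but for deployment the update is folded into W once, so inference
# runs at exactly the base model's cost and architecture:
W_merged = W + (alpha / r) * (B @ A)
y_merged = x @ W_merged.T

assert torch.allclose(y_adapter, y_merged, atol=1e-5)
```

The identity holds because `x @ (B @ A).T == x @ A.T @ B.T`; after merging, the deployed model is indistinguishable in cost from the original.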
huqedato · over 1 year ago
Excellent and practical example! I'm curious whether there's a comparable one in Julia or JavaScript.
fnordfnordfnord · over 1 year ago
I thought this was going to be some neat software-defined radio stuff. Still quite interesting, though.
tussa · over 1 year ago
It's cheap and sleazy to steal a name from another project to ride its fame.
andy99 · over 1 year ago
"From scratch" seems to be a matter of opinion. "Pure PyTorch," maybe, except it uses HF transformers. So it's LoRA on top of common frameworks...
dymk · over 1 year ago
Not to be confused with LoRa ("long range"), a radio communication protocol. At first I thought this could be about using LLMs to find optimal protocol parameters, but alas.
ijhuygft776 · over 1 year ago
I wish the wireless LoRa protocol were open source...
gourabmi · over 1 year ago
Someone somewhere is already working on naming their project Lehsun... /s