The other recent improvement suggested for LoRA is DoRA: <a href="https://magazine.sebastianraschka.com/p/lora-and-dora-from-scratch" rel="nofollow">https://magazine.sebastianraschka.com/p/lora-and-dora-from-s...</a>. It really does seem to strongly outperform LoRA - see also <a href="https://www.answer.ai/posts/2024-04-26-fsdp-qdora-llama3.html" rel="nofollow">https://www.answer.ai/posts/2024-04-26-fsdp-qdora-llama3.htm...</a>
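For context, the core idea in those posts is to decompose each pretrained weight into a magnitude and a direction, apply the LoRA update only to the direction, and learn the magnitude separately. Here's a rough sketch of that idea in PyTorch (illustrative only; the class name and hyperparameters are mine, not taken from the linked posts):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DoRALinear(nn.Module):
        """Sketch of a DoRA-style layer: frozen base weight, LoRA update
        applied to the direction, learned per-column magnitude."""
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
            self.bias = (nn.Parameter(base.bias.detach().clone(), requires_grad=False)
                         if base.bias is not None else None)
            out_f, in_f = self.weight.shape
            self.lora_A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # small random init
            self.lora_B = nn.Parameter(torch.zeros(out_f, rank))        # zero init => no change at start
            self.scaling = alpha / rank
            # magnitude starts as the column-wise norm of the pretrained weight
            self.magnitude = nn.Parameter(self.weight.norm(p=2, dim=0, keepdim=True))

        def forward(self, x):
            combined = self.weight + self.scaling * (self.lora_B @ self.lora_A)
            # normalize columns to get the direction, then rescale by the learned magnitude
            direction = combined / combined.norm(p=2, dim=0, keepdim=True)
            return F.linear(x, self.magnitude * direction, self.bias)

The extra trainable state over plain LoRA is just the magnitude vector plus the normalization in the forward pass, which is presumably why it drops into QLoRA-style tooling so easily, as in the second link.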
I’m struggling to understand from this paper whether the approach is better in the general sense (all cases, with wider models seeing greater benefits) or only for wider models (with narrower models seeing a detriment)?

If it’s the former, this could effectively halve finetuning cost overnight, which would go a significant way towards enabling a wider array of use cases for LoRA.
I've had success with GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection <a href="https://arxiv.org/abs/2403.03507" rel="nofollow">https://arxiv.org/abs/2403.03507</a>
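The trick there, as I understand it, is that the weights stay full-rank but the gradients (and the Adam moments) get projected into a low-rank subspace that is refreshed periodically via SVD. Roughly, for a single 2D weight (a sketch of the idea only, not the paper's reference implementation; hyperparameter names here are illustrative):

    import torch

    def galore_adam_step(weight, grad, state, lr=1e-4, rank=4,
                         update_proj_gap=200, scale=0.25,
                         beta1=0.9, beta2=0.999, eps=1e-8):
        # One GaLore-style step for a single 2D weight: Adam moments
        # live in a rank-r subspace of the gradient (sketch only).
        step = state.get("step", 0)
        if "P" not in state or step % update_proj_gap == 0:
            # refresh the projection from the top-r left singular vectors of the gradient
            U, _, _ = torch.linalg.svd(grad, full_matrices=False)
            state["P"] = U[:, :rank]                                   # (m, r)
            state.setdefault("m1", grad.new_zeros(rank, grad.shape[1]))
            state.setdefault("m2", grad.new_zeros(rank, grad.shape[1]))
        P = state["P"]
        g_low = P.T @ grad                                             # (r, n) projected gradient
        # standard Adam moments, but on the small projected gradient
        state["m1"] = beta1 * state["m1"] + (1 - beta1) * g_low
        state["m2"] = beta2 * state["m2"] + (1 - beta2) * g_low.pow(2)
        m_hat = state["m1"] / (1 - beta1 ** (step + 1))
        v_hat = state["m2"] / (1 - beta2 ** (step + 1))
        # project the normalized update back to full size and apply it
        weight.data.add_(P @ (m_hat / (v_hat.sqrt() + eps)), alpha=-lr * scale)
        state["step"] = step + 1

Memory-wise the win in this sketch is that the optimizer state shrinks from two m-by-n moment matrices to two r-by-n ones plus an m-by-r projection.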
What an unfortunate name... I initially thought this was about wireless communication. <a href="https://en.wikipedia.org/wiki/LoRa" rel="nofollow">https://en.wikipedia.org/wiki/LoRa</a>