
Llama 3.1 in C

212 points by AMICABoard · 10 months ago

6 comments

danielhanchen · 10 months ago
Oh this is super cool! I think maybe the new RoPE scaling method Llama 3.1 uses isn't yet added in? It's some weird one-time scaling mechanism found by a grid search to enable 128K context. Essentially the model was trained on 15.6T tokens at 8K context, then iteratively extended to 128K context with 800B tokens.

Can open a PR if people want :) [Edit: Just opened a PR! Apologies, my C is very rusty! https://github.com/trholding/llama2.c/pull/14]

https://github.com/trholding/llama2.c/blob/master/runq.c#L657 needs to be scaled with some weird formula like in https://github.com/unslothai/unsloth/blob/main/unsloth/models/llama.py#L1116
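The one-time rescaling described above can be sketched in C. This is a hedged reconstruction from the constants Meta published for Llama 3.1 (scale factor 8, low/high frequency factors 1 and 4, original context length 8192); the function name is illustrative, not something from runq.c:

```c
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Sketch of the Llama 3.1 one-time RoPE frequency rescaling.
 * Constants are from Meta's released 3.1 config; the function
 * name is mine, not from runq.c. */
float llama31_scale_freq(float freq) {
    const float factor = 8.0f;             /* overall context extension */
    const float low_freq_factor = 1.0f;
    const float high_freq_factor = 4.0f;
    const float old_context_len = 8192.0f; /* pre-extension context */

    const float low_freq_wavelen = old_context_len / low_freq_factor;
    const float high_freq_wavelen = old_context_len / high_freq_factor;
    float wavelen = 2.0f * (float)M_PI / freq;

    if (wavelen < high_freq_wavelen) {
        return freq;                       /* high-frequency band: untouched */
    } else if (wavelen > low_freq_wavelen) {
        return freq / factor;              /* low-frequency band: shrink by 8 */
    } else {
        /* mid band: smoothly interpolate between the two regimes */
        float smooth = (old_context_len / wavelen - low_freq_factor)
                     / (high_freq_factor - low_freq_factor);
        return (1.0f - smooth) * (freq / factor) + smooth * freq;
    }
}
```

Applied once to each base rotary frequency before the usual position-times-frequency rotation, this is what lets frequencies tuned for 8K positions stretch to 128K.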
AMICABoard · 10 months ago
Okay, but hold your horses. Still a bit buggy.

Sample output:

Meta's Llama 3.1 models can output multilingual text, which is awesome. Here is some example output from the 8-bit quantized 8B model with a 100-token limit (-n 100)... Quantization creates some brain damage.

English

My cat is funny. "Funny cat," I say, walking up to it. "What are you up to?" It sits up straight and looks at me with a tilted head, as if to say, "What's wrong with you?" Sometimes I just have to laugh at how funny a cat can be. So I say, "Okay, you're funny. I'll give you some treats." It stretches out a little and I give it some treats. It eats them up quickly and starts

German

Besitzen Sie einen Amiga 500? Wenn nicht, werden Sie wissen, dass dies ein Computer war, der im späten 1980er und frühen 1990er Jahren für Spiele verfügbar war, die für Personen mit bestimmten Körperverletzungen gedacht waren. Manchmal müssen wir uns an frühere Zeiten erinnern, die, wie wir jetzt wissen, schwierig waren. Hier ist ein Link, der meine Geschichte bespre

French

Le vin français est, à bien des égards, un vin des origines, car il a joué un rôle important dans l'histoire de la France". La réputation des vins de France repose principalement sur leurs qualités gustatives et la gestion des vignobles contrôlée, ce qui rend le vin français un "produit d'exception". La France est donc leader mondial de la production de vin, avec 25 % des exportations mon

Thai

แมวของฉันตลกชอบเล่นบนม้วนกระดาษ และฉันก็ไม่แน่ใจว่าควรจะยินยอมที่จะให้เล่นหรือไม่

เมื่อเวลาผ่านไป ฉันเห็นว่าแมวของฉันเล่นม้วนกระดาษเป็นระยะ ๆ ฉันจึงตัดสินใจที่จะลองปรับเปลี่ยนเกมให้สนุกขึ้น

Hindi

मेरी बिल्ली बहुत मज़ाया है और वह हमेशा अपनी शारीरिक गतिविधियों से मुझे मजाक करती है। वास्तव में, जब वह अपनी खिलौनों की चपपेट में आती है तो वह विशेष रूप से क्लासिक बन जाती है। इसके अलावा, वह एक छोटी सी च
robertkoss · 10 months ago
As someone who has literally no idea about ModelOps / GenAI deployment, what am I seeing here? Code that just loads in the weights and provides an inference API? Or what does this code actually do?
v3ss0n · 10 months ago
How does this compare to llama.cpp?
hislaziness · 10 months ago
Cool, I will try it out. I tried the same with ollama; the non-English part needs a lot more polish. Do you see the outcome being any different?
BaculumMeumEst · 10 months ago
I think generalizing llama2.c like this project is doing kind of defeats the purpose, no?