
Llama 3.1 in C

212 points by AMICABoard · 10 months ago

6 comments

danielhanchen · 10 months ago
Oh this is super cool! I think maybe the new RoPE scaling method Llama 3.1 uses isn't yet added in? It's some weird one-time scaling mechanism found by a grid search to enable 128K context. Essentially the model was trained on 15.6T tokens at 8K context, then iteratively extended to 128K context with 800B tokens.

Can open a PR if people want :) [Edit: Just opened a PR! Apologies, my C is very rusty! https://github.com/trholding/llama2.c/pull/14]

https://github.com/trholding/llama2.c/blob/master/runq.c#L657 needs to be scaled with some weird formula like in https://github.com/unslothai/unsloth/blob/main/unsloth/models/llama.py#L1116
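The one-time rescaling described above can be sketched in C. This is a hedged reconstruction from the constants Meta published for Llama 3.1 (scale factor 8, low/high frequency factors 1 and 4, original context length 8192); the function name is illustrative, not something from runq.c:

```c
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Sketch of the Llama 3.1 one-time RoPE frequency rescaling.
 * Constants are from Meta's released 3.1 config; the function
 * name is mine, not from runq.c. */
float llama31_scale_freq(float freq) {
    const float factor = 8.0f;             /* overall context extension */
    const float low_freq_factor = 1.0f;
    const float high_freq_factor = 4.0f;
    const float old_context_len = 8192.0f; /* pre-extension context */

    const float low_freq_wavelen = old_context_len / low_freq_factor;
    const float high_freq_wavelen = old_context_len / high_freq_factor;
    float wavelen = 2.0f * (float)M_PI / freq;

    if (wavelen < high_freq_wavelen) {
        return freq;                       /* high-frequency band: untouched */
    } else if (wavelen > low_freq_wavelen) {
        return freq / factor;              /* low-frequency band: shrink by 8 */
    } else {
        /* mid band: smoothly interpolate between the two regimes */
        float smooth = (old_context_len / wavelen - low_freq_factor)
                     / (high_freq_factor - low_freq_factor);
        return (1.0f - smooth) * (freq / factor) + smooth * freq;
    }
}
```

Applied once to each base rotary frequency before the usual position-times-frequency rotation, this is what lets frequencies tuned for 8K positions stretch to 128K.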
AMICABoard · 10 months ago
Okay, but hold your horses. Still a bit buggy.

Sample output:

Meta's Llama 3.1 models can output multilingual text, which is awesome. Here is some example output from the 8-bit quantized 8B model with a 100-token limit (-n 100)... Quantization creates some brain damage.

English

My cat is funny. "Funny cat," I say, walking up to it. "What are you up to?" It sits up straight and looks at me with a tilted head, as if to say, "What's wrong with you?" Sometimes I just have to laugh at how funny a cat can be. So I say, "Okay, you're funny. I'll give you some treats." It stretches out a little and I give it some treats. It eats them up quickly and starts

German

Besitzen Sie einen Amiga 500? Wenn nicht, werden Sie wissen, dass dies ein Computer war, der im späten 1980er und frühen 1990er Jahren für Spiele verfügbar war, die für Personen mit bestimmten Körperverletzungen gedacht waren. Manchmal müssen wir uns an frühere Zeiten erinnern, die, wie wir jetzt wissen, schwierig waren. Hier ist ein Link, der meine Geschichte bespre

French

Le vin français est, à bien des égards, un vin des origines, car il a joué un rôle important dans l'histoire de la France". La réputation des vins de France repose principalement sur leurs qualités gustatives et la gestion des vignobles contrôlée, ce qui rend le vin français un "produit d'exception". La France est donc leader mondial de la production de vin, avec 25 % des exportations mon

Thai

แมวของฉันตลกชอบเล่นบนม้วนกระดาษ และฉันก็ไม่แน่ใจว่าควรจะยินยอมที่จะให้เล่นหรือไม่

เมื่อเวลาผ่านไป ฉันเห็นว่าแมวของฉันเล่นม้วนกระดาษเป็นระยะ ๆ ฉันจึงตัดสินใจที่จะลองปรับเปลี่ยนเกมให้สนุกขึ้น

Hindi

मेरी बिल्ली बहुत मज़ाया है और वह हमेशा अपनी शारीरिक गतिविधियों से मुझे मजाक करती है। वास्तव में, जब वह अपनी खिलौनों की चपपेट में आती है तो वह विशेष रूप से क्लासिक बन जाती है। इसके अलावा, वह एक छोटी सी च
robertkoss · 10 months ago
As someone who has literally no idea about ModelOps / GenAI deployment, what am I seeing here? Code that just loads in the weights and provides an inference API? Or what does this code actually do?
v3ss0n · 10 months ago
How does this compare to llama.cpp?
hislaziness · 10 months ago
Cool, I will try it out. I tried the same with ollama; the non-English part needs a lot more polish. Do you see the outcome being any different?
BaculumMeumEst · 10 months ago
I think generalizing llama2.c like this project is doing kind of defeats the purpose, no?