
Llama-3.3-70B-Instruct

425 points by pr337h4m 5 months ago

26 comments

paxys 5 months ago

Benchmarks - https://www.reddit.com/r/LocalLLaMA/comments/1h85ld5/comment/m0qauyg/

Seems to perform on par with or slightly better than Llama 3.1 405B, which is crazy impressive.

Edit: According to Zuck (https://www.instagram.com/p/DDPm9gqv2cW/) this is the last release in the Llama 3 series, and we'll see Llama 4 in 2025. Hype!!
ben30 5 months ago

This reminds me of Steve Jobs's famous comment to Dropbox about storage being 'a feature, not a product.' By open-sourcing these powerful models, Zuckerberg is effectively commoditising AI while Meta's real business model remains centred around their social platforms. They can leverage these models to enhance Facebook and Instagram's services while simultaneously benefiting from the community's improvements and attention. It's not about selling AI; it's about using AI to strengthen their core business. By making it open, they get the benefits of widespread adoption and development without needing to monetise the models directly.
LorenDB 5 months ago

Seems to be more or less on par with GPT-4o across many benchmarks: https://x.com/Ahmad_Al_Dahle/status/1865071436630778109
freediver 5 months ago

Does unexpectedly well on our benchmark:

https://help.kagi.com/kagi/ai/llm-benchmark.html

Will dive into it more, but this is impressive.
profsummergig 5 months ago

Please help me understand something.

I've been out of the loop with HuggingFace models. What can you do with these models?

1. Can you download them and run them on your laptop via JupyterLab?

2. What benefits does that get you?

3. Can you update them regularly (with new data from the internet, e.g.)?

4. Can you finetune them for a specific use case (e.g. geospatial data)?

5. How difficult and time-consuming (person-hours) is it to finetune a model?

(If HuggingFace has answers to these questions, please point me to the URL. HuggingFace, to me, seems like the early days of GitHub: a small number were heavy users, but the rest were left scratching their heads and wondering how to use it.)

Granted it's a newbie question, but answers will be beneficial to a lot of us out there.
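On questions 1 and 2: smaller checkpoints can indeed be downloaded and run locally, but a 70B model will not fit in laptop memory. A minimal sketch of the usual Hugging Face `transformers` flow, where the small model name and the memory rule of thumb are illustrative assumptions, not something stated in the thread:

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Back-of-envelope memory for model weights alone: params * bits / 8."""
    return params_billion * bits_per_weight / 8

# Llama 3.3 70B at 16-bit needs roughly 140 GB just for weights -- not laptop material.
# A ~1B model at 16-bit needs roughly 2 GB, which a laptop handles fine.
print(weight_memory_gb(70, 16))  # 140.0
print(weight_memory_gb(1, 16))   # 2.0

RUN_DEMO = False  # flip to True on a machine with the model downloaded and access granted

if RUN_DEMO:
    from transformers import pipeline  # pip install transformers torch

    generator = pipeline(
        "text-generation",
        model="meta-llama/Llama-3.2-1B-Instruct",  # example small model; gated on HF
        device_map="auto",                          # picks GPU/MPS/CPU automatically
    )
    out = generator("Explain geospatial indexing in one sentence.", max_new_tokens=64)
    print(out[0]["generated_text"])
```

On questions 4 and 5: parameter-efficient finetuning (LoRA/QLoRA via the `peft` library) is the usual route, and for small models it is typically hours rather than weeks on a single GPU, though data preparation usually dominates the person-hours.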
theanonymousone 5 months ago

I'm "tracking" the price of 1M tokens on OpenRouter and it is decreasing every few refreshes. It's funny: https://openrouter.ai/meta-llama/llama-3.3-70b-instruct
danielhanchen 5 months ago

I uploaded 4-bit bitsandbytes, GGUFs and original 16-bit weights to https://huggingface.co/unsloth for those interested! You can also finetune Llama 3.3 70B in under 48GB of VRAM, 2x faster and with 70% less memory, using Unsloth!
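The "under 48 GB of VRAM" figure is plausible from arithmetic alone. A rough sanity check, where the overhead figure is an assumption of mine rather than Unsloth's published number:

```python
def quantized_weights_gb(params_billion: float, bits: float) -> float:
    """Memory for model weights at a given quantization width: params * bits / 8."""
    return params_billion * bits / 8

base = quantized_weights_gb(70, 4)  # frozen 4-bit base weights
print(base)  # 35.0

# QLoRA-style finetuning keeps the base frozen in 4-bit and trains small
# adapters, so the extra cost is adapter weights, their optimizer state,
# and activations -- assumed here at a few GB on top of the base.
assumed_overhead_gb = 8.0
print(base + assumed_overhead_gb)  # 43.0 -- under the 48 GB claim
```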
bnchrch 5 months ago

Open-sourcing Llama is one of the best examples and roll-outs of "Commoditize Your Complement" in memory.

Link to Gwern's "Laws of Tech: Commoditize Your Complement" for those who haven't heard of this strategy before:

https://gwern.net/complement
hubraumhugo 5 months ago

Meta continues to overdeliver. Their goal from the start was to target and disrupt OpenAI/Anthropic with a scorched-earth approach by releasing powerful open models.

The big winners: we developers.
philipkiely 5 months ago

Just spent a few minutes this morning spinning up an H100 model server and trying an FP8-quantized version (including KV-cache quantization) to fit it on 2 H100s -- speed and quality are looking promising.

I'm excited to see if the better instruction-following benchmarks improve function calling / agentic capabilities.
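The 2x H100 sizing checks out arithmetically. A quick sketch using the standard per-token KV-cache estimate; the layer/head dimensions below are Llama 70B's published shape (80 layers, 8 KV heads via grouped-query attention, head dim 128), used here as assumptions:

```python
def kv_cache_gb(tokens: int, layers: int = 80, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_value: int = 1) -> float:
    """Standard KV-cache estimate: 2 (K and V) * layers * kv_heads * head_dim
    * bytes per value * tokens, in GB."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens / 1e9

weights_fp8_gb = 70 * 8 / 8       # 70B params at 1 byte each -> ~70 GB
cache_gb = kv_cache_gb(128_000)   # full 128k context, FP8 (1-byte) cache
print(weights_fp8_gb, round(cache_gb, 1))  # 70.0 21.0

# ~70 GB of weights plus ~21 GB of cache fits comfortably in 2 x 80 GB H100s,
# leaving headroom for activations and CUDA overhead.
```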
kstrauser 5 months ago

I know this has been discussed before but it changes frequently: what's the good "generic" Mac desktop client these days? I'd like to use Ollama and/or ChatGPT. Maybe Claude. Perhaps Perplexity, too. I primarily want to use AI chats in various apps, like typing "write a function to…" into whatever random editor I'm using at the moment. It doesn't have to be a desktop app, either. If there's a great PopClip plugin or Keyboard Maestro macro, or even something that works as a system service, that's perfectly fine by me.

MacMind is nifty, but that feels like a lot of money for something that's a front end to someone else's API. "Stop being a cheapskate" is a legitimate answer.
hrpnk 5 months ago

Seems that a bunch of quantized models are already uploaded to Ollama: https://ollama.com/library/llama3.3/tags
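For anyone wanting to try one of those tags locally, the usual ollama flow looks like this. The specific quantization tag below is an example; check the linked tags page for what is actually published:

```shell
# Pull and run a quantized build. A q4_K_M 70B model is roughly 40+ GB
# on disk and needs comparable RAM/VRAM to run.
ollama pull llama3.3:70b-instruct-q4_K_M   # example tag; verify on ollama.com
ollama run llama3.3                        # default tag for the library entry
ollama show llama3.3                       # inspect parameters and quantization
```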
adt 5 months ago

Model card: https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md

On the Models Table: https://lifearchitect.ai/models-table/
LorenDB 5 months ago
Hopefully this lands on Groq soon!
theanonymousone 5 months ago

Given the comments saying its performance seems comparable to 4o/4o-mini, is it safe to say that GPT-4 performance can be achieved with fewer than 100B parameters, contrary to what was previously thought?
andy_ppp 5 months ago
How many tokens per second can I get on an M4 Max with 128gb of RAM?
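A rough upper bound can be computed from memory bandwidth, since single-stream decoding is bandwidth-bound: each generated token reads every weight once. The figures below (546 GB/s for the top M4 Max configuration, ~40 GB for a 4-bit 70B build) are assumptions, so treat the result as a ceiling, not a measurement:

```python
def decode_tps_ceiling(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Bandwidth-bound decoding: tokens/sec <= bandwidth / bytes read per token."""
    return bandwidth_gb_s / model_size_gb

m4_max_bandwidth = 546.0   # GB/s, top M4 Max configuration (assumption)
q4_70b_size = 40.0         # GB, ~4-bit quantized 70B weights (assumption)

print(round(decode_tps_ceiling(m4_max_bandwidth, q4_70b_size), 1))
# Real-world throughput lands below this ceiling due to compute,
# KV-cache reads, and framework overhead.
```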
ndr_ 5 months ago

It's available on IBM WatsonX, but the Prompt Lab may still report "model unavailable". This is because of overeager guardrails. These can be turned off, but the German translation for this option is broken too: look for "KI-Guardrails auf" in the upper right.
henry2023 5 months ago

I'm building a PC just to run inference on this and the QwQ 32B models.

Any suggestions on RAM and GPU I should get?
jadbox 5 months ago

Would anyone be willing to compress this down to maybe 14B-20B for those of us on peasant 16GB rigs?
kordlessagain 5 months ago

Summary of discussion: https://claude.site/artifacts/635d6816-9f60-4545-aeed-54ba180cfd5e
nxobject 5 months ago

I'm surprised that, out of all of the East Asian languages, they chose Thai to support: do they have a big office there? (I imagine compared to, say, Japanese or (some form of) Mandarin?)
knighthack 5 months ago

Given how censored the 3.2 model was, I'm looking forward to the abliterated 3.3 version to see if there are any significant improvements that can replace it.
antirez 5 months ago
Hot take after trying it a bit. I was not impressed with llama 3.2, but this one, well, it looks like we finally have a very very strong free LLM.
Narciss 5 months ago

This is massive, really cool of Meta to open-source it.
ppp999 5 months ago
We need more uncensored models
ulam2 5 months ago

No base model? Disappointed.