
Mistral AI Launches New 8x22B MOE Model

379 points by varunvummadi · about 1 year ago

21 comments

freeqaz · about 1 year ago
What's the easiest way to run this assuming that you have the weights and the hardware? Even if it's offloading half of the model to RAM, what tool do you use to load this? Ollama? Llama.cpp? Or just import it with some Python library?

Also, what's the best way to benchmark a model to compare it with others? Are there any tools to use off-the-shelf to do that?
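For the "offloading half of the model to RAM" scenario, llama.cpp splits the model by transformer layer and keeps only some layers on the GPU. A rough sketch of how many layers would fit; every number here is an assumption for illustration (model size, layer count, and overhead are not official figures):

```python
# Rough sketch of GPU/CPU layer splitting, the scheme behind
# llama.cpp's "offload N layers to GPU" option.
# All constants below are illustrative assumptions, not official figures.
MODEL_GB = 85      # assumed on-disk size of a ~4-bit quant of 8x22B
N_LAYERS = 56      # assumed transformer layer count
VRAM_GB = 24       # one consumer GPU
OVERHEAD_GB = 3    # assumed KV cache + runtime overhead

per_layer_gb = MODEL_GB / N_LAYERS
gpu_layers = int((VRAM_GB - OVERHEAD_GB) / per_layer_gb)
print(f"~{gpu_layers} of {N_LAYERS} layers fit on the GPU; "
      f"the remaining {N_LAYERS - gpu_layers} stay in system RAM")
```

Under these assumptions only about a quarter of the layers fit on a single 24 GB card, so most token-generation time would be spent in CPU-side layers.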
SushiHippie · about 1 year ago
[dupe] https://news.ycombinator.com/item?id=39986047

Which has the link to the tweet instead of the profile:

https://twitter.com/MistralAI/status/1777869263778291896
mlsu · about 1 year ago
8x22b. If this is as good as Mixtral 8x7b we are in for a wonderful time.
nazka · about 1 year ago
Off topic, but are we now back at the same performance as GPT-4 at the time people said it worked like magic (meaning before the nerf that made it more politically correct but crashed its performance)?
zmmmmmabout 1 year ago
A pre-Llama3 race for everyone to get their best small models on the table?
nen-nomad · about 1 year ago
Mixtral 8x7b has been good to work with, and I am looking forward to trying this one as well.
ZeljkoS · about 1 year ago
Here is the unofficial benchmark: https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1/discussions/4
deoxykev · about 1 year ago
4 bit quants should require 85GB VRAM, so this will fit nicely on 4x 24G consumer GPUs, plus some leftover for KV cache optimization.
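The 85 GB figure checks out if you assume the commonly cited ~141B total parameter count and note that "4-bit" GGUF quants typically average closer to 4.8 effective bits per weight. A back-of-the-envelope sketch (the bits-per-weight value is an assumption, not an official spec):

```python
# Sanity check of the 85 GB / 4x 24 GB claim.
# Assumptions: ~141B total params for Mixtral 8x22B, and ~4.8 effective
# bits per weight for a "4-bit" quant (e.g. a Q4_K_M-style scheme).
TOTAL_PARAMS = 141e9
BITS_PER_WEIGHT = 4.8

model_gb = TOTAL_PARAMS * BITS_PER_WEIGHT / 8 / 1e9   # bytes -> GB
per_gpu_gb = model_gb / 4                             # split over 4 cards
kv_headroom_gb = 24 - per_gpu_gb                      # leftover per card

print(f"{model_gb:.1f} GB model, {per_gpu_gb:.1f} GB per GPU, "
      f"{kv_headroom_gb:.1f} GB headroom each for KV cache")
```

That leaves just under 3 GB per card for KV cache, which is why the fit is described as "nice" rather than comfortable.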
zone411 · about 1 year ago
Very important to note that this is a base model, not an instruct model. Instruct fine-tuned models are what's useful for chat.
talsperre · about 1 year ago
Right on time as Llama 3 is released.
abdullahkhalids · about 1 year ago
Why are some of their models open, and others closed? What is their strategy?
wkat4242 · about 1 year ago
Weird, the last post I see at that link is from the 8th of December 2023 and it's not about this.

Edit: Ah, it's the wrong link. https://news.ycombinator.com/item?id=39986047

Thanks SushiHippie!
intellectronica · about 1 year ago
It's weird that more than a day after the weights dropped, there still isn't a proper announcement from Mistral with a model card. Nor is it available on Mistral's own platform.
ein0p · about 1 year ago
To this day 8x7b Mixtral remains the best model you can run on a single 48GB GPU. This has the potential to become the best model you can run on two such GPUs, or on an MBP with maxed out RAM, when 4-bit quantized.
varunvummadi · about 1 year ago
They just announced their new model on Twitter, which you can download via torrent.
aurareturn · about 1 year ago
Might be a dumb question but does this mean this model has 176B params?
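Not a dumb question: the naive 8 × 22B = 176B overcounts, because in a Mixtral-style MoE the experts only replace the feed-forward blocks while attention and embeddings are shared. A sketch of the arithmetic, assuming the commonly cited figures of ~141B total and ~39B active parameters per token:

```python
# MoE parameter accounting for Mixtral 8x22B (approximate published figures).
# total  = shared + N_EXPERTS * ffn_per_expert
# active = shared + TOP_K     * ffn_per_expert  (TOP_K experts routed per token)
TOTAL, ACTIVE = 141e9, 39e9
N_EXPERTS, TOP_K = 8, 2

ffn_per_expert = (TOTAL - ACTIVE) / (N_EXPERTS - TOP_K)  # ~17B per expert FFN
shared = TOTAL - N_EXPERTS * ffn_per_expert              # ~5B attn + embeddings
naive = N_EXPERTS * 22e9                                 # 176B if nothing shared

print(f"ffn per expert ~{ffn_per_expert/1e9:.0f}B, shared ~{shared/1e9:.0f}B, "
      f"naive count {naive/1e9:.0f}B vs actual {TOTAL/1e9:.0f}B")
```

So no, not 176B: roughly 141B total, of which only ~39B are active for any given token.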
resource_waste · about 1 year ago
What is the excitement around models that aren't as good as Llama?

This is clearly an inferior model that they are willing to share for marketing purposes.

If it were an improvement over Llama, sure, but it seems like just an ad for bad AI.
swalsh · about 1 year ago
Is this Mistral large?
stainablesteel · about 1 year ago
Has anyone had success making an auto-gpt concept for mistral/llama models? I haven't found one.
angilly · about 1 year ago
The lack of a corresponding announcement on their blog makes me worry about a Twitter account compromise and a malicious model. Any way to verify it’s really from them?
tjtang2019 · about 1 year ago
What are the advantages compared to GPT? Looking forward to using it!