What's the easiest way to run this assuming that you have the weights and the hardware? Even if it's offloading half of the model to RAM, what tool do you use to load this? Ollama? Llama.cpp? Or just import it with some Python library?<p>Also, what's the best way to benchmark a model to compare it with others? Are there any tools to use off-the-shelf to do that?
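For the loading question, here is a minimal sketch of the plain-Python route via Hugging Face transformers + accelerate, which spills layers to CPU RAM when GPU memory runs out. It assumes the community upload at mistral-community/Mixtral-8x22B-v0.1 (the repo linked further down in this thread) and enough combined VRAM+RAM to hold the weights; llama.cpp/Ollama would need a GGUF conversion instead. For benchmarking, EleutherAI's lm-evaluation-harness is the usual off-the-shelf tool.

```python
# Sketch: load the community weights with automatic GPU/CPU offloading.
# Assumes the mistral-community/Mixtral-8x22B-v0.1 repo and that
# transformers + accelerate are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistral-community/Mixtral-8x22B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 weights; quantize if memory is tight
    device_map="auto",           # split across GPU(s), offload the rest to CPU RAM
)

inputs = tokenizer("Mixtral 8x22B is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```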
[dupe] <a href="https://news.ycombinator.com/item?id=39986047">https://news.ycombinator.com/item?id=39986047</a><p>Which has the link to the tweet instead of the profile:<p><a href="https://twitter.com/MistralAI/status/1777869263778291896" rel="nofollow">https://twitter.com/MistralAI/status/1777869263778291896</a>
Off topic, but are we now back at the same performance as ChatGPT-4 at the time people said it worked like magic (i.e. before the nerf that made it more politically correct but tanked its performance)?
Here is the unofficial benchmark:
<a href="https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1/discussions/4" rel="nofollow">https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1/...</a>
Weird, the last post I see at that link is from the 8th of December 2023 and it's not about this.<p>Edit: Ah, it's the wrong link. <a href="https://news.ycombinator.com/item?id=39986047">https://news.ycombinator.com/item?id=39986047</a><p>Thanks SushiHippie!
It's weird that more than a day after the weights dropped, there still isn't a proper announcement from Mistral with a model card. Nor is it available on Mistral's own platform.
To this day, Mixtral 8x7B remains the best model you can run on a single 48GB GPU. This one has the potential to become the best model you can run on two such GPUs, or on an MBP with maxed-out RAM, once 4-bit quantized.
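Rough back-of-envelope for that claim, using the community-reported ~141B total parameters for 8x22B (not an official spec at the time of the weights drop) and ~47B for 8x7B:

```python
# Back-of-envelope memory math for the "two 48GB GPUs / maxed-out MBP" claim.
# Parameter counts are the community-reported figures, and this ignores
# KV cache and activation overhead.
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

for name, n_params in [("Mixtral 8x7B  (~47B)", 47e9),
                       ("Mixtral 8x22B (~141B)", 141e9)]:
    for bits in (16, 4):
        print(f"{name} @ {bits:>2}-bit: ~{weight_memory_gb(n_params, bits):.0f} GB")

# 8x7B  @ 4-bit: ~24 GB -> fits on one 48GB GPU with room for the KV cache
# 8x22B @ 4-bit: ~70 GB -> needs two 48GB GPUs, or a 128GB+ Mac, for headroom
```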
What is the excitement around models that aren't as good as Llama?<p>This is clearly an inferior model that they are willing to share for marketing purposes.<p>If it were an improvement over Llama, sure, but it seems like just an ad for bad AI.
The lack of a corresponding announcement on their blog makes me worry about a Twitter account compromise and a malicious model. Any way to verify it’s really from them?