Mistral's process for releasing new models is <i>extremely</i> low-information. After getting very confused by this link I tried looking for a link that has <i>any</i> better information, and there just isn't one.<p>I thought Mixtral's release was weird when they just pasted a magnet link [0] into Twitter with no information, but at least people could download and analyze it so we got some reasonable third-party commentary in between that and the official announcement. With this one there's nothing at all to go on besides the name and the black box.<p>[0] <a href="https://news.ycombinator.com/item?id=38570537">https://news.ycombinator.com/item?id=38570537</a>
For those unfamiliar with the LMSys interface:<p>Click/tap on "Direct Chat" in the top tab navigation and you can select "mistral-next" as the model.
AI Explained on YouTube has speculated that Gemini 1.5 Pro took Mistral's accurate long-context retrieval and that Google simply scaled it as far as they could. The Gemini 1.5 Pro paper has a citation back to the most recent Mistral paper from 2024.
Note that it's actually "Mistral Next", not "Mixtral Next" - so it isn't necessarily a MoE. For example, an early version of Mistral Medium (Miqu) was not a MoE but instead a Llama 70B model. I wonder how many parameters this one has.
Slightly related question: what's a good coding LLM to run on a 4070 12GB card?<p>Also, do coding LLMs use tree-sitter to "understand" code?
This was linked randomly on Mistral's Discord chat; nothing "official" yet.<p>It's a preview of their newest prototype model.<p>To use it, click the "Direct Chat" tab and choose "mistral-next".
I used this but, upon being asked which model it is, it replied that it is a "fine-tuned version of GPT 3.5". Any clue why? In a second chat it replied, "You're chatting with one of the fine-tuned versions of the OpenAssistant model!"