From the paper, I was intrigued by how they handled the RL step for code data: they trained against hard-but-solvable code-generation tasks by running unit tests. Is that training step done by the other models?

> Code Data: For coding problems, we curate a high-quality training set comprising open-source datasets and our newly collected problem set. We remove problems without test cases. For problems with golden solutions, we exclude those where the golden solution failed to pass all test cases. For problems without golden solution, we discard problems where no test case can be solved in 16 rollouts of advanced reasoning models. Similar to math data, we utilize an SFT version of MiMo-7B to filter out easy problems that are perfectly solved in all 16 rollouts. This rigorous cleaning process yields 30K code problems.

> During each RL iteration, we evaluate thousands of problems to compute the rewards, with each problem potentially containing hundreds of test cases. To improve reward computing efficiency and eliminate GPU idle time, we developed an online judge environment that enables parallel execution of extremely high-volume unit tests.
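For anyone curious what that reward step looks like mechanically, here is a minimal sketch of computing a rule-based reward by running a candidate solution against its test cases in parallel. This is my own illustration, not Xiaomi's judge: the stdin/stdout test format, the helper names, and the pass-fraction reward are all assumptions on my part.

```python
# Sketch of a unit-test-based reward for code RL (illustrative only, not MiMo's judge).
# Assumes each problem ships test cases as (stdin, expected_stdout) pairs and the
# model's answer has been written out as a standalone Python script.
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one_test(solution_path: str, stdin_data: str, expected: str, timeout: float = 2.0) -> bool:
    """Run the candidate solution on one test case and compare its stdout."""
    try:
        result = subprocess.run(
            ["python", solution_path],
            input=stdin_data,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return result.stdout.strip() == expected.strip()
    except (subprocess.TimeoutExpired, OSError):
        return False


def reward(solution_path: str, test_cases: list[tuple[str, str]], workers: int = 32) -> float:
    """Fraction of test cases passed; many setups only give credit when all pass."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda tc: run_one_test(solution_path, *tc), test_cases))
    return sum(results) / len(results) if results else 0.0
```

The interesting engineering is presumably in the sandboxing and in batching thousands of these runs so the GPUs doing rollouts never sit idle, which sounds like what their "online judge environment" is about.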
Why are there so many English-first AI models from China? Are they not interested in serving their own population? Or is it that if they publish Chinese-first models, they won't get publicity in the West?
This is incredibly strong coding performance for a 7B. I use Gemini 2.5 Pro, which got 67.8, and this got 57.8, very close to Gemini 2.5 Flash at 60.6.

I've become pretty skeptical about eval results given what we've heard about Llama 4, so we'll see where this lands on the closed evals, but it's very impressive to see.
When you guys use GGUF files in Ollama, do you normally create a Modelfile to go with them, or just hope that whatever defaults Ollama applies work with the new model?

https://github.com/ollama/ollama/blob/main/docs/modelfile.md
It's funny to see benchmarks that omit the top-performing models like o3 (currently the best model on many benchmarks) and Gemini Pro/Claude 3.7.
MiMo-7B claims to outperform larger models like Qwen-32B and match OpenAI o1-mini on math/code benchmarks — all with a 7B model trained from scratch. Is this a sign that pretraining + RLHF optimization is finally outpacing scale? Or are we just getting better at benchmarking narrow capabilities?
The README says "RL" without specifying what kind of RL is used. Researchers: I know you are busy, and I know good writing takes time, but please don't skip this kind of detail.
I wonder if they will use this model for their AI assistant on their Xiaomi 15 series phones. They most likely will. I'm not really sure what to expect from it.
Umm, wow. Great benchmarks. I'm looking forward to chatting with this one.

A couple of things stand out to me. First, the 7B model is trained on 25T tokens(!). This is Meta-scale training; Llama 4 Maverick was trained on roughly 22T (and Scout, the smaller model, on about 40T).

Second, this is an interesting path to take: not a distilled model or an RL layer to get reasoning out of another model, but a from-scratch RL model with reasoning baked in, and the claims seem to indicate you get a lot of extra efficiency per parameter doing this.

I don't have experience with Xiaomi models, so I'm cautious about this one until I play with it, but it looks like a super viable local reasoning model from the stats.
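If you want to poke at it locally, something like the following should work with Hugging Face transformers. This is only a sketch: the repo id "XiaomiMiMo/MiMo-7B-RL" and the generation settings are my assumptions, so check the model card for the exact name and the recommended sampling parameters.

```python
# Quick local smoke test with transformers (a sketch; repo id and settings are assumed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XiaomiMiMo/MiMo-7B-RL"  # assumed Hugging Face repo id, verify on the model card
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # use the checkpoint's native precision
    device_map="auto",       # spread across available GPUs / fall back to CPU
    trust_remote_code=True,  # in case the repo ships custom modeling code
)

messages = [{"role": "user", "content": "Prove that the sum of two even numbers is even."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

At 7B parameters the bf16 weights should fit comfortably on a single 24 GB GPU, and quantized GGUF builds should go considerably lower, which is what makes a reasoning model at this size interesting for local use.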
Been testing it a bit and overall it's pretty solid. The lengthy think times mean one waits quite a while, though; longer than much larger models like, say, the recent Qwen MoE.

That MoE strikes me as the better overall tradeoff.
Xiaomi in Chinese translates to "little rice."

The meaning of the name is described here: https://finance.sina.cn/tech/2020-11-26/detail-iiznctke3397952.d.html

在后来的讨论中,我突然想到了我最喜欢的一句话——“佛观一粒米,大如须弥山”。

Translated into English: "In the later discussions, I suddenly thought of one of my favorite sayings: 'A Buddha sees a single grain of rice as vast as Mount Sumeru.'"

The expression emphasizes that even something seemingly small (like a grain of rice) can hold immense significance or value when viewed from a different perspective.

Thanks to ChatGPT for translating this.