> Building on Mistral Small 3, this new model comes with improved text performance, multimodal understanding, and an expanded context window of up to 128k tokens. The model outperforms comparable models like Gemma 3 and GPT-4o Mini, while delivering inference speeds of 150 tokens per second.

This is a really nice bump on the previous model, considering it’s now multimodal. I’m a little surprised it only received a 0.1 version bump.