The "Mistral Pixtral multimodal model" really rolls off the tongue.<p>> It’s unclear which image data Mistral might have used to develop Pixtral 12B.<p>The days of free web scraping, especially of the richer sources of material, are almost gone, with everything from technical measures (API restrictions) to legal ones (copyright) building deep moats. I also wonder what they trained it on. They're not Meta or Google, with endless supplies of user content or exclusive contracts with the Reddits of the internet.
Couple notes for newcomers:<p>1. This is a VLM, not a text-to-image model. You can give it images, and it can understand them. It doesn't generate images back.<p>2. It seems like Pixtral 12B benchmarks significantly below Qwen2-VL-7B [1], so if you want the best local model for understanding images, probably use Qwen2. If you want a large open-source model, Qwen2-VL-72B is most likely the best option.<p>1: <a href="https://qwenlm.github.io/blog/qwen2-vl/" rel="nofollow">https://qwenlm.github.io/blog/qwen2-vl/</a>
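To make point 1 concrete, here is a minimal sketch of what a VLM request looks like, assuming the OpenAI-style multimodal chat format (the format vLLM serves Pixtral with; the helper name is my own). The input pairs an image with a text question; the output is text only — the model never returns an image.

```python
# Sketch of an OpenAI-style multimodal chat message (assumption: the
# format used by vLLM's Pixtral endpoint). Image in, text out.
def build_vlm_messages(image_url: str, question: str) -> list[dict]:
    return [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": question},
        ],
    }]

msgs = build_vlm_messages("https://example.com/cat.png", "What is in this image?")
print(msgs[0]["content"][1]["text"])  # What is in this image?
```

The same message list can be posted to any OpenAI-compatible chat-completions endpoint that accepts images.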
Mistral being more open than 'OpenAI' is kind of a meme. How can a company call itself open while it refuses to openly distribute its product, when competitors are actually doing it?
Related earlier:<p><i>New Mistral AI Weights</i><p><a href="https://news.ycombinator.com/item?id=41508695">https://news.ycombinator.com/item?id=41508695</a>
I’d love to know how much money Mistral is taking in versus spending. I’m very happy for all these open weights models, but they don’t have Instagram to help pay for it. These models are expensive to build.
A question for SD LoRA trainers: is this usable for making captions, and what are <i>you</i> using, apart from BLIP?<p>Also, can your model of choice understand your requests to include/omit particular nuances of an image?
Could this be used for a self-hosted handwritten text recognition instance?<p>Like writing on an ePaper tablet, exporting the PDF, and feeding it into this model to extract todos from notes, for example.<p>Or what would be the SotA for this application?
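The pipeline described above could be sketched roughly as below. This is an assumption-heavy sketch, not a tested implementation: `pdf2image` is one real library for rasterizing PDF pages, and the `ask_vlm` call stands in for whatever local VLM you run; the model call is stubbed here so only the todo-parsing step actually executes.

```python
# Hedged sketch of: ePaper PDF -> page images -> VLM transcript -> todos.
# The VLM call is stubbed; only the parsing step runs as-is.
import re

def extract_todos(transcript: str) -> list[str]:
    """Pull checkbox-style lines ('[ ] buy milk') out of a transcript."""
    return re.findall(r"\[ \]\s*(.+)", transcript)

# In the real pipeline, the transcript would come from the model, e.g.:
#   pages = pdf2image.convert_from_path("notes.pdf")   # PDF -> PIL images
#   transcript = ask_vlm(pages[0], "Transcribe this handwritten page.")
transcript = "Meeting notes\n[ ] email Alice\n[ ] review PR"
print(extract_todos(transcript))  # ['email Alice', 'review PR']
```

Whether a 12B VLM transcribes handwriting reliably enough for this is exactly the open question; dedicated HTR models may still win on messy input.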
12B is pretty small, so I doubt it'll be anywhere close to InternVL2. However, Mistral does great work, and this model is likely still useful for on-device tasks.