
Mistral Saba

145 points | by stephen37 | 3 months ago

8 comments

laserduck · 3 months ago
I wonder why they grouped languages from the Middle East and South Asia together. Arabic and Hebrew are Semitic languages - no language from that family tree is native to the subcontinent. It would make sense if northern languages like Hindi, Urdu, Bengali, Nepali, etc. were grouped with Persian, French, Russian, etc., since those are all from the Indo-European family. South Indian languages like Telugu and Tamil are from a completely different family (Dravidian).

Why not either train the model exclusively on Semitic languages for further performance in those languages, or on a wider set of languages for better multilingual performance overall? I don't understand the logic here.
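The family split this comment describes can be sketched as a minimal lookup table; the assignments below are taken directly from the comment's own list, and the function name is just for illustration:

```python
# Language-family groupings as described in the comment above.
FAMILIES = {
    "Semitic": ["Arabic", "Hebrew"],
    "Indo-European": ["Hindi", "Urdu", "Bengali", "Nepali",
                      "Persian", "French", "Russian"],
    "Dravidian": ["Telugu", "Tamil"],
}

def family_of(language: str) -> str:
    """Return the family a language belongs to, per the table above."""
    for family, members in FAMILIES.items():
        if language in members:
            return family
    return "unknown"
```

Under this split, Arabic and Hebrew land in one bucket while Hindi and Urdu land in another, which is the commenter's point about the model's training-data grouping.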
Cyph0n · 3 months ago
Context on the name: https://en.wikipedia.org/wiki/Sheba
yodon · 3 months ago
> Mistral Saba is a 24B parameter model trained on meticulously curated datasets from across the Middle East and South Asia.
hazrmard · 3 months ago
It's great to see the proliferation of models in other languages!

Shoutout to Alif, a fine-tune of Llama 3 8B on Urdu datasets: https://huggingface.co/large-traversaal/Alif-1.0-8B-Instruct

It'd be great to see a comparison.
elashri · 3 months ago
That's interesting. It would be interesting to compare how this fares against Fanar (Arabic-oriented models) [1]. I got access to their API last week but still haven't played with it. I think they did a pretty good job with Arabic dialects [2]. I don't know if they have any plans to release weights, though. There are two models: one trained from scratch, and the other is a fine-tune of Google's Gemma.

Saba vs. Fanar. I like the names too.

[1] https://fanar.qa/en

[2] https://arxiv.org/abs/2501.13944
diggan · 3 months ago
Considering they don't talk about licensing, one can assume this is proprietary?

~2 years ago (Sep 27, 2023), Mistral AI said:

> we believe that an open approach to generative AI is necessary. Community-backed model development is the surest path to fight censorship and bias in a technology shaping our future. We strongly believe that by training our own models, releasing them openly, and fostering community contributions, we can build a credible alternative to the emerging AI oligopoly. Open-weight generative models will play a pivotal role in the upcoming AI revolution.

> Mistral AI's mission is to spearhead the revolution of open models.

https://mistral.ai/en/news/about-mistral-ai

Did something change since then, or why did they have a change of heart? Are they just doing an "OpenAI" and appearing to believe in something in order to further their own cause, or is there some particular reason behind it?
throwaway638637 · 3 months ago
It says South Asia, but the blog post is about Arabic. Where are the numbers on Urdu?
Terretta · 3 months ago
GPT-4o mini keeps quietly demonstrating value per cost.