In an MoE model such as this, are all "parts" (experts) loaded in memory at the same time, or is only one part loaded at any given time? For example, does Mixtral-8x7B have the memory requirement of a 7B model or of a 56B model?
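For what it's worth, my understanding is that all experts are normally kept resident, so the footprint tracks the total parameter count rather than the active count. A rough sketch using the commonly cited Mixtral-8x7B figures (~46.7B total, since the experts share the attention layers, and ~12.9B active per token; treat both numbers as approximate):<p><pre><code># Back-of-the-envelope memory estimate for a mixture-of-experts model.
# Figures are the commonly cited Mixtral-8x7B numbers (~46.7B total
# parameters, ~12.9B active per token with 2 of 8 experts routed);
# treat them as approximations.

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, ignoring KV cache and activations."""
    return params_billion * bytes_per_param  # (B params * 1e9 * bytes) / 1e9 = GB

total_b = 46.7    # all experts plus shared attention layers stay resident
active_b = 12.9   # parameters actually touched per token

for precision, nbytes in [("fp16/bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{precision}: ~{weight_memory_gb(total_b, nbytes):.0f} GB resident, "
          f"~{weight_memory_gb(active_b, nbytes):.0f} GB active per token")
</code></pre><p>So the memory requirement lands much closer to the full model than to a single 7B expert, even though per-token compute is roughly that of a ~13B dense model.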
I’m curious how it compares with the recently announced Molmo: <a href="https://molmo.org/" rel="nofollow">https://molmo.org/</a>
The model should be available for testing here [0], though when I tried to upload a video I got an error in Chinese, and whenever I write something it says the API key is invalid or missing.<p>[0] <a href="https://rhymes.ai/" rel="nofollow">https://rhymes.ai/</a>
This looks worth a try. Great test results, very good example output. No way to know if it’s cherry-picked or overtuned without giving it a spin, but it will go on my list. Should fit on an M2 Max at full precision (rough numbers below).
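Quick sanity check on the M2 Max point, assuming "full precision" means the bf16 release weights and taking the roughly 25B total-parameter figure I've seen quoted for this model (both are assumptions on my part):<p><pre><code># Does the weight footprint fit in unified memory? Ignores KV cache and
# activation memory. The ~25B total-parameter count is an assumption.
total_params = 25e9
for label, bytes_per_param in [("bf16", 2), ("int8", 1), ("int4", 0.5)]:
    weights_gb = total_params * bytes_per_param / 1e9
    for mem_gb in (64, 96):  # common M2 Max unified-memory configurations
        verdict = "fits" if weights_gb < mem_gb else "does not fit"
        print(f"{label}: ~{weights_gb:.0f} GB of weights {verdict} in {mem_gb} GB")
</code></pre><p>On those assumptions the bf16 weights come to ~50 GB, which leaves headroom on a 64 GB or 96 GB machine before accounting for the KV cache.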
<i>"Here, we provide a quantifiable definition: A multimodal native model refers to a single model with strong understanding capabilities across multiple input modalities (e.g. text, code, image, video), that matches or exceeds the modality specialized models of similar capacities."</i>