
ARIA: An Open Multimodal Native Mixture-of-Experts Model

97 points by jinqueeny 7 months ago

6 comments

cantSpellSober 7 months ago
> outperforms Pixtral-12B and Llama3.2-11B

Cool, maybe needs a better name for SEO though. ARIA already has a meaning in web apps.
theanonymousone 7 months ago
In an MoE model such as this, are all "parts" loaded in memory at the same time, or is only one part loaded at any given time? For example, does Mixtral-8x7B have the memory requirement of a 7B model, or a 56B model?
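For what it's worth, a rough sketch of the distinction behind this question: in ordinary MoE inference all experts' weights stay resident, so memory scales with the total parameter count, while per-token compute scales only with the routed experts. The split below (shared vs. per-expert parameters, two of eight experts routed per token) is an approximate Mixtral-8x7B-style layout, not exact architecture numbers.

    # Back-of-envelope MoE sizing: weights held in memory vs. parameters
    # actually used per token. Figures are rough, illustrative estimates.
    def moe_params(n_experts, experts_per_token, expert_params, shared_params):
        resident = shared_params + n_experts * expert_params        # must sit in memory
        active = shared_params + experts_per_token * expert_params  # used for each token
        return resident, active

    # Approximate Mixtral-8x7B-style split: ~1.6B shared (attention, embeddings)
    # plus 8 experts of ~5.6B each, with 2 experts routed per token.
    resident, active = moe_params(8, 2, 5.6e9, 1.6e9)
    print(f"resident weights: ~{resident / 1e9:.1f}B params")  # ~46B, not 7B
    print(f"active per token: ~{active / 1e9:.1f}B params")    # ~13B

So the memory footprint of Mixtral-8x7B is closer to that of a ~47B dense model (shared attention and embedding weights keep it under 8 x 7B), while per-token compute is nearer a ~13B model's.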
niutech 7 months ago
I'm curious how it compares with the recently announced Molmo: https://molmo.org/
petemir 7 months ago
Model should be available for testing here [0], although I tried to upload a video and got an error in Chinese, and whenever I write something it says that the API key is invalid or missing.

[0] https://rhymes.ai/
vessenes 7 months ago
This looks worth a try. Great test results, very good example output. No way to know if it's cherry-picked / overtuned without giving it a spin, but it will go on my list. Should fit on an M2 Max at full precision.
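As a rough sanity check on "fits at full precision": weight memory is approximately parameter count times bytes per parameter, before activations and KV cache. The parameter counts below are placeholders for illustration, not ARIA's actual size.

    # Rough weight-memory estimate: parameter count x bytes per parameter.
    # Parameter counts are placeholders, not any particular model's size.
    def weight_memory_gib(n_params, bytes_per_param=2):  # 2 bytes/param for bf16/fp16
        return n_params * bytes_per_param / 1024**3

    for billions in (7, 13, 25, 46):
        print(f"{billions}B params @ bf16 ≈ {weight_memory_gib(billions * 1e9):.0f} GiB")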
SomewhatLikely 7 months ago
"Here, we provide a quantifiable definition: A multimodal native model refers to a single model with strong understanding capabilities across multiple input modalities (e.g. text, code, image, video), that matches or exceeds the modality specialized models of similar capacities."