TechEcho

imjonse, 8 months ago

Apart from benchmark results, what sets the Allenai models (Olmo/OlMoE/Molmo) apart is that they are fully open, not just open-weights or free to use. The datasets used, a crucial ingredient, are also disclosed and open. UPDATE: they say the datasets will be made available, but they aren't yet.

espadrine, 8 months ago

The paper: https://molmo.allenai.org/paper.pdf

> Our key innovation is a simple but effective data collection strategy that avoids these problems: we ask annotators to describe images in speech

I see this as another example that datasets trump architecture nowadays.

The architecture is not where the innovation is: it is only CLIP embeddings converted to the LLM tokens through MLP with some pooling to reduce the token count.
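For readers who want to picture that connector, here is a rough sketch of the idea in plain Python: a grid of vision-encoder patch embeddings is pooled to cut the token count, then each pooled vector is projected into the LLM's embedding space by a small MLP. Everything concrete here is an assumption made for illustration, not taken from the paper: the 2x2 average pooling, the ReLU activation, the toy dimensions, the random weights, and the function names (`pool_2x2`, `mlp`, `connect`).

```python
import random


def pool_2x2(grid):
    """Average each non-overlapping 2x2 block of patch vectors.

    Average pooling is an assumption; the comment only says "some pooling".
    """
    rows, cols = len(grid), len(grid[0])
    pooled = []
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            block = [grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1]]
            pooled.append([sum(vals) / 4.0 for vals in zip(*block)])
    return pooled


def linear(x, weight, bias):
    """Dense layer: weight has shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def mlp(x, w1, b1, w2, b2):
    """Two-layer MLP with a ReLU in between (activation is an assumption)."""
    hidden = [max(0.0, h) for h in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


def connect(grid, w1, b1, w2, b2):
    """Pool the patch grid, then project every pooled vector to LLM width."""
    return [mlp(vec, w1, b1, w2, b2) for vec in pool_2x2(grid)]


def random_matrix(rng, out_dim, in_dim):
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


# Toy sizes, chosen only to keep the example small.
VISION_DIM, HIDDEN_DIM, LLM_DIM, GRID = 8, 16, 12, 4

rng = random.Random(0)
# Stand-in for vision-encoder output: a GRID x GRID grid of patch embeddings.
patches = [
    [[rng.uniform(-1, 1) for _ in range(VISION_DIM)] for _ in range(GRID)]
    for _ in range(GRID)
]
w1, b1 = random_matrix(rng, HIDDEN_DIM, VISION_DIM), [0.0] * HIDDEN_DIM
w2, b2 = random_matrix(rng, LLM_DIM, HIDDEN_DIM), [0.0] * LLM_DIM

tokens = connect(patches, w1, b1, w2, b2)
print(len(tokens), len(tokens[0]))  # 16 patches become 4 tokens of width 12
```

The point of the sketch is only the shape of the pipeline: pooling shrinks the number of visual tokens by 4x, and the MLP changes their width to match the LLM.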

causal, 8 months ago

That graphic comparing benchmark averages is really nice, wish things were presented so clearly more often.

That being said, I think this definitely tilts things in Molmo's favor by including so many benchmarks that seem to favor Molmo, in particular the counting ones. The average hides that it has a pretty modest MMLU score compared to state of the art.

danielcampos93, 8 months ago

Not mentioned in their blog posts, but stated on the model cards on Hugging Face: "Molmo 72B is based on Qwen2-72B and uses OpenAI CLIP as vision backbone. Molmo-72B achieves the highest academic benchmark score and ranks second on human evaluation, just slightly behind GPT-4o." Others are based on Qwen 7B. What happened to the Olmo chain?

naiv, 8 months ago

Image was flagged as inappropriate by the Google Vision API?

Molmo: a family of open multimodal AI models

5 comments