TechEcho

15 comments

jjrv 4 days ago

I found a losslessly compressed version: https://github.com/LeanModels/Bagel-DFloat11

It works following the README instructions, at least on Ubuntu, on my RTX 3090 GPU with 24 GB of memory, but just barely. I have to close most other windows and lower the screen resolution to be able to load the model. Then it generates or edits images in 2-3 minutes. I only have this one GPU and am using Chrome for the browser interface on the same machine.

The original release won't run on this hardware, but the compressed one is supposed to give identical results.

spuz 4 days ago

I'm interested in potential alternatives to ChatGPT's advanced voice mode. When I see the word "multimodal" I'm hopeful the model understands text + voice but instead it almost always seems to refer to text + images. Is there a keyword that I can use to look for models that work with voice similar to ChatGPT's advanced voice mode?

akacrobat 4 days ago

This looks exciting! There is a serious dearth of high-quality open-source models with multimodal capabilities, so I'm really looking forward to playing with this one.

Has anyone here experimented with fine-tuning this for domain-specific applications?

charcircuit 4 days ago

The demo shows pretty weak performance compared to other small models. It misunderstood my question by picking an uncommon way to interpret it. After I clarified what I wanted, it lost all the context I had provided in the previous message. My benchmark query is intentionally ambiguous, and I use it to see how models handle ambiguity, handle information that can be outdated, and avoid hallucination. Usually weak models will just hallucinate an answer, but this model was the first that wasn't able to understand the question.

LourensT 4 days ago

These days, papers come with an advertisement video

pleone 4 days ago

It's from the ByteDance team, right? The team behind TikTok, CapCut, BuzzVideo and more. Any thoughts on that?

akoculu 4 days ago

Good summary of the paper: https://x.com/build__ship/status/1926930191185580176

mdrzn 4 days ago

A quick test in the "demo" link doesn't show it to be "as smart" as it appeared in the demos on the page. I really hope it does all it's promising to do, but I'm skeptical so far.

moffkalast 4 days ago

Oh no it's The Everything Bagel.

mnky9800n 4 days ago

I couldn't find it: what are the hardware requirements for Bagel?

GrantMoyer 4 days ago

Nice, it's really an open source model, Apache 2.0.

wsintra2022 4 days ago

https://news.ycombinator.com/item?id=44063602

sandra_vu 4 days ago

Hi, good job, team. Any plans to commercialize the model?

saretup 4 days ago

> Scalable Perceptual Generative Model

If you wanna call it Bagel, just call it Bagel. No need to make up a justification.

gregjw 4 days ago

bagel

Bagel: Open-source unified multimodal model
