TechEcho

15 comments

jjrv 4 days ago

I found a losslessly compressed version: https://github.com/LeanModels/Bagel-DFloat11

Following the README instructions, it works at least on Ubuntu, on my RTX 3090 GPU with 24 GB of memory, but just barely. I have to close most other windows and lower the screen resolution to be able to load the model. It then generates or edits images in 2-3 minutes. I only have this one GPU, and I'm using Chrome for the browser interface on the same machine.

The original release won't run on this hardware, but the compressed one is supposed to give identical results.

spuz 4 days ago

I'm interested in potential alternatives to ChatGPT's advanced voice mode. When I see the word "multimodal" I'm hopeful the model understands text + voice but instead it almost always seems to refer to text + images. Is there a keyword that I can use to look for models that work with voice similar to ChatGPT's advanced voice mode?

akacrobat 4 days ago

This looks exciting! There is a serious dearth of high-quality open-source models with multimodal capabilities, so I'm really looking forward to playing with this one.

Has anyone here experimented with fine-tuning this for domain-specific applications?

charcircuit 4 days ago

The demo shows pretty weak performance compared to other small models. It misunderstood my question by picking an uncommon way to interpret it. After I clarified what I wanted, it lost all the context I had provided in the previous message. My benchmark query is intentionally ambiguous, and I use it to see how models handle ambiguity, handle information that can be outdated, and avoid hallucination. Usually weak models will just hallucinate an answer, but this model was the first that wasn't able to understand the question.

LourensT 4 days ago

These days, papers come with an advertisement video

pleone 4 days ago

It's from the ByteDance team, right? The team behind TikTok, CapCut, BuzzVideo and more. Any thoughts on that?

akoculu 4 days ago

Good summary of the paper: https://x.com/build__ship/status/1926930191185580176

mdrzn 4 days ago

A quick test in the "demo" link doesn't show it to be "as smart" as it appeared in the demos on the page. I really hope it does all it's promising to do, but I'm skeptical so far.

moffkalast 4 days ago

Oh no it's The Everything Bagel.

mnky9800n 4 days ago

I couldn't find it: what are the hardware requirements for Bagel?

GrantMoyer 4 days ago

Nice, it's really an open source model, Apache 2.0.

wsintra2022 4 days ago

https://news.ycombinator.com/item?id=44063602

sandra_vu 4 days ago

Hi, good job, team. Any plans to commercialize the model?

saretup 4 days ago

> Scalable Perceptual Generative Model

If you wanna call it Bagel, just call it Bagel. No need to make up a justification.

gregjw 4 days ago

bagel

Bagel: Open-source unified multimodal model