Strengths and limitations of diffusion language models

72 points, posted by rbanffy 3 days ago

4 comments

cubefox 3 days ago
That's a nice explanation. I wonder whether autoregressive and diffusion language models could be combined such that the model only denoises the (most recent) end of a sequence of text, like a paragraph, while the rest is unchangeable and allows for key-value caching.
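
A toy sketch of the hybrid described in this comment, offered only as an illustration: text is produced block by block, the finished prefix is frozen (which is what would make key-value caching possible in a real model), and only the newest block is denoised from mask tokens. Everything here is hypothetical: `toy_denoise_step` stands in for a model call, and the names `generate`, `MASK`, and the `tok…` outputs are assumptions, not any library's API.

```python
import random

MASK = "<mask>"

def toy_denoise_step(frozen_prefix, block, rng):
    """Hypothetical stand-in for one model call: fill a single masked slot."""
    masked = [i for i, tok in enumerate(block) if tok == MASK]
    if not masked:
        return block
    # Diffusion-style: any masked position may be filled, not only the leftmost.
    i = rng.choice(masked)
    new_block = list(block)
    new_block[i] = f"tok{len(frozen_prefix) + i}"
    return new_block

def generate(prompt, num_blocks, block_size, seed=0):
    rng = random.Random(seed)
    # The prefix is never rewritten, so a real model could cache its keys/values.
    frozen = list(prompt)
    for _ in range(num_blocks):
        block = [MASK] * block_size  # start the new block fully masked
        while MASK in block:
            block = toy_denoise_step(frozen, block, rng)
        frozen.extend(block)  # the finished block joins the immutable prefix
    return frozen
```

Calling `generate(["a", "b"], num_blocks=2, block_size=3)` returns the two prompt tokens followed by six filled tokens. Within a block the fill order is random, but earlier blocks are never revisited.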
billconan 3 days ago
I'm curious: in image generation, flow matching is said to be better than diffusion, so why do these language models still start from diffusion instead of jumping to flow matching directly?
mountainriver 3 days ago
A big discussion on this happened here as well: https://news.ycombinator.com/item?id=44057820

There is quite a bit of evidence that diffusion models work better at reasoning because they don't suffer from early token bias.

https://github.com/HKUNLP/diffusion-vs-ar https://arxiv.org/html/2410.14157v3
accrual 3 days ago
Great overview. I wonder if we'll start to see more text diffusion models from other players, or maybe even a mixture of diffusion and transformer models alternating roles behind a single UI, depending on the context and request.