
StreamDiffusion: A pipeline-level solution for real-time interactive generation

365 points · by Flux159 · over 1 year ago

11 comments

Flux159, over 1 year ago
Arxiv paper here: https://arxiv.org/abs/2312.12491

I think it's possible to get faster than their default timings for a 4090 (I have been able to get 10 fps without optimizations with SDXL Turbo and 1 iteration step), but their other improvements, like using a Stochastic Similarity Filter to prevent unnecessary generations, are good for getting fast results without having to pin your GPU at 100% all the time.
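The Stochastic Similarity Filter mentioned above skips generation when consecutive inputs barely change. A minimal sketch of the idea, where the function name and the similarity-to-probability mapping are illustrative assumptions rather than the paper's exact formulation:

```python
import numpy as np

def should_generate(prev_frame, curr_frame, rng=np.random.default_rng()):
    """Decide whether to run the diffusion step for the current frame.

    Sketch of the Stochastic Similarity Filter idea: when consecutive
    frames are highly similar, generation is usually skipped (saving GPU
    work), but the stochastic skip still lets occasional updates through.
    """
    a = prev_frame.astype(np.float32).ravel()
    b = curr_frame.astype(np.float32).ravel()
    cos_sim = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    # Skip probability grows with similarity; frames that differ
    # substantially are always generated. (Crude mapping; the paper
    # defines its own schedule.)
    skip_prob = max(0.0, cos_sim)
    return rng.random() >= skip_prob
```

In a real pipeline this check sits in front of the denoising call, so a mostly static webcam feed stops burning GPU cycles on near-duplicate outputs.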
acheong08, over 1 year ago
This feels unreal. It feels like a decade passed within a year.
smusamashah, over 1 year ago
I just tried the realtime-txt2img demo (it uses npm for the frontend, which I think is overkill for this). I modified it to produce only 1 image instead of 16. It works well on a laptop with an RTX 3080, at roughly 2 images/sec.

EDIT: The `examples\screen` demo feels almost realtime. It says 4 fps in the window, but I don't know what that represents.

EDIT: Denoising in img2img is very low, though, which means the returned image is only slightly different from the base image.
modeless, over 1 year ago
Does 100 fps mean I can provide a new input every 10 ms and get a new output every 10 ms? Or do inputs need to be batched together to get that average throughput?
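One way throughput and latency can diverge here, assuming a Stream Batch-style design where frames at different denoising steps share one batched U-Net call (the numbers below are illustrative, not measurements from the paper):

```python
# Toy latency/throughput model for a pipelined (stream-batched) denoiser.
# Assumed numbers for illustration only.
steps = 4             # denoising steps each frame must pass through
batch_time_ms = 10.0  # wall time of one batched U-Net call

# One new frame enters and one finished frame exits per batched call,
# so throughput is one frame per call...
throughput_fps = 1000.0 / batch_time_ms  # 100 fps

# ...but each frame rides through `steps` calls before it is finished,
# so per-input latency is larger than 1 / throughput.
latency_ms = steps * batch_time_ms  # 40 ms

print(f"throughput: {throughput_fps:.0f} fps, per-frame latency: {latency_ms:.0f} ms")
```

Under these assumptions you can indeed feed a new input every 10 ms, but any given input's output appears only after the full pipeline depth has elapsed.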
kristopolous, over 1 year ago
This more or less just worked as documented. Most of these demos tend to blow up and give really wonky deep errors.

Good job. Give it a try. Look into the server.py of realtime-txt2img to change the model if you want to generate something other than anime. Pointing it at, say, https://huggingface.co/runwayml/stable-diffusion-v1-5 works fine.

The results are genuinely fast. Not great, but fast. If you change to SDXL via LCM-LoRA (https://huggingface.co/latent-consistency) you may get better output, but that's when it's going to get difficult and you'll start to run into those mysterious crashes I mentioned that require, you know, actual work.

My setup: 4090 / 3990X / CUDA 12.2 / Debian sid. YMMV.
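The model swap described above generally comes down to changing the checkpoint ID the server passes to its diffusers pipeline. A minimal sketch, assuming a diffusers-based loader; the actual structure of the repo's server.py is not reproduced here:

```python
# Sketch: the realtime-txt2img server builds a diffusers pipeline from a
# checkpoint ID; swapping the model is usually just changing that ID.
# (Loader structure is an assumption, not copied from the repo.)
MODEL_ID = "runwayml/stable-diffusion-v1-5"  # checkpoint suggested in the comment

def build_pipeline(model_id: str = MODEL_ID):
    """Load a text-to-image pipeline for `model_id` (requires a CUDA GPU)."""
    import torch  # imported lazily so this module loads without a GPU stack
    from diffusers import AutoPipelineForText2Image

    return AutoPipelineForText2Image.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")

# Usage (on a machine with a GPU):
#   pipe = build_pipeline()
#   image = pipe("a watercolor fox", num_inference_steps=20).images[0]
#   image.save("out.png")
```

Any checkpoint with a compatible architecture can be dropped in the same way; moving to SDXL-class models, as the comment notes, tends to need further pipeline changes.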
ilaksh, over 1 year ago
How does the demo with the girl moving in and out of frame work? Is it ControlNet?
_joel, over 1 year ago
Maybe we're all living in a simulation^H^H^H^H^H pipeline-level solution for real-time interactive generation.
brcmthrowaway, over 1 year ago
What is the fps on Apple Silicon?
timexironman, over 1 year ago
Is there a video of it I can view anywhere?
badloginagain, over 1 year ago
Yo, I just heard about MidJourney this year.

And this appears to be a local-runtime Stable Diffusion streaming library?

Bruh.
programjames, over 1 year ago
This paper is horribly written. It's like the authors are trying to sell me on themselves as researchers instead of helping me understand their research (y'know, the entire reason journals got started?). An entire section for "stream batching" was just too much, and none of their ideas were innovative or unique. It was incredibly dense simply because it's obfuscated, which makes me believe the authors themselves don't really understand what they're doing.

The results aren't even very good. They claim a 60x speedup, but compared to what? HuggingFace's Diffusers AutoPipeline... a company notorious for buggy code and inefficient pipelines. And that's for *naively* running the pipeline on every image. Give me a break.