Interesting to think about where the cost will go in a few years.<p>I remember in college intro to CS class back in 1998, where I heard the story of building the first computer that could perform at 1 TFLOPS[1]. It cost $46 million and took up 1600 square feet. Now a $600 Mac Mini will do double that.<p>[1] <a href="https://en.wikipedia.org/wiki/ASCI_Red" rel="nofollow">https://en.wikipedia.org/wiki/ASCI_Red</a>
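The cost-per-TFLOPS drop is staggering if you do the division (taking the Mac Mini's ~2 TFLOPS claim above at face value, which is admittedly an apples-to-oranges comparison of very different machines and precisions):

```python
# Cost per TFLOPS, then vs. now, using the figures in the comment above.
asci_red_cost = 46_000_000   # USD, 1997, ~1 TFLOPS
mac_mini_cost = 600          # USD, ~2 TFLOPS (claimed)

asci_red_per_tflops = asci_red_cost / 1    # $46M per TFLOPS
mac_mini_per_tflops = mac_mini_cost / 2    # $300 per TFLOPS

improvement = asci_red_per_tflops / mac_mini_per_tflops
print(f"~{improvement:,.0f}x cheaper per TFLOPS")
# -> ~153,333x cheaper per TFLOPS
```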
Is this just an ad for a service?<p>They didn’t make anything.<p>This is just speculative benchmarking.<p>I am deeply not interested in multiplying the numbers on your pricing sheet by the estimated numbers on the stable diffusion model card.<p>I have zero interest in your (certainly excellent) Proprietary Special Sauce (TM) that makes spending money on your service a good idea.<p>This just reads as spam that got past the spam filter.<p>Did you <i>actually</i> train a diffusion model?<p>Are you going to release the model file?<p>Where is the <i>actual code</i> someone could use to replicate your results?<p>Given the lack of example outputs, I guess not.
> *256 A100 throughput was extrapolated using the other throughput measurements.<p>It seems worth noting that the $160k scenario wasn't actually measured.
Glad to see this - you can even get reasonable-ish results on lower-res images with ~2 hours of training time on a P100 GPU. See my attempt here: <a href="https://www.kaggle.com/code/apapiu/train-latent-diffusion-in-keras-from-scratch" rel="nofollow">https://www.kaggle.com/code/apapiu/train-latent-diffusion-in...</a>
Still pretty pricey for the average person, but these costs will trend cheaper, which is why I think it's futile to "regulate" AI. Someone somewhere will train models on anything visible to the public, licensed or not. It feels like Pandora's box has been opened and we need to deal with it.
5 bucks says that within a year there'll be some innovation that shrinks this by two orders of magnitude, either from much cheaper compute (e.g. OPUs) or from much more efficient training. Hell, there ought to be some way to leapfrog these innovations so that the huge model of yesteryear becomes a more powerful optimizer/loss function itself. That'd just about solve the "hands off my unique shapes!" problem of acceptable training-data trawling too :)
Note that this doesn't take into account the numerous iterations required to dial in the correct hyperparameters and model architecture, which could easily increase the cost 5-10x.<p>> 256 A100 throughput was extrapolated using the other throughput measurements<p>Is it an indictment of their service that they couldn't afford 256 GPUs on their own cloud?
Data truly is the new oil. When it’s all done the compute costs and code will be cheap or free. There’s a lot hinging on how we interpret copyright laws or what kind of data rights laws we enact.
This task requires a bit more work than I'd want, but I'd also point out that $100k can buy ~9 A100s, which are good for ~7k hours of work a month (through not entirely reputable channels, so there's a chance some might die early or have to be returned). That might not train Stable Diffusion fast enough for you (~50k hours estimated training time), but it's still damned impressive. And you get to keep the hardware.<p>I wonder if AMD is as over-the-top brutal as Nvidia with legal control over where its GPUs can be used. Factoring in energy costs, you might still want to stick with the A100s anyway, but $100k buys quite a lot of RX 7900s (if you can find them).
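For fun, the back-of-the-envelope math (all of these figures are my own rough estimates, not measured numbers):

```python
# Sketch of the buy-your-own-A100s estimate above.
# Assumptions: 9 GPUs for ~$100k, running flat out 24/7,
# and ~50k A100-hours needed for a Stable Diffusion run.
num_gpus = 9
hours_per_month = 24 * 30                        # one GPU, full utilization
fleet_hours_per_month = num_gpus * hours_per_month  # 6480, i.e. "~7k"

training_hours_needed = 50_000                   # rough estimate
months = training_hours_needed / fleet_hours_per_month
print(f"{fleet_hours_per_month} GPU-hours/month -> ~{months:.1f} months")
# -> 6480 GPU-hours/month -> ~7.7 months
```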
It's interesting to compare the cost of cloud GPUs vs. buying the hardware outright. At ~$10,000 per Nvidia A100 GPU, it seems like this cloud provider would break even on the hardware after about 5 months at these rates. There are certainly other costs involved (racking, power, etc.), but that's not too bad. I'm almost surprised Nvidia doesn't cannibalize its hardware sales by running its own cloud.
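Rough break-even math (the hourly rate is my assumption, roughly what an A100 rents for; the $10k GPU price is from above):

```python
# Cloud-vs-buy break-even sketch. Both numbers are approximations:
# ~$10k per A100, and an assumed ~$2.75/hr on-demand rate.
gpu_price = 10_000      # USD per A100
hourly_rate = 2.75      # USD/hr, assumed cloud rate per A100

break_even_hours = gpu_price / hourly_rate
break_even_months = break_even_hours / (24 * 30)
print(f"~{break_even_hours:.0f} hours (~{break_even_months:.1f} months)")
# -> ~3636 hours (~5.1 months)
```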
I've downloaded anime models for free. I'm sure they were <$160 without the k. <a href="https://github.com/Noah670/stablediffusionAnime">https://github.com/Noah670/stablediffusionAnime</a>