The L40S has 48GB of RAM; curious how they're able to run Llama 3.1 70B on it. The weights alone would exceed that. Maybe they mean quantized/fp8?

I just had to implement GPU clustering in my inference stack to support Llama 3.1 70B, and even then I needed 2x A100 80GB SXMs.

I was initially running my inference servers on fly.io because they were so easy to get started with, but I eventually moved elsewhere because the prices were so high. I pointed out to someone there who e-mailed me that it was really expensive vs. others, and they basically just waved me away.

For reference, you can get an A100 SXM 80GB spot instance on Google Cloud right now for $2.04/hr ($5.07 regular).
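
The quantization guess is easy to sanity-check with a back-of-envelope calculation. A minimal sketch in Python, assuming roughly 70.6B parameters and counting weights only (KV cache, activations, and runtime overhead ignored):

    # Rough weight-memory footprint for Llama 3.1 70B at common precisions.
    # Assumes ~70.6B parameters; ignores KV cache and activation memory.
    PARAMS = 70.6e9

    for name, bytes_per_param in [("fp16/bf16", 2.0), ("fp8/int8", 1.0), ("int4", 0.5)]:
        gib = PARAMS * bytes_per_param / 1024**3
        verdict = "fits in" if gib < 48 else "exceeds"
        print(f"{name:>10}: {gib:6.1f} GiB of weights -> {verdict} a 48 GiB card")

On those numbers, even fp8 weights alone exceed a single 48 GiB card, while a 4-bit quantization fits with headroom for the KV cache.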

> You can run DOOM Eternal, building the Stadia that Google couldn’t pull off, because the L40S hasn’t forgotten that it’s a graphics GPU.

Savage.

I wonder if we’ll see a resurgence of cloud game streaming.

I hadn’t even heard of the L40S until I started renting to get more memory for small training jobs. I didn’t benchmark it, but it seemed pretty fast for a PCIe card.

Amazon’s g6 instances are L4-based with 24GB of VRAM, half the capacity of the L40S, with SageMaker on-demand prices at a comparable rate. Vast.ai is cheaper, though it’s more of a bidding model and availability varies.

Not as fast as the L40S, but Runpod.io has the A40 48GB at a $0.28/hr spot price, so if it’s mainly VRAM you need, it’s a much cheaper option. Vast.ai has it for the same price as well.