Someone also got the full Q8 R1 running on a $6K PC without a GPU: 2x EPYC with 768GB of DDR5 RAM, doing 6-8 tok/s [1].

It will be interesting to compare value/performance against the next-gen M4 Ultra (or Extreme?) and NVIDIA's new DIGITS [2] when they're released.

[1] https://x.com/carrigmat/status/1884244369907278106

[2] https://www.nvidia.com/en-us/project-digits/
Check out the power draw metrics. Going by the CPU+GPU power consumption, it averaged about 22 W for roughly a minute. Unless I'm missing something, the inference for this example consumed at most ~0.0004 kWh.

That's almost nothing. If these models are capable/functional enough for most day-to-day uses, then useful LLM-based GenAI is already at the "too cheap to meter" stage.
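For anyone who wants to sanity-check the arithmetic, here's a quick back-of-the-envelope in Python (the ~22 W and ~60 s figures are just my reading of the metrics above):

    # Rough per-inference energy, assuming ~22 W average draw for ~60 s
    avg_watts = 22.0
    duration_s = 60.0
    wh = avg_watts * duration_s / 3600.0   # watt-seconds -> watt-hours
    kwh = wh / 1000.0
    print(f"{kwh:.5f} kWh")                # ~0.00037 kWh, i.e. ~.0004 kWh
    # At an illustrative $0.15/kWh, that's about $0.000055 per run:
    print(f"${kwh * 0.15:.6f} per inference")

So even at retail electricity prices, each run costs well under a hundredth of a cent.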
I am amazed mlx-lm/mlx.distributed works that well on prosumer hardware.

I don't think they specified what they were using for networking, but it was probably Thunderbolt/USB4 networking, which can reach 40 Gbps.
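If anyone wants to poke at the distributed side, here's a minimal sketch of an all-reduce with MLX's distributed API; the launch setup (e.g. mpirun over a Thunderbolt bridge between the Macs) is my assumption, not something they confirmed:

    # Minimal MLX distributed all-reduce sketch (assumed setup: run as
    # one process per Mac, e.g. via mpirun over a Thunderbolt bridge)
    import mlx.core as mx

    world = mx.distributed.init()        # join the process group
    x = mx.ones((4,)) * world.rank()     # each node contributes its rank
    total = mx.distributed.all_sum(x)    # sum the tensor across all nodes
    mx.eval(total)                       # force evaluation (MLX is lazy)
    print(world.rank(), total)

Real tensor-parallel inference shards the weights and all-reduces activations the same way, just at every layer, which is why the 40 Gbps link matters.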
This is amazing!! What kinds of applications are you considering for this? Apart from saving on variable costs, extensive fine-tuning, and security... I'm curious to evaluate this from a financial perspective, since variable costs can be daunting, though not too daunting "yet".

I'm hoping NVIDIA releases their new consumer computer soon!
Fascinating to read the thinking process on a flush vs. a straight in poker. It's circular nonsense that isn't grounded in reason at all; it's grounded in factual memory of the rules of poker, repeated over and over as the model doubts itself and double-checks. What nonsense!

How many additional nuclear power plants will need to be built because even these incredible technical achievements are, under the hood, morons? XD