Core to core latency data on large systems

94 points by nuriaion over 1 year ago

7 comments

bee_rider over 1 year ago
The NUMA nature of recent* chips has made me wonder if there’s ever going to be a movement to start using message passing libraries (like MPI) on shared memory machines.

* actually, not even that recent, Zen planted this hope in my brain.
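As a rough illustration of the message-passing style on a single shared-memory machine, here is a minimal sketch in which two threads bounce a counter back and forth using only queues. It uses Python's standard `queue` and `threading` modules as a stand-in for MPI; the function name `ping_pong` is hypothetical, and the sketch shows the programming model only, not NUMA-aware placement or realistic latency.

```python
import queue
import threading


def ping_pong(rounds: int) -> int:
    """Bounce a counter between two threads using only message queues."""
    to_worker: queue.Queue = queue.Queue()
    to_main: queue.Queue = queue.Queue()

    def worker() -> None:
        # The worker shares no variables with the main thread;
        # it only receives a message, increments it, and replies.
        while True:
            value = to_worker.get()
            if value is None:  # sentinel message: stop the worker
                return
            to_main.put(value + 1)

    thread = threading.Thread(target=worker)
    thread.start()

    value = 0
    for _ in range(rounds):
        to_worker.put(value)        # "send" to the other side
        value = to_main.get() + 1   # "receive" the reply, then increment
    to_worker.put(None)
    thread.join()
    return value


if __name__ == "__main__":
    # Each round increments the counter twice (once per side).
    print(ping_pong(3))
```

Each side owns its own state and communicates only through explicit sends and receives, which is the property that would let the same structure map onto separate NUMA domains.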
jauntywundrkind over 1 year ago
It'll be interesting to see how CXL shakes out. It might end up being not much more than cross socket access! 150ns to go between sockets is about what we see here & is in the realm of what CXL had been promising.

Having a super short lightweight protocol like CXL.mem to talk over such fast fabric has so much killer potential.

These graphs are always such a delight to see. Each one is a network map of how well connected the cores are, and they reveal so many particular advantages and disadvantages of the greater systems architecture.
hinkley over 1 year ago
I was misreading these charts for too long. Maybe I still am.

Am I seeing that none of these processors implement a toroidal communication path? I thought that was considered basic cluster topology these days so I’m surprised that multi core chips don’t implement it.
formerly_proven over 1 year ago
It's almost poetic to have those mid-1990s Pentiums there, with about 2-3x the inter-socket latency of the current state-of-the-art, 30 years later.
undersuit over 1 year ago
I like the end of the article.

> If Pentium could run at 3 GHz and the FSB got a proportional clock speed increase, core to core latency would be just over 20 ns.

Ran the test against my closest equivalent.

    CPU: Intel(R) Celeron(R) G5905T CPU @ 3.30GHz
    Num cores: 2
    Num iterations per samples: 5000
    Num samples: 300

    1) CAS latency on a single shared cache line

               0       1
          0
          1   25±0

        Min latency: 25.3ns ±0.2 cores: (1,0)
        Max latency: 25.3ns ±0.2 cores: (1,0)
       Mean latency: 25.3ns

Just wish I had a dual socket Pentium for the last 40 years.
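To put the two figures on the same footing, the article's "proportional clock speed" assumption can be applied to the Celeron result above. This is a minimal sketch: the helper name `scale_latency_ns` is hypothetical, and it assumes latency scales inversely with clock frequency, which is a simplification (real core-to-core latency also depends on the interconnect and cache design).

```python
def scale_latency_ns(latency_ns: float, from_ghz: float, to_ghz: float) -> float:
    """Rescale a latency assuming it is inversely proportional to clock speed.

    Assumption: the same number of clock cycles is needed at both
    frequencies, so time scales by from_ghz / to_ghz.
    """
    return latency_ns * from_ghz / to_ghz


if __name__ == "__main__":
    # Measured value from the comment: 25.3 ns on a 3.30 GHz Celeron G5905T.
    celeron_at_3ghz = scale_latency_ns(25.3, 3.30, 3.0)
    print(f"{celeron_at_3ghz:.1f} ns")  # prints "27.8 ns"
```

Under that assumption the Celeron would sit at roughly 27.8 ns at 3 GHz, in the same range as the "just over 20 ns" the article projects for a hypothetical 3 GHz Pentium.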
nwmcsween over 1 year ago
If I'm reading this right, socket-to-socket latency hasn't really improved much in a long time. Why?
gpderetta over 1 year ago
Very interesting. Now do bandwidth next!