I think Naples is a very exciting development, because:<p>- 1S/2S is obviously where the pie is. Few servers are 4S.<p>- 8 DDR4 channels per socket is twice the memory bandwidth of LGA 2011, and still more than LGA-36712312whateverthenumberwas<p>- First x86 server platform with SHA1/2 acceleration<p>- 128 PCIe lanes in a 1S system is unprecedented<p>All in all, Naples seems like a very interesting platform for throughput-intensive applications. It seems Sun with its Niagara approach (massive number of threads, lots of I/O on-chip) was just a few years too early (and likely a few thousand dollars per system too expensive ;)
This is what I have really been looking forward to. I theorycrafted a more ideal system for the genetics work a former employer was doing, but didn't get to build it until after I had left: a quad 16-core Opteron system for a total of 64 cores (for physics calculations in COMSOL). I think there is more potential use for high actual-core-count servers than many people realize, so I can't wait to build one. (My purpose these days is a game server in a colo; one of my projects is a multiplayer UE4 game.)<p>At the previous job where I built the 64-core system, I even emailed the AMD marketing department to see if we could do some PR campaign together, but I think it was too soon before the Naples drop, because I never got a response. Here's to hoping Supermicro does a 4-CPU board for this... 128 cores would be amazing. (But I'll take 64 Naples cores as long as it gets rid of the bugs and issues I found with the Opterons.)
I'm looking forward to the benchmarks, since the performance per watt of the desktop parts (Ryzen R7) seems to be really good. Quite curious how it will compare against Skylake-EP.<p>A quote from an AnandTech forum post [0] reads promising:<p>"850 points in Cinebench 15 at 30W is quite telling. Or not telling, but absolutely massive. Zeppelin can reach absolutely monstrous and unseen levels of efficiency, as long as it operates within its ideal frequency range."<p>A comparison against a Xeon D at 30W would be interesting.<p>The possibility of this monster maybe coming out sometime in the future is also quite nice: <a href="http://www.computermachines.org/joe/publications/pdfs/hpca2017_exascale_apu.pdf" rel="nofollow">http://www.computermachines.org/joe/publications/pdfs/hpca20...</a><p>[0] <a href="https://forums.anandtech.com/threads/ryzen-strictly-technical.2500572/" rel="nofollow">https://forums.anandtech.com/threads/ryzen-strictly-technica...</a>
The important thing here, from my perspective, is how NUMA-ish a single-socket configuration will be. According to the article, a single package is actually made up of 4 dies, each with its own memory controllers (and presumably cache hierarchy, etc.). While trivially parallelizable workloads (like HPC benchmarks) scale quite well regardless of system topology, not all workloads do. And teaching kernel schedulers about 2 levels of NUMA affinity may not be trivial.<p>With that said, I'm looking forward to these systems.
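To make the 2-level NUMA concern concrete, here's a toy model of average memory latency under different scheduler placement policies. The latency numbers are placeholders I made up for illustration, not measured Naples figures:

```python
# Toy model of average memory latency on a 2-level NUMA system.
# The nanosecond figures below are assumed placeholders, not real data.

LATENCY_NS = {
    "local_die": 90,        # assumed: access via the die's own channels
    "remote_die": 140,      # assumed: another die on the same package
    "remote_socket": 200,   # assumed: a die on the other socket
}

def effective_latency(access_mix):
    """access_mix maps NUMA level -> fraction of accesses (sums to 1)."""
    return sum(LATENCY_NS[level] * frac for level, frac in access_mix.items())

# A scheduler that keeps 80% of accesses die-local vs. one that doesn't:
good = effective_latency({"local_die": 0.8, "remote_die": 0.15, "remote_socket": 0.05})
bad = effective_latency({"local_die": 0.4, "remote_die": 0.4, "remote_socket": 0.2})
print(good, bad)  # the placement-aware mix wins even with these mild penalties
```

Even with mild per-hop penalties, a placement-unaware scheduler pays a noticeable average cost, which is why the second NUMA level matters.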
1. Most of the benchmarks are not even compiled or made with Zen optimization in mind, but the results are already promising, or even surprising.<p>2. Compared to the desktop / Windows ecosystem, there is much more open-source software on the server side, along with the usual open-source compilers. This means any AMD Zen optimization will be far easier to deploy than for desktop games and apps coded and compiled with Intel / ICC.<p>3. The sweet spot for server memory is still 16GB DIMMs. 256GB of memory for your caching needs or in-memory database will now be much cheaper.<p>4. When are we going to get much cheaper 128GB DIMMs? With 2TB of memory per socket, 4TB per U, and 128 lanes for NVMe SSD storage, the definition of Big Data just grew a little bigger.<p>5. Between now and 2020, the roadmap has Zen+ and 7nm, along with PCIe 4.0. I am very excited!
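For anyone checking the capacity math in points 3 and 4, here's the back-of-envelope version, assuming the common 2-DIMMs-per-channel layout (16 slots per socket):

```python
# Back-of-envelope DIMM math for an 8-channel socket.
CHANNELS_PER_SOCKET = 8
DIMMS_PER_CHANNEL = 2          # assumed: 2 DIMMs per channel, 16 slots total

def memory_per_socket_gb(dimm_gb):
    return CHANNELS_PER_SOCKET * DIMMS_PER_CHANNEL * dimm_gb

print(memory_per_socket_gb(16))   # 256 GB with today's sweet-spot 16GB DIMMs
print(memory_per_socket_gb(128))  # 2048 GB, i.e. 2TB, once 128GB DIMMs get cheap
```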
In previous threads there was discussion about Intel processors, specifically Skylake-based parts, being superior for server workloads involving vectorization.<p>How will Naples fare on this front?
I've long been advocating for a high-I/O CPU with plenty of PCIe lanes. 128 lanes will support 8 GPUs at full x16 bandwidth. AMD has positioned itself well.
How well does, say, Postgres scale on such hardware? Is anything more than 8 cores overkill, or can we assume good linear increases in queries per second?
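It's usually not linear. As a rough way to think about it, here's a toy sketch using the Universal Scalability Law, where contention and cross-core coherency costs eat into the speedup. The sigma/kappa values are invented for illustration; real answers need pgbench runs on real hardware:

```python
# Toy Universal Scalability Law sketch: relative throughput vs. core count.
# sigma (contention) and kappa (coherency cost) are assumed values,
# not measured Postgres numbers.

def usl_speedup(n, sigma=0.05, kappa=0.001):
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))

for cores in (1, 8, 16, 32, 64):
    print(cores, round(usl_speedup(cores), 1))
```

With these (made-up) parameters throughput peaks somewhere around 32 cores and then declines, which is the classic pattern for lock- and coherency-bound workloads.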
If they have a much better performance/$ than Intel, which they likely will have, it sounds like a good opportunity for AWS to significantly undercut Microsoft and Google (which recently bragged about purchasing expensive Skylake-E chips).
This is the first I'm reading about the 32 cores being 4 dies on a package. I'm not sure how well that will work out in practice. IBM does something similar with Power servers, where 2 dies on a package are used for lower-end chips.<p>Basically, using multiple dies increases latency significantly between cores on different dies. This will affect performance. I will not judge till I see the benchmarks, though :-)
I think Naples will be a very serious threat to Intel in the server market.
As Ryzen benchmarks &amp; reviews have shown, Zen really shines in heavily multithreaded applications, the typical workload of a server.<p>Though I am kind of worried concerning memory access. Latency penalties when accessing non-local memory are very high on Zen CPUs due to the multi-die architecture.<p>Does that mean we will finally see some serious interest in shared-nothing designs and the like in the future?
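For anyone unfamiliar with the idea, here's a minimal shared-nothing sketch in Python: each worker process owns its own data partition and only small aggregated results cross process boundaries, so no worker touches another worker's (possibly remote-die) memory. This is just a structural illustration, not a tuned NUMA-aware implementation:

```python
# Minimal shared-nothing sketch: each worker process owns its own shard.
# Process-local allocations tend to land on the NUMA node the process
# runs on, so this pattern naturally avoids cross-die traffic.
from multiprocessing import Pool

def worker(partition):
    # All state here is local to this process; only the sum is returned.
    return sum(x * x for x in partition)

def shared_nothing_sum_of_squares(data, n_workers=4):
    shards = [data[i::n_workers] for i in range(n_workers)]
    with Pool(n_workers) as pool:
        return sum(pool.map(worker, shards))

if __name__ == "__main__":
    print(shared_nothing_sum_of_squares(list(range(1000))))
```

In a real deployment you'd additionally pin each worker to a die (e.g. with numactl or CPU affinity) so the "local by default" assumption actually holds.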
This is a multi-chip module (MCM). Are the high core-count Xeons now all single-die? It will be interesting to see what impact the MCM approach has on benchmarks, as I suppose it could have a latency impact in certain use cases.
Can anyone chime in as to why PCIe is used over something more direct, core to core? As I understand it, the CPU still needs to talk to a PCIe host/bridge controller. Why not have a more direct link between processors?