In the '90s I was an architect on Intel's Willamette (Pentium 4) thermal throttle (TT1). TT1 "knocked the teeth" out of clock cycles if the checker-retirement unit (CRU, the hottest part of the die) got too hot. This evolved into TT2/Geyserville, where you move up and down the V/F curve to actively stay under the throttle limit. We were browbeaten by upper management to prove this would not visibly impact performance, and I worked on one of the MANY, MANY software simulators written throughout the company to do just that. (It was actually my favourite job there.) This is when the term "Thermal Design Power" arrived: coined by top marketing brass to avoid saying "Max Power", which was far higher. It is possible to have almost a 2x difference between max power (running a "power virus", which Intel was terrified of, from chipsets to graphics to CPUs) and what typical apps use (thermal design power). Performance was a bit dodgy on a few apps, but not significant compared to run-to-run variation. (Remember this is 1995-1997, after the half-arsed Pentium fiasco in 1993 when Motorola openly mocked Intel for having a 16W CPU... FDIV wasn't a thermal fiasco, but it was a proper cock-up.)<p>Die are sorted based on something called a bin split: die are binned immediately after wafer sort based on their leakage. There are special transistors implanted near the scribe lines that indicate tons of characteristics, as well as DFX units throughout the die (rings of 20 inverters that oscillate) that also indicate tons of data on how the die behaves. However, testing those buggered DFX circuits takes an enormous amount of time, and you can't slow down wafer sort, so there are proxies.<p>The bins are designed to maximize profit and performance based on the die characteristics. Thermal throttle plays a role in this, and each bin (among various vectors) is allowed some tolerance, which is exactly what OP has discovered. However, this has been going on for coming up on 30 years! So nothing really new here; I just thought I'd let you know that of course Intel is aware of this, and they never claim performance numbers outside of the tolerance allowed for thermal throttle.
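For the curious, the difference between the two schemes is easy to see in a toy simulation. Every constant below is invented for illustration; this is nothing like the real control loop, just the shape of the idea:

    # Toy model of TT1 vs TT2; all numbers are made up for illustration.
    AMBIENT, LIMIT, DT = 40.0, 100.0, 0.01   # degrees C, seconds

    def heat(temp, watts):
        """Crude first-order thermal model: power in, leakage out to ambient."""
        return temp + DT * (watts - 0.5 * (temp - AMBIENT))

    def run_tt1(watts=60.0, steps=5000):
        """TT1: gate the clock outright whenever the sensor trips the limit."""
        temp, cycles = AMBIENT, 0.0
        for _ in range(steps):
            gated = temp > LIMIT
            temp = heat(temp, 0.0 if gated else watts)
            if not gated:
                cycles += 1.0
        return cycles / steps                # fraction of cycles delivered

    # (relative frequency, watts) points on a made-up V/F curve
    VF = [(1.0, 60.0), (0.85, 42.0), (0.7, 28.0)]

    def run_tt2(steps=5000):
        """TT2/Geyserville: step down the V/F curve before the trip point,
        step back up once there is thermal headroom."""
        temp, idx, cycles = AMBIENT, 0, 0.0
        for _ in range(steps):
            freq, watts = VF[idx]
            temp = heat(temp, watts)
            cycles += freq
            if temp > LIMIT - 2 and idx < len(VF) - 1:
                idx += 1                     # too hot: shift down the curve
            elif temp < LIMIT - 10 and idx > 0:
                idx -= 1                     # headroom: shift back up
        return cycles / steps                # average relative throughput

    print(f"TT1 throughput: {run_tt1():.2f}, TT2 throughput: {run_tt2():.2f}")

With these (fake) numbers, TT1 duty-cycles down to about half throughput while TT2 settles at the highest V/F point that stays under the limit, which is the whole reason Geyserville was an improvement.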
There seems to be little attempt to ensure the ambient environment of the processors is isothermal. That is, there's likely significant chassis-to-chassis thermal variation that is as big as or bigger than any difference in thermal budget between processors.<p>Even things like how the thermal paste was applied and the roughness of the individual fan ducts can matter, beyond the obvious effects of bottom of rack vs. top of rack, position of the processor within the chassis, etc.<p>tl;dr-- probably most of what is measured is not silicon-to-silicon variation.
Interesting, but the discussion of "an atom here and there" affecting performance doesn't make sense: the manufacturing variations are much larger than an atom or two. This variation is part of the motivation for "binning" processors: testing them and then selling them at different performance levels based on how they turn out.
This isn't new. Processors are often built from the same template across multiple models, and manufacturers "bin" based on quality and just turn off the bad parts. This is why it's more expensive to produce a nicer processor: the yield rates are much lower. There is still significant variation within a model, though, known as the silicon lottery. This is why some chips overclock or undervolt much better than others. There's even one site that sells chips that basically go through extra binning to ensure a better product: <a href="https://siliconlottery.com/" rel="nofollow">https://siliconlottery.com/</a>
Whether or not this article controls properly for temperature, it introduced me to violin plots: <a href="https://en.wikipedia.org/wiki/Violin_plot" rel="nofollow">https://en.wikipedia.org/wiki/Violin_plot</a>
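They're easy to generate, too. A minimal matplotlib sketch with entirely made-up data (three hypothetical chips, repeated benchmark runs each):

    # Violin plot of synthetic "per-chip benchmark" samples.
    # The data is random noise, purely for illustration.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    # pretend each entry is 200 repeated benchmark runs on one chip
    chips = [rng.normal(loc=100 + drift, scale=2.0, size=200)
             for drift in (0.0, 1.5, -1.0)]

    fig, ax = plt.subplots()
    ax.violinplot(chips, showmedians=True)
    ax.set_xticks([1, 2, 3])
    ax.set_xticklabels(["chip A", "chip B", "chip C"])
    ax.set_ylabel("benchmark score")
    plt.show()

The width at each height shows the estimated density of samples there, which is why they're so much more informative than a bare box plot for run-to-run variation.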
I'm surprised that, even at the same frequency, there is still some pretty large variation. I wonder if that's due to other sources of noise that are often ignored by people running benchmarks (e.g. background processes, SMM, ME, etc.)<p><i>e.g., memory, which presumably have their own temperature characteristics</i><p>To my knowledge (and this is based on DDR3 and older), memory frequencies are essentially fixed, because the transceivers on both ends need to sample in the <i>middle</i> of a bit cell, and to do that they need to know the clock period, which must not change once it's known. There's a delay-locked loop (DLL) in the RAM which generates a phase-shifted local reference clock.<p>If the processors could all be locked to one constant frequency (i.e. all the power/performance "dynamic tuning" stuff disabled), that would help show whether there are other sources of noise. This of course also assumes the clock generators are identical.
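On locking the frequency: on Linux you can get most of the way there from userspace. A rough sketch, assuming the cpufreq sysfs interface and the intel_pstate driver (paths and knobs vary by driver, and all of this needs root):

    # Pin every core to one fixed frequency via sysfs.
    import glob

    def pin_frequency(khz: int) -> None:
        for policy in glob.glob("/sys/devices/system/cpu/cpufreq/policy*"):
            # setting min == max forces a single operating point;
            # write max first so max never drops below min when lowering
            with open(f"{policy}/scaling_max_freq", "w") as f:
                f.write(str(khz))
            with open(f"{policy}/scaling_min_freq", "w") as f:
                f.write(str(khz))

    def disable_turbo() -> None:
        # intel_pstate-specific knob; acpi-cpufreq exposes "boost" instead
        with open("/sys/devices/system/cpu/intel_pstate/no_turbo", "w") as f:
            f.write("1")

    if __name__ == "__main__":
        disable_turbo()
        pin_frequency(2_000_000)  # 2.0 GHz, expressed in kHz

Even then you'd want to verify the delivered frequency (e.g. with perf counters), since the firmware can still throttle underneath you.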
Ugh... I had two PCs with the exact same spec (CPU, motherboard, RAM, same product numbers, etc.), all the same versions of Linux and firmware. There was a 15% performance difference between the two machines. Turned out to be some BIOS tuning, not obviously related to performance. 15%...
The variance is likely to be even more noticeable nowadays, as modern (x86 at least) processors do a lot more automatic overclocking based on temperature.