I actually ran into this last year. We ordered about two thousand 8TB HDDs for a Ceph cluster and had an agreement with our server vendor that they would not substitute the HDD model. Well, it seems they ran out, and a small fraction of the drives were of a different, older model. The performance difference was 2.5x for our workload:

Older model: 670 IO/s and 2.9 ms average latency

Newer model: 1680 IO/s and 1.1 ms average latency

We got the vendor to send us new drives and shipped the older model back to them.
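If you want to reproduce numbers like these, something along these lines with fio is a reasonable starting point. A minimal sketch: /dev/sdX is a placeholder, and the block size, queue depth, and runtime are my assumptions, since the original workload isn't specified.

    # 4k random reads against the raw device; adjust iodepth to match your workload
    fio --name=randread --filename=/dev/sdX --direct=1 --rw=randread \
        --bs=4k --iodepth=16 --runtime=60 --time_based --group_reporting

fio reports both IOPS and average/percentile latencies in its summary output.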
> because of course disk vendors quote everything in the smaller SI units).

SI prefixes should be (and are) the rule rather than the exception. It only made sense to measure RAM in base-2 units because it is manufactured according to base-2 addressing.

HDDs have never used base-2 addressing.

Another example is network media. Serialized data transfer speed is measured in {K,M,G,T}b/s (bits) or B/s (bytes), where a gigabit is exactly 1,000,000,000 bits and a gigabyte is exactly 1,000,000,000 bytes.

In the early days of computers, RAM was extremely scarce and expensive. It was such a limitation that it was usually the very first question you asked about a computer's capabilities.

I remember in 5th grade asking a fellow student whose parents had just bought an Atari 800: "How much 'K' does it have?"

This intense focus on RAM may have led people to assume that its odd nomenclature would apply to anything computer related.
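To make the difference concrete: an "8 TB" drive holds exactly 8×10^12 bytes, which an OS reporting in binary units will show as roughly 7.28 TiB. A quick check with bc:

    # 8 TB (SI) expressed in TiB (binary)
    echo "scale=4; 8 * 10^12 / 2^40" | bc    # 7.2759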
Fun fact: the footprint of a magnet (a bit of data) on an HDD is actually rectangular. It's about 3-5x wider than it is long. This favors read/write speed, since you cover more bits per unit of rotational length. Aside from the obvious speed boost, it also makes it easier to stay on track.

One of the huge limitations on bit width is keeping the actuator arm stable to within 15 or so nanometers in the face of air turbulence. That's why they fill the drives with helium now: less aerodynamic drag/turbulence means you can either cram more platters in (which raises the turbulence back to where it originally was) or make narrower tracks, since you have more precision.
230MB, 100MB, or 50MB per second doesn't make that big a difference for anything I'd use an HDD for. What I want is lower pricing. The 12TB drive I bought two years ago hasn't changed in price. The 8TB drive I bought four years ago is $30 more expensive now than it was then.
This is the result of more bits per inch, which follows naturally from going from 4TB-per-drive to 20TB-per-drive.

Yes, I'm skipping a few steps, but that's the fundamental issue at play here. There are more platters, more bits per inch, and more read/write speed at the same 7200 RPM.

It's not a lot, but it's a steady increase as hard drives keep getting denser and denser.
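As a rough sanity check on the density-to-throughput link, here's a toy model. The linear density, outer-track diameter, and RPM are assumed ballpark figures, not specs for any particular drive:

    # sequential throughput ~= bits/inch * pi * diameter * (RPM/60) / 8 bytes/s
    # assuming ~1.8M bits/inch, ~3.6 in outer track, 7200 RPM (120 rev/s):
    echo "1800000 * 3.14159 * 3.6 * 120 / 8 / 10^6" | bc    # ~305 (MB/s)

Inner tracks have a smaller circumference, so throughput falls off toward the end of the disk, which is why sequential speed is usually quoted as a range.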
Is the status quo of drive firmware lying about what's actually been flushed to media still as terrible as it was, oh, twenty years ago?

A friend used to work for Apple and said that one of the reasons they had Apple-specific versions of various mass-market SCSI and IDE drives was that Apple's firmware actually flushed data to disk when told to do so, and didn't lie about whether data was flushed or not.
The other interesting thing about modern data centre HDDs is that they have a media cache which significantly accelerates synchronous random write IOPS. Out of the box these drives get 75 IOPS with fio (fsync=1 bs=4k randwrite), but with WCE=0 they can do 400 IOPS.

So if your app is fsync-heavy (such as Ceph), you can switch on this media cache by setting the drive to write-through mode (WCE=0), as sketched below.

SATA SSDs have a similar quirk.
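For anyone who wants to try it, a minimal sketch, assuming a SAS drive at a hypothetical /dev/sdX (sdparm's --save makes the setting persist across power cycles; on SATA, hdparm -W0 is the rough equivalent):

    # set WCE=0, i.e. write-through mode
    sudo sdparm --clear WCE --save /dev/sdX

    # fsync-heavy random-write benchmark, matching the numbers above
    sudo fio --name=syncwrite --filename=/dev/sdX --direct=1 \
        --rw=randwrite --bs=4k --fsync=1 --runtime=60 --time_based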
I'm never buying another HDD in my life. There are a few tasks that take weeks on an HDD but only hours on an SSD. And the price difference is often negligible.
Using a single Prometheus node with redundant drives rather than redundant nodes with single drives seems like an odd choice to me. Why replicate at the block level when layer 7 already supports it?
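For reference, Prometheus's layer-7 answer is simply running two identical instances that scrape the same targets, each keeping its own full copy of the data. A minimal sketch; the paths are assumptions:

    # run this same invocation (and the same scrape config) on two separate hosts
    prometheus --config.file=/etc/prometheus/prometheus.yml \
               --storage.tsdb.path=/var/lib/prometheus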