I wonder if it's counterproductive to tell <i>every programmer</i> they <i>should know an entire book</i> about SSDs (or DRAM, cf. Drepper), especially when the later chapters tell you that you have no control over the stuff you learned in the earlier chapters due to the FTL and filesystem.
It's a stretch to say every programmer should know about this stuff when most apps aren't I/O limited. For my purposes, I care that SSDs are fast, not about the underlying architecture of the NAND, page sizes, or drive controller. I also know absolutely zilch about the inner workings of the Linux kernel and don't plan on reading a book about it.<p>The nice thing about so much division of interest in the engineering field is that you don't always need to know everything about what's going on under the hood to use a technology to build other cool stuff. Only in specific situations is that knowledge really necessary.
I feel like the "summary of the summary" from the comments is the best.<p><pre><code> Thank you for detailing this stuff. I had a quick read.
For the sake of other readers, if you don’t have time to read Part 6, here is the summary of the summary:
If you are using a decent OS like linux, as opposed to direct hardware access (who other than google ever did that?), all you need to worry about is how to co-locate your data. By “using”, I don’t mean using O_DIRECT, which basically says, “OS, please get out of my way.”
If you do use O_DIRECT, do “man 2 open” and search the man page for “monkey”. If that’s not enough, google “Linus O_DIRECT” and look at the war with Oracle and the like. You could probably also google “Linus Braindead” and you are likely to find a post about O_DIRECT.
If you are one of the few people actually working on the kernel’s disk layer, you probably know all this already, and it is unlikely you will ever be reading this.
As for co-locating data, there is no way to do it without knowing your app inside out. So know your app. For some apps, it can make orders of magnitude difference. That’s your domain. Leave the disk to the kernel. Let it use its cache and its scheduler. They will be better than yours and will benefit the whole system, in case other apps need to get at the data too. You can try to help the kernel by doing vectorized reads/writes, although that’s a bigger deal with spinning disks.
Ata Roboubi</code></pre>
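The vectorized reads/writes mentioned at the end of that summary map to the readv/writev family of syscalls. A minimal sketch, assuming Linux and Python 3.7+ (the function name and layout here are illustrative, not from the original comment):

```python
import os

def read_chunks(path, offset, chunk_size, nchunks):
    """Fill several buffers from one contiguous region with a single
    preadv(2) call: one syscall and one kernel request instead of many."""
    bufs = [bytearray(chunk_size) for _ in range(nchunks)]
    fd = os.open(path, os.O_RDONLY)
    try:
        nread = os.preadv(fd, bufs, offset)  # scatter the read across all buffers
    finally:
        os.close(fd)
    return bufs, nread
```

The same pattern works for writes with os.pwritev. As the comment notes, batching like this mattered more for spinning disks, where every extra request could cost a seek.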
This guide needs a part on "How an SSD fails": what to expect from SSD failures, and especially the soft errors that may occur.<p>I also can't stress enough that SSDs are born for multi-threaded I/O. If you aren't issuing multiple reads and multiple writes, you are doing it wrong. If your data is sequential, it is OK to send it in a single operation. If you can issue multiple commands, however, do that. You'll have a better chance of getting all of them fulfilled at the same time.
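A hedged sketch of the "issue multiple reads at once" advice: os.pread takes an explicit offset and does not touch the shared file position, so independent reads can be issued from a thread pool without locking (the function below is illustrative, not from the comment):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def parallel_reads(path, offsets, length, workers=8):
    """Issue one pread per offset concurrently, so the SSD sees several
    outstanding requests instead of a serial stream."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pread is positional and thread-safe; map() preserves offset order
            return list(pool.map(lambda off: os.pread(fd, length, off), offsets))
    finally:
        os.close(fd)
```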
The main thing is to make sure that there are a lot of commands in the queue; then the drive can take care of most things. SATA's NCQ has a queue depth of 32, NVMe much more. If you benchmark (e.g. with fio) you will see that you get nothing like the rated IOPS at low queue depths. Getting enough stuff queued is quite hard. fio is useful for testing because it offers multiple submission strategies, but in a real application you need quite a lot of threads, or Linux AIO with O_DIRECT.
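A fio job file sketching the queue-depth comparison described above (the filename is a placeholder; point it at a disposable test file or device before running):

```ini
# Contrast queue depth 1 vs 32 on the same target (read-only workload).
[global]
ioengine=libaio
direct=1
rw=randread
bs=4k
runtime=30
time_based
filename=/dev/nvme0n1

# One outstanding I/O: expect far below the drive's rated IOPS.
[qd1]
iodepth=1

# Keep the device queue full: should approach the rated IOPS.
[qd32]
stonewall
iodepth=32
```

The stonewall option makes the second job wait for the first, so the two depths are measured separately rather than competing for the device.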
Does anyone actually do #27, "Over-provisioning is useful for wear leveling and performance"? It's been my experience, mucking with the innards of flash devices, that they already have ~10% more NAND in them than their labeled capacity for exactly this purpose. Over-provisioning seems like bad advice unless you have a very special situation.
I don't think any programmer needs to know more than what this information means practically to a programmer. I don't need the details; I have other things to worry about.<p>SSDs are great because when we compile, we are opening potentially hundreds of files at once. SSDs are great for parallel access and blow away spinning drives.<p>Also, as a programmer I replace my computer every 5 years; more often than that is too much of a time sink. I may replace some parts in those 5 years. For example, I upped the RAM and replaced the hard drive with an SSD on my 2009 MacBook Pro. But in 2015 I purchased a MacBook Pro with 16 GB of RAM and a 1 TB SSD.<p>As a programmer, I want to focus on the programming.<p>Edit: sorry, I read the article but didn't quickly understand that this is a summary chapter of a book written for programmers who actually do have to worry about SSD access vs. normal hard drives. It makes much more sense in this context.