HPC may look like COTS gear, but it's not.<p>BSD doesn't have drivers for InfiniBand and other HPC interconnects. Nor does it have client drivers (let alone a server implementation) for Lustre, which is the distributed filesystem used by most supercomputers.<p>I imagine MPI support on BSD is also likely non-existent.
Then there is the matter of accelerator support, i.e., NVIDIA GPUs and Intel Xeon Phi.<p>That's not to say that some vendor couldn't reasonably build a BSD-based supercomputer; it's just highly unlikely given how much is missing.
Because of one man: Donald Becker. At the beginning of the commodity supercomputer era, Donald did an absolutely amazing job squeezing every last bit of performance out of commodity networking hardware for 'Beowulf'-style clusters.<p>This gave Linux a head start, and the self-reinforcing effects of such a head start did the rest: it made answering the question 'for which OS should we start writing drivers?' for specialty HPC hardware a no-brainer.<p><a href="https://en.wikipedia.org/wiki/Donald_Becker" rel="nofollow">https://en.wikipedia.org/wiki/Donald_Becker</a>
High-energy physicist here, with a regional HPC center on the same floor. My observation is that administrators tend to favor enterprise distributions such as Scientific Linux and SUSE Linux Enterprise Server (SLES), together with commercial MPI implementations such as IBM MPI and Intel MPI.<p>On the other hand, people are used to Linux; in my environment literally everybody has Ubuntu on their notebook and workstation. They know how to run their Python analysis scripts there, and the only thing they have to change when going to the cluster is to adopt an environment management system (such as <a href="http://modules.sourceforge.net/" rel="nofollow">http://modules.sourceforge.net/</a>).<p>(However, I have to admit I have never come into contact with BSD and don't know how its user space differs.)
A comment I found while looking at which Linux OSes are run on supercomputers.<p>"Originally, the top 500 list was populated entirely by proprietary Unix systems from vendors like Cray Research, SGI, etc.<p>In June 1998, the first Linux system entered the top 500 list. By June 2003, Linux systems passed the 25% mark, accounting for 139 of the top 500. By November of 2003, Linux systems comprised over 56% of the top 500. By November 2006, Linux made up more than 75% of the top 500. You get the idea. Over the years, there were a few attempts by Microsoft to get into supercomputing, and there were BSD and Mac systems."<p>Since time on these supercomputers is sold, the operators probably all want to run the same or a similar OS so they can compete in selling that time. Also, if one site has success, everyone else will copy it.<p><a href="https://linux.slashdot.org/story/17/11/14/2223227/all-500-of-the-worlds-top-500-supercomputers-are-running-linux" rel="nofollow">https://linux.slashdot.org/story/17/11/14/2223227/all-500-of...</a><p>Slashdot has a ton of comments discussing BSD vs. Linux on this subject, but I didn't see anything too helpful.<p>My only thought is that large companies like Netflix use BSD for their CDN because, from what I've been told, BSD has the best I/O handling. Why don't they use it for the rest of their infrastructure? Maybe Linux is better at crunching numbers and BSD is better for networking and security? No idea; that's my best guess.
I think, largely, the same reasons apply to Linux vs. BSD in supercomputers as to Linux vs. BSD generally. You might as well ask why Linux and not *BSD is used in Android, on servers generally, or by large, technically knowledgeable organizations such as Google, Amazon, Facebook, etc.<p>So, in no particular order:<p>- Linux came on the scene when the BSDs were mired in legal uncertainty. By the time the legal issues were settled, Linux had already become the default choice for someone wanting a FOSS Unix-style kernel, and the BSDs never caught up.<p>- The GPL meant that improvements were shared rather than squirreled away in various proprietary spin-offs and thus lost when whatever company was behind them folded (generally; exceptions going both ways surely exist!).<p>- Because Linux gained the initial momentum, developers flocked (and keep flocking!) to it, leaving the BSDs ever further behind.<p>- Linux was more welcoming to new contributors, whereas the BSDs were controlled by a small circle of core developers sitting on the commit access. And of course, the BSD way of resolving disagreements was to fork the entire thing, further splitting up the already small developer base.
Some will say better hardware support.
While Linux does have better hardware support, I usually find the difference shows up mainly with more exotic hardware.<p>I think it's simply down to Linux being where the money is.
The big players (IBM, Dell, etc.) are all actively promoting Linux, and trained personnel are also fairly easy to find.
So Linux is "the beast you know".<p>As for FreeBSD, it might be a technically better platform, but it is living in Linux's shadow.<p>Personally, I run FreeBSD for the excellent documentation, stability, and features like ZFS, but nothing I run couldn't just as easily run on Linux.
Because BSD's SMP support has traditionally been pretty terrible compared to Linux's. They still have a SLAB memory allocator (compared with Linux's default of SLUB, which is much better for heavily SMP systems).<p>Many of the HPC vendors (I'm looking at you, Mellanox) primarily develop and certify their products on Linux. While they might work OK on BSD, you're not going to get the full performance and all of the features on a BSD system. If you paid for Mellanox EDR 100G InfiniBand switches and all of the fancy VPI network cards, you want to get the full performance they are capable of. The vendor tells you to use Linux for that, so you use Linux.<p>TL;DR: Linux is what the hardware manufacturers overwhelmingly target and work with. HPC users use what vendors support best.
Because the people who use supercomputers just want to crunch numbers - the operating system is a distraction at best, and Linux is the path of least resistance.
Because the BSDs don't scale in that direction.<p>DragonFly BSD's design shows promise, but it's not anywhere near ready for supercomputers yet.
I wonder if this is another validation of the "worse is better" philosophy as described recently in an HN article <a href="http://minnie.tuhs.org/pipermail/tuhs/2017-May/009935.html" rel="nofollow">http://minnie.tuhs.org/pipermail/tuhs/2017-May/009935.html</a> and also discussed at <a href="https://www.jwz.org/doc/worse-is-better.html" rel="nofollow">https://www.jwz.org/doc/worse-is-better.html</a>
I would guess it is because Linux has wider hardware support than the BSDs. When you're building a supercomputer, it makes sense to use the fastest hardware available, which tends to be new technology.
This might shed some light on your question:
<a href="https://en.wikipedia.org/wiki/Comparison_of_operating_system_kernels" rel="nofollow">https://en.wikipedia.org/wiki/Comparison_of_operating_system...</a>