Pretty cool, but a number of the questions are totally unknowable.<p>For instance, the question about web requests to Google: depending on your internet connection, you've got more than an order of magnitude difference in the outcome.<p>In the question about SSD performance, the only hint we have is that the computer has "an SSD", but a modern PCIe SSD like the one in the new MacBook Pro is over 10 times faster than the SSDs we got just 5 years ago.<p>The question about JSON/msgpack parsing is really just about the implementation. Is the Python msgpack library pure Python, or is the work of the entire unpackb() call done in C?<p>The bcrypt question depends entirely on the number of rounds. The default happens to be 12. Had the default been 4, the answer would have been 1000 hashes a second instead of 3. Is the Python md5 library written in C? If so, the program is indistinguishable from piping data to md5sum from bash. Otherwise it's going to be at least an order of magnitude slower.<p>So I liked these exercises, but I liked the C questions best, because there you can look at the code and figure out how much work the CPU/disk is doing. Questions that can be reduced to "what language is this Python library written in?" aren't as insightful.
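To see how much the cost factor dominates, here's a minimal sketch (assuming the bcrypt PyPI package; the cost is exponential, so each extra round doubles the work):<p><pre><code>import time
import bcrypt

# The cost parameter is exponential: rounds=12 does 2**12
# key-expansion iterations, rounds=4 only 2**4.
for rounds in (4, 12):
    salt = bcrypt.gensalt(rounds=rounds)
    start = time.perf_counter()
    n = 0
    while time.perf_counter() - start < 1.0:
        bcrypt.hashpw(b"hello", salt)
        n += 1
    print("rounds=%d: ~%d hashes/second" % (rounds, n))</code></pre>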
Yes, modern computers are fast. How fast?<p>The speed of light is about 300,000 km/s. That translates to roughly 1 ns per foot (yeah, I mix up my units... I'm Canadian...)<p>THUS, a computer with a clock speed of 2 GHz will be able to execute, on a single core/thread, about 4 (four !) single-clock instructions between the moment photons leave your screen, and the moment they arrive into your eye 2 feet (roughly) later.<p>_That_ should give you an idea of how fast modern computers really are.<p>And I _still_ wait quite a bit when starting up Microsoft Word.
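If you want to check that arithmetic, a quick back-of-the-envelope in Python (the 2 GHz clock and 2-foot viewing distance are the parent's numbers):<p><pre><code>c = 299_792_458            # speed of light, m/s
distance = 2 * 0.3048      # 2 feet, in metres
travel_time = distance / c # ~2 ns for the photons to reach your eye
clock_hz = 2e9             # 2 GHz
print(travel_time * clock_hz)  # ~4 cycles, i.e. ~4 single-clock instructions</code></pre>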
If, like me, you spend most of your time in high-level, garbage collected "scripting" languages, it's really worth spending a little time writing a few simple C applications from scratch. It is <i>astonishing</i> how fast a computer is without the overhead most modern languages bring in.<p>That overhead adds tons of value, certainly. I still use higher level languages most of the time. But it's useful to have a sense of how fast you <i>could</i> make some computation go if you really needed to.
Alternatively, this could be titled "do you know how much your computer <i>could</i> do in a second but isn't because of bad design choices, overengineered bloated systems, and dogmatic adherence to the 'premature optimisation' myth?"<p>Computers are fast, but not if all that speed is wasted.<p>A recent related article: <a href="https://news.ycombinator.com/item?id=13940014" rel="nofollow">https://news.ycombinator.com/item?id=13940014</a>
Be careful what conclusions you attempt to draw from examples when you aren't sure what exactly is happening. These examples are actually very wrong and misleading.<p>Take, for example, the first code snippet about how many loops you can run in 1 second. The OP fails to realize that since the loop isn't producing anything that is actually used, the compiler is free to optimize it out. You can see that that's exactly what it does here: <a href="https://godbolt.org/g/NWa5yZ" rel="nofollow">https://godbolt.org/g/NWa5yZ</a> All it does is call strtol and then exit. It isn't even running a loop.
More impressively, sum.c could likely go an order of magnitude or so faster when optimized.<p>> Friends who do high performance networking say it's possible to get network roundtrips of 250ns (!!!),<p>Well, stuff like InfiniBand is less a network and more like a bus (e.g. RDMA, atomic ops like fetch-and-add or CAS).<p>> write_to_memory.py<p>Is also interesting, because this is dominated by inefficiencies in the API and implementation, and not actually limited by the memory subsystem.<p>> msgpack_parse.py<p>Again, a large chunk goes into inefficiencies, not so much the actual work. This is a common pattern in highly abstracted software. msgpack-c mostly works at >200 MB/s or so (obviously a lot faster if you have lots of RAWs or STRs and little structure). Funnily enough, if you link against it and traverse stuff, a lot of time is spent doing traversals and not the actual unpacking (in some analyses I've seen a ~1/3 - 2/3 split). So the cost of abstraction also bites here.<p>If you toy around with ZeroMQ, you can see that you'll be able to send around 3 million msg/s between threads (PUSH/PULL) from C or C++, around 300k using pyzmq (this factor of 10 is sometimes called "interpreter tax"), but only around 7000 or so if you try to send Python objects using send_pyobj (which uses Pickle). That's a factor of 430.
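Here's a rough sketch of the pyzmq side of that measurement (assuming the pyzmq package; the inproc endpoint name and message count are mine — swap send(b"x") for send_pyobj(...) to watch the Pickle cliff):<p><pre><code>import threading
import time
import zmq

N = 100_000
ctx = zmq.Context()

def consumer():
    pull = ctx.socket(zmq.PULL)
    pull.bind("inproc://bench")  # inproc: bind before connect
    for _ in range(N):
        pull.recv()
    pull.close()

t = threading.Thread(target=consumer)
t.start()
time.sleep(0.1)  # give the PULL socket time to bind

push = ctx.socket(zmq.PUSH)
push.connect("inproc://bench")
start = time.perf_counter()
for _ in range(N):
    push.send(b"x")
t.join()
print("%.0f msg/s" % (N / (time.perf_counter() - start)))
push.close()
ctx.term()</code></pre>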
What an excellent teaching pattern - you're far more likely to remember what you learned if you first stop to think and record your own guess, and this is excellent UI and UX for doing that routinely and inline.
This is awesome. The real lesson here is: when you make a thing, compare its performance to these kinds of expected numbers, and if you're not within the same order of magnitude speed-wise, you've probably screwed up somewhere.<p>My favorite writeups are the ones that gloat about achieving hundreds of pages served per second per server. That's terrible, and hardly anyone today even seems to realize it.
Don't some of these examples run in O(1) time because the value computed in the loop isn't used? E.g., in the first example 0 is returned instead of the sum.<p>Obviously we are talking about real-world C compilers with real-world optimizations, so presumably we'd also have to consider whether the loop is executed at all?
That's nothing. Here's code that does 77 GFLOPS on a single Broadwell x86 core. Yes, that's 77 billion operations per second.<p><a href="http://pastebin.com/hPayhGXP" rel="nofollow">http://pastebin.com/hPayhGXP</a>
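That figure is plausible from first principles; a back-of-the-envelope sketch (assuming single-precision AVX2 FMA and a ~2.4 GHz sustained clock, both assumptions on my part):<p><pre><code>fma_ports = 2       # Broadwell can issue 2 FMAs per cycle
lanes = 8           # 8 single-precision floats per 256-bit AVX register
flops_per_fma = 2   # a fused multiply-add counts as two FLOPs
clock_ghz = 2.4     # assumed sustained clock speed
print(fma_ports * lanes * flops_per_fma * clock_ghz, "GFLOPS peak")  # 76.8</code></pre>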
This reminds me of "Latency Numbers Every Programmer Should Know"<p><a href="https://gist.github.com/jboner/2841832" rel="nofollow">https://gist.github.com/jboner/2841832</a><p>Edit: Just realized halfway through that there's already a link to this from their page!
Hard to believe there are 124 comments here and nobody has brought up Grace Hopper's talk[0][1] yet. With good humour she gives an example of various devices' latencies, and a simple tool for comprehending the cost and the orders of magnitude involved.<p><pre><code> [0] short - https://www.youtube.com/watch?v=JEpsKnWZrJ8
[1] long - https://www.youtube.com/watch?v=ZR0ujwlvbkQ</code></pre>
I'm curious to see the data collected on guesses. Some were quite difficult to guess, like hashes per second with bcrypt without knowing the cost factor, but I guess we can assume some sane default.<p>I would have really liked to see all these numbers in C, and in other languages for that matter. Perhaps add a dropdown box to select the language from a handful of options?
One second on what?<p>A Core i7? A Raspberry Pi? A weird octo-core dual-speed ODROID? An old i915-based Celeron? My cell phone? An Arduino?<p>"Your computer" has meant all the above to me, just in the last few weeks. The author's disinclination to describe the kind of hardware this code is running on -- other than "a new laptop" -- strikes me as kind of odd.
This reminds me of this email from LuaJIT's list:

Computers are fast, or, a moment of appreciation for LuaJIT
<a href="https://groups.google.com/forum/#!msg/snabb-devel/otVxZOj9dLA/rgCojUohBGMJ" rel="nofollow">https://groups.google.com/forum/#!msg/snabb-devel/otVxZOj9dL...</a>
Brilliant! I'd like to see those numbers summarized somewhere though, a bit like the latency numbers every programmer should know: <a href="https://gist.github.com/jboner/2841832" rel="nofollow">https://gist.github.com/jboner/2841832</a> (visual: <a href="https://i.imgur.com/k0t1e.png" rel="nofollow">https://i.imgur.com/k0t1e.png</a>)
I've come across this "article" before; I feel like I remember it under a different title, like "language speed differences" or something. Or maybe that's another article by the same author/site/format.
The grep example should search for a single character. Grep can skip bytes (Boyer-Moore-style), so longer search strings are actually faster to search for. On my machine, I see 22%-35% more time taken if I change "grep blah" to "grep b".
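Easy to reproduce; a minimal sketch (assuming some large text file, here hypothetically named words.txt):<p><pre><code>import subprocess
import time

def time_grep(pattern, path="words.txt", runs=5):
    # Take the best of several runs to reduce noise.
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(["grep", "-c", pattern, path],
                       stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best

print('grep "b":    %.3fs' % time_grep("b"))
print('grep "blah": %.3fs' % time_grep("blah"))</code></pre>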
Or "how fast can one of my 8 CPU cores run a for loop?" To put that in perspective: all 8 cores together give me about 40gflops. I have 2 GPUs that each give me more than 5000gflops.
Anyone care to rewrite these in C#? I'm really surprised at how fast these Python scripts are, and I'd like to see a comparison with equivalent tasks in C# to see where it stands.
Was disappointed to find that nearly all the examples were Python and shell script. I'm not interested in knowing random trivia about how slow various interpreters are.
Well, my computer won't display an image apparently inserted with JavaScript, although it <i>could</i> if I wanted to grant execute privileges on it to computers-are-fast.github.io<p>Does anyone have a link to the image(s)?