I thought it might be interesting to see how this effect changes with the size of the array being summed. How do the relative speeds change when operating out of L1 cache, L3 cache, and main memory? Does the lower speed of memory access overwhelm the overhead of the overflow checking?<p><pre><code> $ swift build --configuration release
$ cset proc -s nohz -e .build/release/reduce
# count (basic, reduce, unsafe basic, unsafe reduce)
1000 (0.546, 0.661, 0.197, 0.576)
10000 (0.403, 0.598, 0.169, 0.544)
100000 (0.391, 0.595, 0.194, 0.542)
1000000 (0.477, 0.663, 0.294, 0.582)
10000000 (0.507, 0.655, 0.337, 0.608)
100000000 (0.509, 0.655, 0.339, 0.608)
 1000000000 (0.511, 0.656, 0.345, 0.611)
$ swift build --configuration release -Xswiftc -Ounchecked
$ cset proc -s nohz -e .build/release/reduce
# count (basic, reduce, unsafe basic, unsafe reduce)
1000 (0.309, 0.253, 0.180, 0.226)
10000 (0.195, 0.170, 0.168, 0.170)
100000 (0.217, 0.203, 0.196, 0.201)
1000000 (0.292, 0.326, 0.299, 0.252)
10000000 (0.334, 0.337, 0.333, 0.337)
100000000 (0.339, 0.339, 0.340, 0.339)
 1000000000 (0.344, 0.344, 0.344, 0.344)
</code></pre>
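For context, the four columns presumably correspond to variants along these lines. This is my own sketch, not Lemire's exact code; I'm assuming "unsafe" means wrapping `&+` arithmetic (which skips Swift's overflow trap), though it could also refer to `UnsafeBufferPointer` traversal.

```swift
import Foundation

// "basic": a plain for-in loop using overflow-checked +.
func basicSum(_ a: [Int]) -> Int {
    var s = 0
    for x in a { s += x }        // traps on overflow
    return s
}

// "reduce": Sequence.reduce with overflow-checked +.
func reduceSum(_ a: [Int]) -> Int {
    return a.reduce(0, +)        // traps on overflow
}

// "unsafe basic": wrapping addition, no overflow check.
func unsafeBasicSum(_ a: [Int]) -> Int {
    var s = 0
    for x in a { s = s &+ x }    // wraps silently on overflow
    return s
}

// "unsafe reduce": reduce with the wrapping operator.
func unsafeReduceSum(_ a: [Int]) -> Int {
    return a.reduce(0, &+)
}

let a = Array(1...1000)
print(basicSum(a), reduceSum(a), unsafeBasicSum(a), unsafeReduceSum(a))
```

With `-Ounchecked` the compiler additionally drops the overflow checks from the `+` variants, which is why the two tables converge at large counts.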
Code is from <a href="https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/tree/master/2016/12/05" rel="nofollow">https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/...</a>, modified to loop over the different array lengths. Numbers are for Skylake at 3.4 GHz with swift-3.0.1-RELEASE-ubuntu16.04. Count is the number of 8-byte ints in the array being summed. Results shown were truncated by hand; I wasn't sure how to specify precision from within Swift. Running under "cset proc -s nohz" was meant to reduce jitter between runs, but it doesn't significantly affect total run time. The anomalously fast result for the L3-sized array in the 'unsafe', '-Ounchecked' case is consistent across runs.
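On the precision point: one way to get fixed decimal places from Swift is Foundation's printf-style `String(format:)`, which would have avoided the hand truncation. A minimal example (the variable name is illustrative):

```swift
import Foundation

// Format a Double to three decimal places using a printf-style
// format string, e.g. for the timing columns above.
let elapsed = 0.5464321
print(String(format: "%.3f", elapsed))  // prints "0.546"
```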