Impact of Undefined Behavior on Performance

47 points by SerCe, almost 4 years ago

14 comments

CJefferson, almost 4 years ago

The compiler could still sub in the equation for unsigned integers; the equation is (mostly) the same.

Not discussed (that I can see): the real problem is the loop variable. Because the loop ends with ' <= n', if n is the largest unsigned integer, this is an infinite loop. One easy thing UB is useful for is proving that these types of loops terminate (as overflowing would be UB).
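A minimal C sketch of the two points above, assuming the code under discussion sums 1 through n. The names `sum_loop` and `sum_closed` are illustrative and not taken from the article. It shows why the unsigned loop cannot be assumed to terminate for every n, and why the closed form is only "mostly" the same once wrapping is involved.

```c
#include <assert.h>
#include <limits.h>

/* Loop form. With n == UINT_MAX the condition i <= n can never be false,
   because i wraps back to 0 instead of exceeding n, so that input is
   ruled out here. */
unsigned sum_loop(unsigned n) {
    assert(n < UINT_MAX);
    unsigned total = 0;
    for (unsigned i = 1; i <= n; ++i) {
        total += i; /* unsigned arithmetic wraps modulo 2^N; this is defined */
    }
    return total;
}

/* Closed form n * (n + 1) / 2. Halving the even factor before multiplying
   keeps the result equal to the loop's wrapped total for every
   n < UINT_MAX. */
unsigned sum_closed(unsigned n) {
    assert(n < UINT_MAX);
    unsigned a = n;
    unsigned b = n + 1u;
    if (a % 2u == 0u) {
        a /= 2u;
    } else {
        b /= 2u;
    }
    return a * b; /* wraps the same way the loop does */
}
```

Dividing after the multiplication would discard the high bit once the product wraps, which is one reason the substitution is not a straight copy of the signed case.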

steerablesafe, almost 4 years ago

One nice thing about undefined behavior is that it allows any behavior to be implemented, including diagnostics. So if your compiler has an option to do that (-fsanitize=undefined), you can use it to catch bugs.

If you have defined wrapping/saturating semantics, then you don't get this option, even if overflow is a bug in your program.

AFAIK Rust also panics on overflow in debug builds, so it has already been decided that overflow is semantically a bug by default. In release mode it is therefore still a bug, and you potentially can't reason about the further behavior of the program. Having the compiler assume that overflow doesn't happen, and use that to optimize your program, does not worsen the situation too much.
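To make the "overflow is a bug" view concrete, here is a rough sketch of detecting signed overflow before it happens, which is the kind of condition a sanitizer build would report at run time. `checked_add` is a hypothetical helper, not a standard library function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Adds a and b only when the mathematical result fits in an int.
   Returns false (leaving *out untouched) if the addition would overflow,
   so the signed addition below never invokes undefined behavior. */
bool checked_add(int a, int b, int *out) {
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

The caller decides what a failed check means: report it, abort, or fall back to a wider type.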

superjan, almost 4 years ago

Interesting point, but a better post about compilers exploiting this type of UB is ryg’s: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759de5a7

the8472, almost 4 years ago

Safe Rust can eliminate the loop just fine if you use fold() or an exclusive range instead of an inclusive one. This is because in release mode it has wrapping semantics for unsigned integers. In debug mode you would get a panic instead for inputs that are too large. So the behavior is defined, albeit still a bit of a pitfall. But you can enable overflow checks in release mode too if desired.

https://rust.godbolt.org/z/1sK7GEr9E

imoverclocked, almost 4 years ago

> or _when_ undefined behavior is our friend. (emphasis mine)

I think this is the key point a lot of responses are missing. It's not that undefined behavior is innately good or bad. It's that it leaves wiggle room that can potentially be exploited for good, while being too rigid may not always be the best long-term strategy.

I used to think of code as something that should live forever and compile/run for all time without change. Over time, I've come to realize that just as spoken languages evolve, so do our programming languages. To run/understand code written two decades ago, we often have to use compilers/interpreters from that era. I doubt that will change even with extremely rigid language specifications.

lordnacho, almost 4 years ago

Doesn't UB also mean whatever it does might change?

This summation thing seems clever, but in a large code base you probably don't want to have a bunch of clever UBs.

fay59, almost 4 years ago

That's not a good take. Compilers love UB because it signals unreachable code, and unreachable code can be removed and/or its surroundings can be simplified. If UB were instead defined to trap, most optimizations that are made possible by UB would still be possible, but without the footgun part.

Specifically with this example, if the Rust community cared, it could implement pattern detection for this operation and reduce it to a constant-time operation too; the reality is just that nobody cares, and this optimization was probably only added to make Clang look better on a specific benchmark.

The other problem with UB is how capricious the specification is. There is no contemporary reason why signed integer overflow is UB and unsigned integer overflow isn't. It's just a wrinkle from the past that somehow, some people prefer to praise rather than fix.

eptcyka, almost 4 years ago

Cool, one version of a compiler will do the cool optimization that makes it go fast. But since this is undefined behavior, it's also unspecified behavior, and one can't expect it to be reliable. Maybe in practice this doesn't matter, but theoretically there's nothing (besides perf regression testing) stopping the next version of the same compiler from not doing this. Furthermore, you'll be bound to a specific compiler if you care about performance enough to rely on specific behavior for UB; one can't reasonably expect all compilers to do the same thing when they are allowed to do anything on UB.

bxparks, almost 4 years ago

The problem with undefined behavior is that there are about 200 of them in the C language spec, and I can remember only 3 or 4 at any given time (signed integer overflow, dereferencing NULL, out-of-bounds array access). And I can't remember which ones are "undefined behavior", and which ones are "implementation defined". Sometimes I remember more when I use it a lot, then I forget most of them when I shift to projects using other languages. This means my ability to write correct, non-trivial C code is basically zero.

tirrex, almost 4 years ago

> And how much impact does it have on runtime performance? Well even computing a sum for a relative small number 1000, according to cppbench, the signed version is 430X faster.

Although you can find these kinds of examples in snippets of a few lines, the question is what the impact is on the overall program. Nowadays, I guess it has no impact for almost all programs, because memory access patterns, system call overhead, operating system interaction, etc. have much more impact on overall performance than optimizations enabled by undefined behavior.

vkazanov, almost 4 years ago

So, in a narrow, simplistic case, signed integer overflow undef behaviour made it possible to replace a loop with a constant. Hooooray!

Pretending that integer overflows don't happen because of undef behaviour is... funny. Integer overflows do happen a lot, both with signed and unsigned ints. Compilers should only apply radical optimisations in this situation if they can prove an overflow will never happen. Otherwise, they should not touch the loop.

EDIT: typos, rephrasing for clarity

xorvoid, almost 4 years ago

Is this really an optimization that we want/need a compiler to do? I kind of get it for really high-level languages like Haskell, but for low-level languages like C/C++ it feels like a silly toy example. The abstract machine is so low-level that many big optimizations can’t be done. Instead the compiler can only do little O(1) optimizations. And you know what, I’m perfectly fine with that. C/C++ trades it all in order to give the programmer significant control over how the code works. If I want to use the summation formula, I can just write that myself.

Honestly, the thing I hate about the standards bodies is that they try to have it both ways so their compiler developer buds can implement their favorite (imho dubious) optimizations. But as a programmer, what I really want is well-defined, predictable constructs. Worrying about UB while also trying to design an algorithm, while also designing good data structures, while also designing for cache hierarchy, while also optimizing, while also thinking about cache coherency, while also thinking about paging and TLB behaviors, while also... It’s all already too complicated before having to worry about what UB the compiler may try to exploit next week.

I have a very strong suspicion that almost any non-trivial C/C++ program has UB somewhere. If so, then something is seriously wrong with this whole concept when useful and practical programs written in C are not technically defined/valid C programs!
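As a sketch of what "write that myself" could look like for the signed case, the function below applies the closed form directly and refuses inputs whose result would not fit, so it never relies on overflow being undefined. `sum_to_n` is a hypothetical name, and the code assumes `long long` is wider than `int`, which holds on common platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Sum of 1..n via the closed form n * (n + 1) / 2, written by hand.
   Returns false when n is negative or the result would not fit in an
   int. Assumption: long long is wider than int, so the intermediate
   product cannot overflow. */
bool sum_to_n(int n, int *out) {
    assert(out != NULL);
    if (n < 0) {
        return false;
    }
    long long wide = (long long)n * ((long long)n + 1) / 2;
    if (wide > INT_MAX) {
        return false;
    }
    *out = (int)wide;
    return true;
}
```

This runs in constant time regardless of what the optimizer does with the loop version, at the cost of making the range check explicit.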

renox, almost 4 years ago

I think that Zig has undefined behaviour for + overflow for both signed and unsigned values. It also has an 'add_modulo' operator.

tomp, almost 4 years ago

Hm… so Rust compiles code as is, whereas C’s “overly smart” compiler mangles correct code into a faster, less correct version.

If Rust is too slow (as measured by the profiler… premature optimisation etc.), you can always take another look at the algorithm and optimise it manually, and correctly.

On the other hand, if C is wrong… let’s hope you manage to isolate the fault to this function (a big if!)… start up the debugger… the assembly will likely look correct, because the compiler won’t optimise so aggressively in a debug build! Good luck debugging <3
