The title is actually wrong - the -O0 version is correct, the -O3 version is not (despite giving the output the author expected).<p>Casting the value to double ends up converting the long value 0x7fffffffffffffff to the nearest double value: 0x8000000000000000. As the -O0 version CORRECTLY reports, this does not round-trip back to the same value in the "long" type. Many other values, though not all, down to about 1/1024 of that value (1 / 2^(63-53)) will also fail to round-trip for similar reasons.<p>Unless my coffee-deficient brain is missing something at the moment, it should be the case that any integer with 53 bits or fewer between the first and last 1 bit (inclusive) will round-trip cleanly. Any other integer will not.<p>Edit: fixed a typo above, coded up the idea I expected to work, and ran it through QuickCheck for a few minutes; this version seems to be correct (the 'int' return rather than bool is just because Haskell's FFI doesn't natively support C99 bool):<p><pre><code> #include <limits.h>
int fits(long x) {
    /* LONG_MIN is -2^63, which a double holds exactly; special-casing
       it up front also avoids the undefined negation -x below. */
    if (x == LONG_MIN) return 1;
    unsigned long ux = x < 0 ? -x : x;
    /* Strip trailing zero bits until at most 53 significant bits remain. */
    while (ux > 0x1fffffffffffffUL && !(ux & 1)) {
        ux /= 2;
    }
    return ux <= 0x1fffffffffffffUL;
}</code></pre>
Hoo boy. This is what happens the first time C programmers start to work with floating point and don't know the fundamentals.<p>When you work with floating point, you need to remember to compare with a tolerance (an epsilon), because you are rounding to finite precision and different floating point units perform the conversion in different ways.<p>You must abandon the idea of '==' for floats.<p>This is why his code is unpredictable: you cannot guarantee the conversion of any integer to and from float is the same number. Period. The LSBs of the mantissa can and do change, which is why we mask to a precision or use signal-to-noise comparisons when evaluating bit drift between FP computations.<p>He has the first part correct, < and > are your friends with FP. But to get past the '==' hurdle, he needs to define his tolerance; the code should be something like:<p>if (fabs(f1 - f2) < TOLERANCE) ... fits = true.<p>I was irked by his arrogance when he asks, "Intel CPUs have a history of bugs. Did I hit one of those?" First, learn about floating point; then, work on an FPU team for 10 years; and even then, don't assume you're smarter than a team of floating point architects. You're not.
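A minimal sketch of that tolerance test (TOLERANCE is a placeholder, not a value from the article; choosing it is application-specific):<p><pre><code> #include <math.h>

#define TOLERANCE 1e-9   /* placeholder; must be chosen per application */

/* "Equal within tolerance" comparison for doubles. */
int nearly_equal(double f1, double f2) {
    return fabs(f1 - f2) < TOLERANCE;
}</code></pre>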
This may be an instance of "you should really know the gcc/clang sanitizers and use them to test your code":<p>clang test.c -O0 -fsanitize=undefined<p>./a.out<p>[...]<p>test.c:17:12: runtime error: 9.22337e+18 is outside the range of representable values of type 'long'<p>Interestingly gcc doesn't throw that warning.
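For readers without the article open, a minimal stand-in for the kind of code that trips this check (not the article's exact source):<p><pre><code> #include <stdio.h>
#include <limits.h>

int main(void) {
    long l = LONG_MAX;
    double d = (double)l;   /* rounds up to 2^63, the nearest double */
    long back = (long)d;    /* 2^63 is out of long's range: undefined
                               behaviour, which is what UBSan reports */
    printf("%s\n", l == back ? "fits" : "doesn't fit");
    return 0;
}</code></pre>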
A precise integer value is only guaranteed to be representable losslessly in a double if it fits in the significand: `64 - 1 (sign) - 11 (exponent) = 52` explicitly stored bits, plus one implicit leading bit, i.e. 53 bits in magnitude.<p>This should be fairly obvious with knowledge about how floating point numbers are represented internally IMO.<p>Edit: Be more precise about what can be represented.
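A quick way to see that boundary (assuming a 64-bit long):<p><pre><code> #include <stdio.h>

int main(void) {
    long a = 1L << 53;   /* 9007199254740992 */
    long b = a + 1;
    /* a round-trips exactly; b rounds back down to a */
    printf("%ld -> %ld\n", a, (long)(double)a);
    printf("%ld -> %ld\n", b, (long)(double)b);
    return 0;
}</code></pre>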
Clear your floating point exception register by calling feclearexcept(FE_ALL_EXCEPT). Convert to long by calling lrint(rint(x)). Then check your exception register using fetestexcept(). FE_INEXACT will indicate that the input wasn't an integer, and FE_INVALID will indicate that the result doesn't fit in a long.<p>Edit: check for me whether just calling lrint(x) works. The manpage doesn't specify that lrint() will set FE_INEXACT, but it seems weird to me that it wouldn't.
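Putting that recipe together might look like this (an untested sketch; the function name is mine, and per C99 a #pragma STDC FENV_ACCESS ON may also be needed):<p><pre><code> #include <fenv.h>
#include <math.h>

/* rint() raises FE_INEXACT if d has a fractional part; lrint() then
   raises FE_INVALID if the (now integral) value is out of long's range. */
int converts_cleanly(double d) {
    feclearexcept(FE_ALL_EXCEPT);
    (void)lrint(rint(d));
    return !fetestexcept(FE_INVALID | FE_INEXACT);
}</code></pre>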
To save some time for new readers: the author is unfamiliar with floating point representation and thinks that a double precision number, since it is 64 bits, can hold any 64-bit integer; they are also somewhat confused about what an xmm register can hold (they believe it has 128 bits of precision instead of holding 2 64-bit doubles or 4 32-bit singles). They attempt to find the issue a few ways. The correct solution is not to convert any integer larger than 2^53 in absolute value, since only integers up to that magnitude are guaranteed to convert to double and back (aside from a few larger ones that exist sparsely).
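A guard along those lines might look like this (a conservative sketch, assuming a 64-bit long; it also rejects the sparse larger values that happen to be exactly representable):<p><pre><code> /* Accept only longs whose magnitude is at most 2^53; all of
   those round-trip through double exactly. */
int safe_to_convert(long x) {
    const long limit = 1L << 53;
    return x >= -limit && x <= limit;
}</code></pre>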
> When something very basic goes wrong, I have this hierarchy of potential culprits:<p>I don't know if this is supposed to be a joke or part of the setup for an explanatory post about undefined behaviour, but that list is in exactly the wrong order.
SSE xmm registers might be 128 bits wide, but the precision is still 64 bits. The additional (high) bits are zeroed out.<p>What you're seeing is not excess precision due to wide registers but excess precision due to optimization and constant propagation, which means GCC calculates a fast path for (argc == 1) that doesn't round correctly and ends up with "it fits".<p>Interestingly it does optimize to the correct "doesn't fit" with -mfpmath=387 -fexcess-precision=standard, so I guess this is a bug in how GCC treats SSE math. The sanitizer (-fsanitize=float-cast-overflow) also notices the problem.
With the help of your comments, I could now write the conclusion to my article. In a nutshell this is the solution:<p><pre><code> #include <math.h>
#include <fenv.h>
int fits_long(double d)
{
    long   l_val;
    double d_val;

    // may be needed?
    // #pragma STDC FENV_ACCESS ON
    feclearexcept(FE_INVALID);
    l_val = lrint(d);           // raises FE_INVALID if d is outside long's range
    d_val = (double) l_val;
    if (fetestexcept(FE_INVALID))
        return 0;
    return d_val == d;          // false if d had a fractional part
}
</code></pre>
The article explains it in more detail. Thanks for the help.
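For instance (my example, not from the article), exercising it with the value from the post:<p><pre><code> #include <stdio.h>
#include <limits.h>

/* assumes fits_long() from the snippet above */
int main(void) {
    printf("%d\n", fits_long((double)LONG_MAX));  /* 0: doesn't fit */
    printf("%d\n", fits_long(42.0));              /* 1: fits */
    return 0;
}</code></pre>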
The first rule of floating point comparison is that you do not compare them for equality; instead you calculate the difference and check if it is less than an epsilon.
>I am still looking for a better way to check, if a double will convert cleanly to an integer of the same size or not.<p>I'd say the cleanest would be to decode exponent and mantissa, check if the exponent is within the 64-bit limit of long, then check whether any bits are set below the binary point (plus some extra care for two's complement negative numbers).<p>The problem with this is of course that this would be platform dependent.
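A rough sketch of that decoding (assumes IEEE-754 binary64 doubles and a 64-bit long; the function name is mine):<p><pre><code> #include <stdint.h>
#include <string.h>

int converts_to_long(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);                /* portable type-pun */
    int exp = (int)((bits >> 52) & 0x7ff) - 1023;  /* unbiased exponent */
    uint64_t mant = bits & 0xfffffffffffffULL;     /* 52 stored bits */

    if (exp < 0)  return d == 0.0;  /* |d| < 1 (incl. subnormals): only +/-0 */
    if (exp > 63) return 0;         /* too large; also inf/nan (exp == 1024) */
    if (exp == 63)                  /* the two's complement edge case: */
        return bits == 0xC3E0000000000000ULL;      /* exactly -2^63 == LONG_MIN */
    if (exp < 52 && (mant & ((1ULL << (52 - exp)) - 1)))
        return 0;                   /* fraction bits set below the binary point */
    return 1;
}</code></pre>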
The most surprising thing for me out of this is that casting a high positive integer to double yields the <i>nearest</i> double, which could be higher, rather than the largest double smaller than or equal to the integer value.<p>Is there a way to get the largest double smaller than or equal to some positive integer?
> When something very basic goes wrong, I have this hierarchy of potential culprits:
> the compiler
> buggy hardware
> OS vendor
> last and least me, because I don’t make mistakes :)<p>I really dislike the arrogant programmer trope. Can we all stop?
How did you decide that "the method works for LONG_MIN"? Did the method return the expected output of false? Because it really seems like the code is working correctly on `-O0` and incorrectly on `-O3`...