TechEcho

Undefined Behavior Is Really Undefined

94 points by erwan over 6 years ago

14 comments

kibwen over 6 years ago
Upvoted for the signed integer overflow example. I'll admit that I actually don't know the most idiomatic, bulletproof way of testing for signed overflow in C; if you google "how to test signed integer overflow in C", the very first result is essentially equivalent to the buggy example in the blog post ( https://www.geeksforgeeks.org/check-for-integer-overflow/ ), and I'm not keen to repeat the legendary case of signed overflow within the PHP interpreter: https://web.archive.org/web/20120412194929/http://use.perl.org/use.perl.org/_Aristotle/journal/33448.html
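One pattern often suggested for this is to test before adding rather than after, so the overflowing addition never executes. The following is a minimal sketch of that idea; `add_would_overflow` is a hypothetical name, not something from the article or the linked pages.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: returns true if a + b would overflow an int.
   Every comparison is made before any addition of a and b, so the
   check itself cannot trigger signed overflow. */
bool add_would_overflow(int a, int b)
{
    if (b > 0) {
        /* INT_MAX - b cannot overflow when b is positive. */
        return a > INT_MAX - b;
    }
    /* INT_MIN - b cannot overflow when b is zero or negative. */
    return a < INT_MIN - b;
}
```

By contrast, a check written as `a + b < a` performs the overflowing addition first, which is exactly what the optimizer is allowed to assume never happens. GCC and Clang also offer `__builtin_add_overflow` as a non-standard extension, and C23 adds `ckd_add` in `<stdckdint.h>`.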
AndyKelley over 6 years ago
In Zig we embrace undefined behavior. It's what allows tools to catch mistakes, and it's what arms the optimizer with the assumptions it needs to be effective.

For example, not only is it undefined behavior to overflow on signed integer addition, in Zig it's also undefined behavior to overflow on unsigned integer addition. If you want wrapping integer addition, you have to use the wrapping integer addition operator, which is defined to wrap around on overflow.

Here's the trick though - Zig catches most kinds of undefined behavior before they have a chance to cause problems in release builds. Some undefined behavior is caught at compile time, and otherwise most undefined behavior is caught at runtime, in debug builds. And finally, if you are not confident in the level of testing your software has undergone, you can make a "release-safe" build, which has optimizations on, but includes undefined behavior checks and will crash (or invoke user-defined panic function) rather than invoke undefined behavior.

You can see some examples of this here: https://ziglang.org/documentation/master/#Undefined-Behavior
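For readers more familiar with C, the split between "overflow is a bug" and "overflow is requested" can be roughly imitated by hand. This is only a sketch of the idea in C, not how Zig implements it: `checked_add_u32`, `wrapping_add_u32`, and `panic` are hypothetical names, and `__builtin_add_overflow` is a GCC/Clang extension rather than standard C.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical panic hook, standing in for a user-defined panic function. */
static void panic(const char *msg)
{
    fprintf(stderr, "panic: %s\n", msg);
    abort();
}

/* Checked addition: overflow is treated as a programming error and trapped
   instead of silently producing a value. Assumes GCC or Clang. */
uint32_t checked_add_u32(uint32_t a, uint32_t b)
{
    uint32_t sum;
    if (__builtin_add_overflow(a, b, &sum)) {
        panic("integer overflow");
    }
    return sum;
}

/* Wrapping addition: the caller explicitly asks for modulo 2^32 behavior. */
uint32_t wrapping_add_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

The design point is that the intent is visible at the call site: code that expects wraparound says so, and everything else can be checked.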
nayuki over 6 years ago
I agree with everything in the article - the example of non-intuitive effects of the strict aliasing rule, a tricky integer overflow example, and the unpopular plea to switch away from C/C++.

When I write C and C++ code, I try to make my logic portable and standards-compliant so that it will work on all platforms. So instead of assuming int is 32 bits, I am only allowed to assume that int is at least 16 bits wide. I assume that sizeof(char) could equal sizeof(int) and both could be 64 bits. I avoid bitwise manipulation on negative numbers, because they're not guaranteed to be two's complement. Keeping all of these pessimistic assumptions in mind while I code is a mental burden that I don't experience in other languages.

Regarding integer promotions, here is one tricky situation I reasoned about and asked in https://stackoverflow.com/questions/39964651/is-masking-before-unsigned-left-shift-in-c-c-too-paranoid . Suppose you want to compute:

    uint32_t a = UINT32_C(0xFFFFFFFF);
    uint32_t b = a << 31;  // b should be 0x80000000

Looks innocent, eh? Left-shifting an unsigned integer will discard the top bits and never cause undefined behavior. Except, this reasoning can be wrong on some platforms. Suppose:

    typedef unsigned short uint32_t;
    typedef int int48_t;

Now (uint32_t)a → (unsigned short)a → (int)a → (int48_t)a, due to typedefs and integer promotion. But because a is a signed integer, it is undefined behavior to shift 1's into the sign bit. Kaboom.
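One way to sidestep the promotion in a scenario like this is to do the shift in a type that can never be promoted to a signed type, then mask. A minimal sketch, with `shl_u32` as a hypothetical helper name:

```c
#include <stdint.h>

/* Hypothetical helper: left shift of a 32-bit value that does not depend on
   how uint32_t relates to int. unsigned long is at least 32 bits wide and is
   never promoted to a signed type, so the shift happens in unsigned
   arithmetic. The mask discards any bits above bit 31 when unsigned long
   is wider than 32 bits. */
uint32_t shl_u32(uint32_t a, unsigned n)
{
    /* The caller must keep n in the range 0..31; a shift count that reaches
       the width of unsigned long would itself be undefined. */
    return (uint32_t)(((unsigned long)a << n) & 0xFFFFFFFFUL);
}
```

On the hypothetical platform above, where `uint32_t` is `unsigned short` and `int` has 48 bits, the cast to `unsigned long` keeps the operand unsigned, so no bit is ever shifted into a sign bit.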
blue_pancake over 6 years ago
Undefined behavior gets a bad rap, but it's not *always* evil. Compilers and executables would be a lot slower if they had to account for these cases.

If you're writing serious C, you should be using tools like valgrind on debug-mode executables to make sure you aren't relying on undefined behavior. The tools are there. It's just not something a lot of people do.
vyodaiken over 6 years ago
Just a reminder that there are ZERO published studies showing that these UB "optimizations" have significant value for any real programs. They impose a bizarre notion of C semantics that is not compatible with the language design. A good critique, for example, of the UB alias behavior can be found in Brian Kernighan's article on Pascal ( http://www.cs.virginia.edu/~evans/cs655-S00/readings/bwk-on-pascal.html ). The Standards authors have made the exact same error but in an ad hoc, hacked-up manner.

Just use the flags that Linux has forced on the compiler developers in order to be able to make use of C or else give up on writing correct code. http://www.yodaiken.com/2018/11/17/standard-c-is-more-fun-than-ordinary-c/
ajnin over 6 years ago
Is there a compiler flag that can be set to print a warning about all undefined behaviour? I'm not a C dev but this UB business seems like a little cat and mouse game between the developer and the compiler which tries to find "tricks" to avoid doing stuff, which seems backwards.
saagarjha over 6 years ago
There’s really no need to pass -O9 to GCC. Anything over -O3 should become -O3 anyways.
eridius over 6 years ago
The penultimate example tripped me up. I was under the impression that arithmetic conversions for binary operators only happened if the types of the operands were different. But reading the standard, and then actually experimenting with clang and __auto_type, does confirm that if the operands can be converted to int or unsigned int, then they will be (and that it will convert to int if int can represent all the values). That's really kind of nasty given the lack of wrapping overflow on signed integers.

This actually makes me wonder, if I *do* want wrapping overflow on signed integers in C, how do I request it? Is there some compiler builtin or stdlib function to say "please add/multiply/whatever these signed integers with overflow"?
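One commonly used answer is to do the arithmetic in the corresponding unsigned type, where wraparound is defined, and convert back. The sketch below uses hypothetical helper names, and the `uint16_t` multiplication is an illustrative promotion case, not necessarily the article's own example.

```c
#include <stdint.h>

/* Wrapping signed addition performed in unsigned arithmetic. Converting an
   out-of-range result back to int32_t is implementation-defined rather than
   undefined before C23; GCC and Clang document it as reduction modulo 2^32
   (assumption: one of those compilers is in use). */
int32_t wrapping_add_i32(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Multiplying two uint16_t values directly promotes both to int, so
   65535 * 65535 overflows a 32-bit int. Converting to uint32_t first keeps
   the product representable whichever way uint32_t relates to int. */
uint32_t mul_u16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}
```

GCC and Clang also accept `-fwrapv`, which makes signed overflow wrap for the whole translation unit, and provide `__builtin_add_overflow` and `__builtin_mul_overflow`, which store the wrapped result and report whether overflow occurred.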
tempodox over 6 years ago
Articles like this are important. The superficial simplicity of C can be misleading.
ThisIs_MyName over 6 years ago
More of this: https://www.youtube.com/watch?v=yG1OZ69H_-o
sn41 over 6 years ago
> The C standard specifies that values “cannot” be accessed through pointers that do not match the effective type of the value

I think this is used in "type punning" in union structures. This is a related comment by Linus Torvalds on the kernel list:

https://lkml.org/lkml/2018/6/5/769
jancsika over 6 years ago
> The C standard specifies that values “cannot” be accessed through pointers that do not match the effective type of the value

Yet they can be and often are by using the union trick.

To decide whether to use the union trick requires a discussion of a program's desired portability which-- while usually desirable-- is a separate issue from undefined behavior.
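To make the two approaches concrete, here is a small sketch that reads the bits of a `float` both ways. The function names are hypothetical, and the code assumes `float` is 32 bits wide. Reading a union member other than the one last written is permitted in C99 (as amended) and C11, but is undefined in C++, which is part of the portability question raised above.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Copies the object representation with memcpy, which does not depend on
   the effective type rules and is also valid in C++. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* The "union trick": write one member, read another. */
uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

What neither version does is cast a `float *` to a `uint32_t *` and dereference it, which is the access the quoted rule forbids.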
twtw over 6 years ago
Why not just define these things? Make -fwrapv, -fno-strict-aliasing the standard?
liftbigweights over 6 years ago
Actually undefined behavior is defined. It is defined as undefined.