<i>It remains unclear what behaviour compilers currently provide (or should provide) for this.</i><p>It might be nice if future surveys explicitly asked a follow-up question: "Regardless of the standard or the behavior of existing compilers, is there one of these answers that is the 'obviously correct' manner in which compilers should behave? Which one?"<p>If practically all users believe the same answer is 'obviously correct', compiler writers might want to take this into account when deciding which behavior to implement.<p>For MSVC, one respondent said:<p><pre><code>
"I am aware of a significant divergence between the LLVM
community and MSVC here; in general LLVM uses "undefined
behaviour" to mean "we can miscompile the program and get
better benchmarks", whereas MSVC regards "undefined
behaviour" as "we might have a security vulnerability so
this is a compile error / build break". First, there is
reading an uninitialized variable (i.e. something which
does not necessarily have a memory location); that should
always be a compile error. Period. Second, there is reading
a partially initialised struct (i.e. reading some memory
whose contents are only partly defined). That should give a
compile error/warning or static analysis warning if
detectable. If not detectable it should give the actual
contents of the memory (be stable). I am strongly with the
MSVC folks on this one - if the compiler can tell at
compile time that anything is undefined then it should
error out. Security problems are a real problem for the
whole industry and should not be included deliberately by
compilers."
</code></pre>
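To make the respondent's two cases concrete, here is a minimal sketch (my own illustration, not from the survey; the diagnostics named in the comments are GCC/Clang's -Wuninitialized and MSVC's warning C4700):<p><pre><code>#include <stdio.h>

struct pair { int a; int b; };

int main(void) {
    /* Case 1: reading a plain uninitialized variable. GCC/Clang
       warn under -Wuninitialized, MSVC emits warning C4700; the
       respondent argues this should always be a hard error. */
    int x;
    printf("%d\n", x);

    /* Case 2: copying a partially initialised struct. The copy
       reads the indeterminate bytes of p.b; the respondent argues
       this should be diagnosed where detectable, and otherwise
       return the stable, actual contents of the memory. */
    struct pair p;
    p.a = 1;
    struct pair q = p;
    printf("%d\n", q.a);
    return 0;
}
</code></pre>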
I'm much less familiar with MSVC than with the alternatives, but this is a refreshing approach. Yes, give me a mode that refuses to silently rewrite undefined behavior. Is MSVC able to take this approach because it isn't trying to be compliant with modern C standards? Does the approach actually reduce the ability to apply useful optimizations? Or is it just a difference in philosophy?
There is a relevant article <a href="http://blog.regehr.org/archives/1180" rel="nofollow">http://blog.regehr.org/archives/1180</a> and discussion <a href="https://news.ycombinator.com/item?id=8233484" rel="nofollow">https://news.ycombinator.com/item?id=8233484</a> about what programmers really think C should be like (i.e. the "portable assembler" it was originally designed to be); incidentally, it also shows what they think a sane machine architecture should look like.
Question 2 is:<p>Is reading an uninitialised variable or struct member (with a current mainstream compiler):<p>(This might either be due to a bug or be intentional, e.g. when copying a partially initialised struct, or to output, hash, or set some bits of a value that may have been partially initialised.)<p>a) undefined behaviour (meaning that the compiler is free to arbitrarily miscompile the program, with or without a warning) : 128 (43%)<p>b) ( * ) going to make the result of any expression involving that value unpredictable : 41 (13%)<p>c) ( * ) going to give an arbitrary and unstable value (maybe with a different value if you read again) : 20 ( 6%)<p>d) ( * ) going to give an arbitrary but stable value (with the same value if you read again) : 102 (34%)<p>e) don't know : 3 ( 1%)<p>f) I don't know what the question is asking : 2 ( 0%)<p>--------------------<p>I know of one data structure (a sparse set of integers from 1 to n) which relies on this behavior: <a href="http://research.swtch.com/sparse" rel="nofollow">http://research.swtch.com/sparse</a> (sketched below). I always thought it was a neat trick. However, from the article it seems that reads of uninitialized members may NOT give stable values, which could make that data structure behave strangely or cause the compiler to miscompile the program.
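For reference, here is a minimal sketch of that sparse-set trick as I understand it from the linked article (the names and layout are mine, not the article's):<p><pre><code>#include <stdlib.h>

/* Briggs/Torczon-style sparse set over the integers 0..universe-1.
   sparse[] is deliberately left uninitialized: set_contains reads
   sparse[v] even for values never inserted, and relies on the
   cross-check against dense[] to reject garbage indices. This is
   only sound if an uninitialized read yields an ordinary, stable
   value (survey answers (c)/(d)), not if it is licence for the
   compiler to do anything at all (answer (a)). */
typedef struct {
    size_t *sparse;  /* sparse[v] = position of v in dense, if present */
    size_t *dense;   /* dense[0..n-1] = the members, in insertion order */
    size_t n;        /* current number of members */
} SparseSet;

SparseSet set_new(size_t universe) {
    SparseSet s;
    s.sparse = malloc(universe * sizeof *s.sparse);  /* never cleared */
    s.dense  = malloc(universe * sizeof *s.dense);
    s.n = 0;
    return s;
}

int set_contains(const SparseSet *s, size_t v) {
    size_t i = s->sparse[v];              /* possibly uninitialized read */
    return i < s->n && s->dense[i] == v;  /* a garbage index fails here */
}

void set_add(SparseSet *s, size_t v) {
    if (!set_contains(s, v)) {
        s->dense[s->n] = v;
        s->sparse[v] = s->n++;
    }
}
</code></pre>The payoff is O(1) insertion and lookup with no O(n) initialization pass, which is exactly what evaporates if the compiler treats the uninitialized read as undefined behaviour.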
A lot of C programmers make valid assumptions based on their system architecture and compiler. Is there any point in trying to unify their obviously different practices while at the same time ignoring the Standard? No.
As noted in the article, a summary of this document (about one-third the length) from the same authors is available at <a href="http://www.cl.cam.ac.uk/~pes20/cerberus/notes51-2015-06-21-survey-short.html" rel="nofollow">http://www.cl.cam.ac.uk/~pes20/cerberus/notes51-2015-06-21-s...</a>
Note that the C99 standard (I'm not up on C11 yet) specifically allows type-punning through the use of unions (and disallows it in essentially all other cases, except when one of the types is a character type).<p>Also, there seems to be some confusion about storing and loading pointers, when the standard speaks to this as well. Roughly: a pointer converted to a "large enough" integer type will point to the same object when converted back. An implementation is permitted not to provide a large enough integer type, but barring that, the behavior is well defined.
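A quick sketch of both points (these are my own examples; note that the round-trip guarantee is phrased in the standard in terms of pointers to void and the optional uintptr_t):<p><pre><code>#include <stdint.h>

/* Type punning through a union: as I read C99 (with TC3's
   clarifying footnote), reading pun.u reinterprets the stored
   bytes of pun.f. Apart from access through character types,
   this is the sanctioned way to pun. Assumes sizeof(float) ==
   sizeof(uint32_t), which holds on common platforms. */
uint32_t float_bits(float f) {
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}

/* Pointer round trip: uintptr_t (C99 7.18.1.4) is optional, but
   where it exists, converting a valid void pointer to it and back
   yields a pointer that compares equal to the original (6.3.2.3). */
int roundtrip_ok(int *obj) {
    void *before = obj;
    uintptr_t bits = (uintptr_t)before;
    void *after = (void *)bits;
    return after == before;   /* 1 on any implementation with uintptr_t */
}
</code></pre>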
As far as possible, compilers should check for false assumptions and diagnose them whenever they are detectable.<p>Is it possible to build a language that would reduce the number of false assumptions?