For those wondering how this works:<p>In C, as in most languages, all binary operators on numbers – arithmetic and comparisons alike – 'natively' require two operands of the same type. When you pass operands of different types, the language usually converts one operand to the other's type. (Sometimes it converts both operands to a third type instead.)<p>In this case, without the cast, the comparison is performed by converting the `int` to `size_t`. The dangerous case is when `num` is negative. Since `size_t` is unsigned, the conversion turns negative numbers into very large positive numbers, so `num > limit` will be true, and the function will safely reject the allocation. On the other hand, with `num > (int)limit`, no conversion occurs; the comparison is performed between two signed integers, and it will be false if `num` is negative, letting the negative value through.<p>It would be better to be more explicit. Perhaps use two comparisons (the compiler will optimize them into one where possible):<p><pre><code> if (num < 0 || num > limit)
</code></pre>
Or, better yet, make `allocatebufs` take `size_t` instead of `int` in the first place. Taking `int` is risky even without the comparison bug: if someone calls the function with a `size_t` on a platform where `size_t` is larger than `int` (e.g. typical 64-bit platforms), the upper bits will be truncated. For instance, if the passed-in value is 0x100000001, it will be truncated to 0x1, causing `allocatebufs` to return a much smaller buffer than expected.<p>If you're wondering about the exact rules for which types get converted to what, they're very complicated, unnecessarily so. It mostly boils down to "larger type wins; in case of a tie, unsigned wins over signed". But there are exceptions. Here's the spec language:<p>- <a href="http://c0x.coding-guidelines.com/6.3.1.8.html" rel="nofollow">http://c0x.coding-guidelines.com/6.3.1.8.html</a><p>- <a href="http://c0x.coding-guidelines.com/6.3.1.1.html" rel="nofollow">http://c0x.coding-guidelines.com/6.3.1.1.html</a>
In my opinion, the interactions between unsigned integers and signed integers are too hard for a programmer like myself to handle correctly in practice, especially being consistently vigilant all the time.<p>I admit that I didn't understand how the code in the article worked (i.e. why the unsigned check was correct and why the signed check was incorrect) until I read the answer in this comment thread.<p>I like Java's design where all integer types (except char) are signed, and all arithmetic is done in signed mode. It simplifies the mental model and reduces the opportunities for logic errors.<p>Signed integers are definitely useful sometimes (e.g. difference of two unsigned indexes). A wide-enough signed type (e.g. int32) can hold all values of a narrower unsigned type (e.g. uint16). So it is possible to design a language without unsigned integer types without losing functionality.
This was potentially in response to this event, posted to HN earlier today: <a href="https://news.ycombinator.com/item?id=20874470" rel="nofollow">https://news.ycombinator.com/item?id=20874470</a> (a Spectre Linux kernel patch rendered ineffective by LTS maintainers in an attempt to silence a compiler warning).<p>The scenario described in this submission was the first thing that came to my mind upon reading about that patch.
To me, as someone who doesn't use C in their day-to-day life, C appears to be a security nightmare. I cannot imagine using it in a production environment that takes any amount of user input.
I’m quite sure this is a response to <a href="https://news.ycombinator.com/item?id=20874470" rel="nofollow">https://news.ycombinator.com/item?id=20874470</a> submitted earlier today. I don’t believe there was any malicious intent there, but it’s yet another safety issue in C of the kind that everybody tells me they are aware of, that would never happen to them, and that only happens to other, less experienced programmers.<p>I must only know the cream of the crop of C programmers /s
As someone who doesn't understand the language's nuances: what does adding a cast to a value used in the function open up? Will it return NULL more often?
I feel that it says something about the problem area when over forty comments in the thread are dedicated to the discussion of how exactly the five lines of code work.
The code is 'totally safe and secure' in a nasty, obfuscated way. It takes too much language/system knowledge to tell that negative numbers are covered. And there isn't a separate block responsible for negative numbers whose purpose one could at least stop and question; instead the behavior is hidden inside another block that has a simple face-value interpretation. This is (1) fragile and (2) not self-aware that it's fragile. If you don't refactor to avoid the fragility, at least leave a comment. Or how about unit tests? If you don't question the motives of the author, you should at least question their coding ability.
Obligatory mention of an early attempt to really understand and address subversion as a threat, plus a follow-up by pioneers who built on it:<p><a href="http://seclab.cs.ucdavis.edu/projects/history/papers/myer80.pdf" rel="nofollow">http://seclab.cs.ucdavis.edu/projects/history/papers/myer80....</a><p><a href="https://apps.dtic.mil/dtic/tr/fulltext/u2/a435312.pdf" rel="nofollow">https://apps.dtic.mil/dtic/tr/fulltext/u2/a435312.pdf</a><p>The current state of the art in formal methods, automated analyses, and test generators can get us pretty close to their goals with such designs, too. Way less cost-prohibitive than it used to be. Although high-assurance security today has moved far away from the security kernels they advocated, focusing instead on securing hardware/software and distributed systems in ways that allow expressing more flexible and useful policies. The design and assurance methods of the past still work, though. Stuff like Rust, SPARK Ada, RV-Match, Why3, and CompCert handles more code-level problems than the older methods, too.
Or just introduce a format string vulnerability by removing the format specifier from printf et al. ... or an off-by-one or double-free bug if you want something that looks more genuine.
How about this in stdlib.h:<p><pre><code> void *malloc(signed long long int untrusted_size) {
     size_t size;
     if (untrusted_size < 0) {
         fprintf(stderr, "warning: exploit "
                 "detected. Click <ok> to cancel");
     }
     size = (size_t)untrusted_size;
     /* etc. */
 }
</code></pre>
Or better yet redefine malloc with a wrapper and include `__LINE__` and `__FILE__` in the exploit error.<p>Edit: markdown
Is this bug exposed only on systems with 1-byte words? I'm struggling a bit to understand the exposed bug. While I understand the whole point is that there are lots of issues of this kind that can hide in code pretty innocently, it'd be nice to actually comprehend this one.
The point is not really about the specific mechanism shown in the post, but more about plausible deniability and how a malicious actor might introduce a vulnerability like this in a way that makes their innocence seem unimpeachable.