My own favourite random()-in-the-wild bug is one that I've come across many, many times:<p><pre><code> /* Generate a random number 0...5 inclusive */
int r = random() % 6;
</code></pre>
The problem is that this biases the result towards certain values, because the number of distinct values random() can return is probably not a whole multiple of the number you are modding by. It's easier to see if you imagine RAND_MAX were 10, so that random() returns 0 to 10 inclusive. Then the results 0, 1, 2, 3, 4 each have two opportunities of being selected for r (0 and 6, 1 and 7, and so on), but 5 only has one. So 5 is half as likely as any other number. Terrible!<p>Using a float/double random() doesn't always help either, because floating-point multiplication has non-uniform rounding errors. Instead, what you really need is:<p><pre><code> int random_n(int top) {
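    if (top <= 0) {
        return -1;
    }
    /* Values at or above this limit form the biased tail.
       Assumes random() returns values in 0..RAND_MAX. */
    const long limit = RAND_MAX - (RAND_MAX % top);
    while (1) {
        long r = random(); /* random() takes no arguments */
        if (r < limit) {
            return (int)(r % top); /* reduce only after rejecting the tail */
        }
    }
    return -1; /* not reached */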
}
</code></pre>
Although it's better to replace random() itself with something that uses real entropy.
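As a rough check of the modulo bias described above, here is a small sketch that counts how many generator outputs land on each residue. `count_residues` is a hypothetical helper name, and the toy maximum of 10 stands in for RAND_MAX as in the example; it is only meant for small ranges.

```c
#include <assert.h>

/* Count how often each residue 0..n-1 appears when every possible
 * generator output 0..max_value (inclusive) is reduced with % n.
 * counts must have room for n entries. Intended for small toy ranges
 * only (max_value well below UINT_MAX). */
void count_residues(unsigned max_value, unsigned n, unsigned *counts)
{
    assert(n > 0); /* v % 0 would be undefined */

    for (unsigned i = 0; i < n; i++) {
        counts[i] = 0;
    }
    for (unsigned v = 0; v <= max_value; v++) {
        counts[v % n]++;
    }
}
```

Calling `count_residues(10, 6, counts)` leaves 2 in `counts[0]` through `counts[4]` and 1 in `counts[5]`, which is the "5 is half as likely" case.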
This sort of code is one of the motivations behind improving the random number generation facilities available in C++. If you're using C++ or you're using C and have the option to link with C++, I strongly recommend looking at the random number library introduced in C++11 [1]. In addition to letting you specify a statistical distribution (including a proper uniform distribution for both integral and floating-point types), it lets you choose between various PRNG engines with various trade-offs. It also provides a way to source hardware entropy with which to seed an engine, and it's all pretty easy to use.<p>*[1] en.cppreference.com/w/cpp/numeric
I would've appreciated it if you'd annotated each snippet with its source, to make it easier for us to find the program it came from. One interesting question to ask next would be: what do these programs do with their random numbers? And does the quality of the stream, or its non-repeatability, matter at all?
My favourite bit is: "Take 16 bytes of random data. No, wait, make that 15 bytes. Then hash it to four bytes to really squeeze the entropy in. Then seed."<p>Good article
With all the comments the author makes about nonstandard and crazy behaviour, he actually misses some practical solutions and makes fun of them.<p>"The one operation that was not observed was substracting the pid from the time. More research into this subject is warranted."<p>It's actually simple (even if still not effective on pid wraparound): pid numbers grow at a rate of at least 1 per program execution. Time grows at around 1 second per second. If you subtracted the pid from the time, there's a good chance you would get the same seed by running the app twice in a row.<p>So it's added instead, so that the seed always grows. And we pretend the wraparound happens very rarely.<p>Broken behaviour? Sure. A practical solution that works for 99% of cases where non-critical randomness is required? Definitely.
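The arithmetic behind that argument can be sketched as follows. `seed_add` and `seed_sub` are hypothetical helpers, not taken from any of the programs in the article; a real program would pass in `time(NULL)` and `getpid()`.

```c
/* Two ways of combining a timestamp and a pid into a seed.
 * Plain unsigned arithmetic, so overflow simply wraps around. */
unsigned seed_add(unsigned long t, unsigned long pid)
{
    return (unsigned)(t + pid);
}

unsigned seed_sub(unsigned long t, unsigned long pid)
{
    return (unsigned)(t - pid);
}
```

If a program runs at time 1000 with pid 500 and again one second later with pid 501, `seed_sub` gives 500 both times, while `seed_add` gives 1500 and then 1502.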
For cryptography, there's really no reason to use rand(), mt_rand(), or the other insecure variants. No excuse, I should say.<p>Just use urandom. Or getentropy() if your OS supports it.<p>If you're not using it for cryptographic purposes, then I don't see why it matters. :)
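For what "just use urandom" might look like in C, here is a minimal sketch that assumes a Unix-like system where /dev/urandom exists. `urandom_bytes` is a hypothetical helper name, and the error handling is deliberately simple.

```c
#include <stddef.h>
#include <stdio.h>

/* Fill buf with len bytes read from /dev/urandom.
 * Returns 0 on success, -1 if the device cannot be opened
 * or fewer than len bytes could be read. */
int urandom_bytes(void *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL) {
        return -1;
    }
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

On systems that provide getentropy(), it could replace the file read and avoids needing a file descriptor at all.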
> Perhaps there is a reason why software like Lua, Python, and Ruby all include their own implementation of a Mersenne Twister.<p>As far as I know, Lua does not include Mersenne Twister. math.random() is just C rand().