TechEcho

7 comments

colmmacc over 10 years ago

My own favourite random() bug in the wild is one that I've come across many, many times:
<pre><code>/* Generate a random number 0...5 inclusive */
int r = random() % 6;
</code></pre>
The problem is that this results in a bias towards certain values, because the size of random()'s return space is probably not a whole multiple of the number you are modding by. It's easier if you think about what would happen if RAND_MAX were "10". Then the results 0, 1, 2, 3, 4 each have two opportunities of being selected for r, but "5" only has one. So 5 is half as likely as any other number. Terrible!

Using float/double random() variants doesn't always help either, because floating point multiplication has non-uniform errors. Instead, what you really need is:
<pre><code>int random_n(int top) {
    if (top <= 0) {
        return -1;
    }
    while (1) {
        int r = random();
        /* Reject the tail so the accepted range is a whole multiple of top. */
        if (r < (RAND_MAX - (RAND_MAX % top))) {
            return r % top;
        }
    }
    return -1;
}
</code></pre>
Although it's better to replace random() itself with something that uses real entropy.
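To make the RAND_MAX = 10 example above concrete, here is a minimal sketch in C that counts how many raw values land on each result of the modulo. The helper name `count_residues` and its exclusive `limit` parameter are made up for this illustration and are not part of any library.

```c
/* Count how many raw values in [0, limit) map to each residue modulo n.
 * Hypothetical helper for illustration; counts must have room for n entries. */
void count_residues(unsigned limit, unsigned n, unsigned *counts)
{
    for (unsigned i = 0; i < n; i++) {
        counts[i] = 0;
    }
    for (unsigned v = 0; v < limit; v++) {
        counts[v % n]++;
    }
}
```

With `limit` set to 11 (raw values 0 through 10) and `n` set to 6, the counts come out as 2, 2, 2, 2, 2, 1, which is the bias described above. With `limit` set to the rejection cutoff `10 - (10 % 6)`, which is 6, every residue is hit exactly once.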


sjolsen over 10 years ago

This sort of code is one of the motivations behind improving the random number generation facilities available in C++. If you're using C++ or you're using C and have the option to link with C++, I strongly recommend looking at the random number library introduced in C++11 [1]. In addition to letting you specify a statistical distribution (including a proper uniform distribution for both integral and floating-point types), it lets you choose between various PRNG engines with various trade-offs. It also provides a way to source hardware entropy with which to seed an engine, and it's all pretty easy to use.

[1] en.cppreference.com/w/cpp/numeric


clarry over 10 years ago

I would've appreciated it if you'd annotated each snippet with its source, to make it easier for us to find the program it came from. One interesting question to ask next would be: what do these programs do with their random numbers? And so, does the quality of the stream or its non-repeatability matter at all?


mijoharas over 10 years ago

My favourite bit is: "Take 16 bytes of random data. No, wait, make that 15 bytes. Then hash it to four bytes to really squeeze the entropy in. Then seed."

Good article.

viraptor over 10 years ago

For all the comments the author makes about nonstandard and crazy behaviour, he actually misses some practical solutions and makes fun of them.

"The one operation that was not observed was substracting the pid from the time. More research into this subject is warranted."

It's actually simple (even if still not effective on pid wraparound): pid numbers grow, at a rate of at least 1 per program execution. Time grows at around 1 second per second. If you subtracted pid from time, there's a good chance you would get the same seed by running the app twice in a row.

So it's added instead, so that the seed always grows. And we pretend the wraparound happens very rarely.

Broken behaviour? Sure. Practical solution that works for 99% of cases where non-critical randomness is required? Definitely.
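The arithmetic behind adding rather than subtracting can be sketched in C. This assumes, purely for illustration, that a second run starts one second later and receives a pid one higher than the first; `seed_add` and `seed_sub` are hypothetical helper names, not functions from any of the programs discussed.

```c
/* Combine a timestamp and a pid into a seed by addition.
 * Hypothetical helper for illustration. */
unsigned seed_add(unsigned now, unsigned pid)
{
    return now + pid;
}

/* Combine them by subtraction instead. Unsigned arithmetic wraps,
 * which is well defined in C. */
unsigned seed_sub(unsigned now, unsigned pid)
{
    return now - pid;
}
```

For two back-to-back runs at (time 1000, pid 50) and (time 1001, pid 51), `seed_sub` yields 950 both times, while `seed_add` yields 1050 and then 1052. On a POSIX system the additive form would be used roughly as `srandom(seed_add((unsigned)time(NULL), (unsigned)getpid()))`, which is still unsuitable for anything security sensitive.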

sarciszewski over 10 years ago

For cryptography, there's really no reason to use rand(), mt_rand(), or the other insecure variants. No excuse, I should say.

Just use urandom. Or getentropy() if your OS supports it.

If you're not using it for cryptographic purposes, then I don't see why it matters. :)
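As a rough sketch of the "just use urandom" advice, the following C function reads bytes from `/dev/urandom` through standard I/O. It assumes a Unix-like system where that device exists; the name `fill_random` is made up for this example.

```c
#include <stdio.h>

/* Fill buf with len bytes read from /dev/urandom.
 * Returns 0 on success, -1 if the device cannot be opened or read in full.
 * Hypothetical helper; assumes a Unix-like system that provides the device. */
int fill_random(unsigned char *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL) {
        return -1;
    }
    /* Disable stdio buffering so only the requested bytes are read. */
    setvbuf(f, NULL, _IONBF, 0);
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

A caller would declare something like `unsigned char key[16];` and check that `fill_random(key, sizeof key)` returns 0 before using the bytes. Where getentropy() is available it avoids the file descriptor entirely, but its header and availability vary by OS.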


anders over 10 years ago

> Perhaps there is a reason why software like Lua, Python, and Ruby all include their own implementation of a Mersenne Twister.

As far as I know, Lua does not include Mersenne Twister. math.random() is just C rand().


Random in the wild
