Great article, lots of things I've ranted about in the past, and lots of things I've never considered.<p>An apocryphal story from a former finance professor who taught me about MCMC: One of his former colleagues was working for a hedge fund in commodity futures markets, and he had developed a Monte Carlo model for trading in a very specific market that was working exceptionally well. Then one day it stopped working so well: returns fell to half their original level. It wasn't a gradual decline; the returns were at one level one day and had dropped overnight. He eventually got poached by a competing hedge fund working in the exact same markets, and he found out that his returns had declined because one of the mathematicians at his new company had almost perfectly reverse engineered his model by guessing his RNG seed, conveniently a number in the name of his former hedge fund.
In the example about picking a random 3-d vector, it seems like the "draw components from a Gaussian distribution" method is the most common, but I don't really understand why you can't just pick two angles ([0, pi) and [0, 2pi) respectively) from a uniform distribution and interpret those as spherical coordinates on a unit sphere.<p>Given that the "draw from a Gaussian and normalize" seems like the hard way to do it and also is the only one anyone is suggesting, I assume I'm missing something. Anyone know what the problem is?
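A quick way to see the problem is to measure it. The sketch below is only an illustration: the function names, the sample count, and the 0.9 threshold are my own choices, not anything from the article. It compares the two-uniform-angles method with the Gaussian-and-normalize method by counting how many points land near the poles (|z| > 0.9). For a uniform distribution on the sphere, z is itself uniform on [-1, 1], so that fraction should be about 0.1.

```python
import math
import random


def naive_direction(rng):
    """Pick both spherical angles uniformly (the method asked about)."""
    theta = rng.uniform(0.0, math.pi)        # polar angle
    phi = rng.uniform(0.0, 2.0 * math.pi)    # azimuth
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def gaussian_direction(rng):
    """Draw three independent Gaussians and normalize the vector."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 0.0:  # guard against the (vanishingly rare) zero vector
            return (x / norm, y / norm, z / norm)


def polar_cap_fraction(sampler, n=100_000, seed=12345):
    """Fraction of sampled directions with |z| > 0.9 (near either pole)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if abs(sampler(rng)[2]) > 0.9)
    return hits / n


if __name__ == "__main__":
    print("uniform angles:", polar_cap_fraction(naive_direction))
    print("gaussian:      ", polar_cap_fraction(gaussian_direction))
```

With uniform angles the polar fraction comes out near 2·acos(0.9)/π ≈ 0.29 instead of 0.1, so points bunch up at the poles: a band of fixed angular width covers far less area near a pole than near the equator. The usual repair for the angle approach is to draw cos(theta) uniformly from [-1, 1] rather than theta itself.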
As with cryptography, it's probably dangerous to "roll your own random" <i>if</i> the randomness matters. The author already mentions both the point about cryptography, and the dangers of plausibility arguments about randomness: an intuitively plausible way of picking a random point from a sphere doesn't give a uniform distribution.<p>Playing around with distributions as the author does is surely fine if you just want to get something that "feels right", but if the applications depend on precise randomness properties, then, for example, "let's just multiply these two PDFs" is dangerous (not least because the result is almost guaranteed not to be a PDF, and may not even be normalisable to one).<p>Although it surely won't be applied for random f (no pun intended), the transformation from P(f(u) ≤ x) to P(u ≤ f^(-1)(x)) relies very much on f being increasing (strictly, in order to have an inverse), so it's even dangerous to use f(x) = x^2 if we don't know that u is non-negative-valued.
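To make the last point concrete, here is a small sketch under my own assumptions: u uniform on [-1, 1], f(x) = x^2, and function names invented for illustration. Applying the "invert f" step blindly gives P(u ≤ sqrt(x)) = (sqrt(x) + 1)/2, while the true value is P(-sqrt(x) ≤ u ≤ sqrt(x)) = sqrt(x).

```python
import random


def empirical_cdf_of_square(x, n=100_000, seed=2024):
    """Estimate P(u**2 <= x) by simulation, with u uniform on [-1, 1]."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if rng.uniform(-1.0, 1.0) ** 2 <= x)
    return hits / n


def naive_cdf(x):
    """P(u <= sqrt(x)): what you get by inverting f as if it were increasing."""
    return (x ** 0.5 + 1.0) / 2.0


def correct_cdf(x):
    """P(-sqrt(x) <= u <= sqrt(x)) for u uniform on [-1, 1], 0 <= x <= 1."""
    return x ** 0.5


if __name__ == "__main__":
    x = 0.25
    print("simulated:", empirical_cdf_of_square(x))
    print("naive:    ", naive_cdf(x))
    print("correct:  ", correct_cdf(x))
```

At x = 0.25 the naive formula gives 0.75 while the correct value (and the simulation) is about 0.5, because negative u also map into [0, x].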
It reminds me of this nice link on how to generate Gaussian samples from uniform ones using the Box-Muller transform:<p><a href="http://www.design.caltech.edu/erik/Misc/Gaussian.html" rel="nofollow">http://www.design.caltech.edu/erik/Misc/Gaussian.html</a>
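For anyone who wants to try it, a minimal sketch of the basic (trigonometric) form of Box-Muller follows. The function name and the use of Python's `random.Random` as the uniform source are my own choices, and the linked page may present the transform differently.

```python
import math
import random


def box_muller(rng):
    """Turn two independent uniforms into two independent N(0, 1) samples."""
    u1 = 1.0 - rng.random()   # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


if __name__ == "__main__":
    rng = random.Random(42)
    samples = [value for _ in range(50_000) for value in box_muller(rng)]
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    print("mean:", mean, "variance:", variance)
```

The sample mean should come out close to 0 and the variance close to 1.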
I discovered through experimentation recently that random-number seeding in GNU awk (gawk) is limited to signed 32-bit values.<p>A loop of 10 million iterations of straight rand() output produces unique values only about 2% of the time -- the other 98% of values are repeated throughout the sequence. (This may be due to time-of-day as seed.)<p><pre><code> gawk 'BEGIN {srand(); for (i=0;i<10000000;i++) printf("%s\n", rand())}' | sort | uniq -c | wc -l
</code></pre>
The srand() feature appears to take in signed 32-bit values only -- that is, -2147483648 to 2147483647. If you require more than roughly 4.3 billion (2^32) distinct sequences, this might be something to keep in mind.<p>This information may be well documented, though I find it in neither the gawk manpage (yes, I'm aware FSF deprecates manpages, an idiotic move), nor the online gawk manual, linked below.<p>Again -- if you're just playing around, this may not hurt you, but if you're fond of gawk and think you can develop high-strength crypto or security code using it, you're going to need to go beyond the built-ins at the very least.<p>Earlier: <a href="https://plus.google.com/104092656004159577193/posts/exhAxhd4v2n" rel="nofollow">https://plus.google.com/104092656004159577193/posts/exhAxhd4...</a>
> [Tetris] simply shuffles a list of all 7 pieces, gives those to you in shuffled order, then shuffles them again to make a new list once it’s exhausted.<p>Interesting tidbit! So all the times I've furiously cursed at my Game Boy because I swear the 'I' is by far the rarest tetrimino and it hasn't given me one for at least 20 turns... was just classic cognitive bias.
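The scheme in the quote can be sketched in a few lines. This only illustrates the quoted description: the names `bag_randomizer` and `longest_gap` are made up here, and I'm not claiming any particular Tetris version implements it this way.

```python
import itertools
import random

PIECES = "IJLOSTZ"  # the seven tetrimino letters


def bag_randomizer(rng):
    """Yield pieces forever: shuffle all seven, deal them out, repeat."""
    while True:
        bag = list(PIECES)
        rng.shuffle(bag)
        yield from bag


def longest_gap(piece, sequence):
    """Longest run of consecutive pieces in `sequence` that are not `piece`."""
    longest = current = 0
    for p in sequence:
        if p == piece:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


if __name__ == "__main__":
    dealt = list(itertools.islice(bag_randomizer(random.Random(7)), 7000))
    print("longest drought without an I:", longest_gap("I", dealt))
```

Under this scheme, the worst case is an 'I' dealt first in one bag and last in the next, so at most 12 other pieces can come between two 'I's.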
> If your random number generator has fewer than 226 bits of state, it can’t even generate every possible shuffling of a deck of cards!<p>Anyone know why?
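It's a counting argument. A 52-card deck has 52! possible orderings. A generator with k bits of state can be in at most 2^k different states, and for a fixed shuffling algorithm each starting state leads to exactly one shuffle. To be able to reach every ordering you therefore need 2^k ≥ 52!, i.e. k ≥ log2(52!). A quick check (the function name is just for illustration):

```python
import math


def bits_to_cover_all_orderings(n_items):
    """log2(n!): bits of state needed to distinguish every ordering of n items."""
    return math.log2(math.factorial(n_items))


if __name__ == "__main__":
    bits = bits_to_cover_all_orderings(52)
    print(bits, "-> round up to", math.ceil(bits))
```

log2(52!) is about 225.58, which rounds up to the 226 bits in the quote.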
As for choosing random 3D directions with various distributions (and more), the Global Illumination Compendium [1] has a lot of useful formulas.<p>[1] <a href="https://people.cs.kuleuven.be/~philip.dutre/GI/" rel="nofollow">https://people.cs.kuleuven.be/~philip.dutre/GI/</a>