That was a really interesting read, and very well written. I wonder if anyone can clear this up though...<p>I find the terminology of open and closed intervals contradictory to their meaning. Does anyone know why they are described like this?<p>`Closed` makes me think shut or not-including - however it includes its endpoints. `Open` makes me think inclusive - yet does not include its endpoints.
I always cringe when I see or have to write:<p><pre><code> for ( ; ; )
</code></pre>
or:<p><pre><code> while ( true )
</code></pre>
etc.<p>You just know you're setting yourself up for undocumented bugs later.<p>I've been known to build "escape" vars into these, like:<p><pre><code> int attempts = 1000;
 for ( ; attempts > 0; attempts-- ) {
     /* do something; break out on success */
 }
 if ( attempts <= 0 ) {
     /* bail out - the loop never succeeded */
 }
</code></pre>
Where 1000 (or whatever number) is a generous bound - say, ten times the iterations the function could reasonably need.
At least one, possibly two other bugs lurking in the implementation.<p>1) Algorithm FT says:<p><pre><code> 1. Generate u. Store the first bit of u
as a sign s (s=0 if u<1/2, s=1 if u>=1/2).
</code></pre>
and yet the C code implements<p><pre><code> if ( u <= 0.5 ) s = 0.0;
else s = 1.0;
</code></pre>
2) I can't be sure of the following w/o access to doc. But i4_uni() says<p><pre><code> a uniform distribution over (1, 2147483562)
</code></pre>
which, offhand, is suspicious. A distribution over positive integers would probably want to use <i>all</i> available values in a 32-bit signed int, so it would most likely end at 2^31 - 1, which is 2147483647, not the value given.
It would be safest for most rand() functions to omit both zero and one, unless a user was <i>really</i> sure they wanted otherwise. If we were generating real numbers, we'd <i>never</i> see precisely zero or one; the fact that we do is an artifact of limited precision. These boundary cases cause problems in common computations like u*log(u) or (1-u)*log(1-u).
(Also posted 20 hours ago, also no comments.) (It occurs to me that if there are ever comments on this post, my comment will sound really confusing: let it be clear that this article is #1 currently, was posted 40 minutes ago, and there are no comments yet. ;P)<p><a href="https://news.ycombinator.com/item?id=8453042" rel="nofollow">https://news.ycombinator.com/item?id=8453042</a>
Thumbs up for not using rand, but assuming that MT is a silver bullet is not exactly scientific either; one should test a few RNGs - it may turn out that the code exploits some ultra-hidden hole in MT, or that some much faster RNG works equally well. Also, reproducibility of a stochastic code means that the results lead to the same conclusions regardless of the seed, not that you get bit-for-bit identical output for the same seed. If one assumes (only) the latter, it may end in seed cherry-picking, in not optimizing code because it would "break reproducibility", or in not investigating the natural deviation of the results.
A great exposition, and worth it just for the introduction to "schrödinbug". <a href="http://en.wikipedia.org/wiki/Heisenbug#Related_terms" rel="nofollow">http://en.wikipedia.org/wiki/Heisenbug#Related_terms</a>
Really interesting read.<p>I wonder about the resolution: was snorm() reimplemented correctly, or was an RNG with the 'wrong' interval supplied?