TechEcho

7 comments

imurrayalmost 13 years ago

Warning: slight pedantry ahead.Specifically what this article is noticing is the law of large numbers [1], that sums or averages are "close" to their mean. The central limit theorem says that sums converge in distribution towards being normally distributed. But convergence in distribution isn't a strong statement about tail behavior, and so shouldn't be used in the sort of argument in the post.[1] <a href="http://en.wikipedia.org/wiki/Law_of_large_numbers" rel="nofollow">http://en.wikipedia.org/wiki/Law_of_large_numbers</a>

streptomycinalmost 13 years ago

If the goal is to randomly probe the parameter space of a model, the author is correct that independent Gaussian random numbers are a poor choice (although it's not really related to the central limit theorem). However, this isn't a novel revelation. There are other/better ways to do it, for instance <a href="http://en.wikipedia.org/wiki/Latin_hypercube_sampling" rel="nofollow">http://en.wikipedia.org/wiki/Latin_hypercube_sampling</a>

评论 #4134144 未加载

Darmanialmost 13 years ago

While it is well known that uninformed random test generation often does a poor job at probing the parameter space, this has nothing to do with the CLT. Even his first example with the bounded queue, one where the CLT does make predictions, he is incorrect to apply it. That's because what we care about is not the difference between the number of enqueue and dequeue operations (which has expected value 0), but the magnitude of the difference (which has expected value sqrt(N)). And what we care about is not the magnitude of the difference after completing the tests, but the greatest magnitude at any point along the random walk. This is governed by a more advanced theorem called the Law of the Iterated Logarithm, which tells us that the probability of finding the bug is quite a bit higher than the author realizes.

lliiffeealmost 13 years ago

The central limit theorem only applies when random variables are combined linearly. I see no reason to suspect that this is the case with random inputs for testing. (Most functions are nonlinear!)

DaniFongalmost 13 years ago

Why the central limit theorem is usually off:<a href="http://daniellefong.com/2008/01/28/outliers-why-the-central-limit-theorem-is-typically-off/" rel="nofollow">http://daniellefong.com/2008/01/28/outliers-why-the-central-...</a>

评论 #4133688 未加载

bcbrownalmost 13 years ago

I think one approach would be to use RNGs in a slightly different way. Instead of a 'random walk' of various API calls, define a possible outcome category, and choose a random number to be the quantity of that outcome.So for the bounded queue problem, set the outcome category to be "elements in queue". Say you get 3, 12, and 7 as your first three random numbers. Then the first three test cases would be "get the load to 3 elements", "get the capacity to 12 elements", "get the capacity to 7 elements".

评论 #4134723 未加载

zerostar07almost 13 years ago

I can't understand the relation to CLT. When using an RNG you basically sample from the same distribution (thus have only 1 random variable) plus the CLT comes into play when you add (sum, average) different random variables. In the case of the queue, it has nothing to do with the number of cases missed.

The Central Limit Theorem Makes Random Testing Hard

7 comments

The Central Limit Theorem Makes Random Testing Hard

7 comments