> Statisticians are quick to reach for the Central Limit Theorem, but I think there’s a deeper, more intuitive, more powerful reason.

> The Normal Distribution is your best guess if you only know the mean and the variance of your data.

This is putting the cart before the horse, for sure. You only know the mean and the variance of your data because you chose to summarize your data that way. And you chose to summarize it that way *in order to get the normal distribution* as the maximum entropy distribution.

The normal distribution appears in a lot of places because it is the limiting distribution of sums and averages of many other distributions; that is the Central Limit Theorem. It is also very easy to work with: sums and differences of independent normal random variables are again normal, sums of *other* distributions tend to come out closer to normal, and a lot of the calculations reduce to plain linear algebra.

So, you choose to measure mean and variance in order to make the math easier. That does not always give the best outcome. If you need more robust statistics, you might go for the median and the mean absolute deviation rather than the mean and variance. The maximum entropy distribution for those summaries is the Laplace distribution, which is far less convenient to work with mathematically than the normal: sums of Laplace variables are not Laplace, for instance.
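
To pin down the maximum entropy claim, these are the two standard results being contrasted, written out as a small LaTeX note (the symbols are just the constrained summaries: mean, variance, location, mean absolute deviation):

    \documentclass{article}
    \usepackage{amsmath, amssymb}
    \begin{document}
    % Constraining the mean and variance singles out the normal density:
    \[
      \max_p \Big\{ -\!\int p(x)\log p(x)\,dx \Big\}
      \ \text{s.t.}\ \mathbb{E}[X]=\mu,\ \operatorname{Var}[X]=\sigma^2
      \ \Longrightarrow\
      p(x)=\tfrac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/(2\sigma^2)}.
    \]
    % Constraining the mean absolute deviation about a point m singles out the Laplace:
    \[
      \max_p \Big\{ -\!\int p(x)\log p(x)\,dx \Big\}
      \ \text{s.t.}\ \mathbb{E}\lvert X-m\rvert=b
      \ \Longrightarrow\
      p(x)=\tfrac{1}{2b}\,e^{-\lvert x-m\rvert/b}.
    \]
    \end{document}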
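
A quick numerical sketch of the two concrete claims in the middle paragraph, closure under addition and the CLT pull toward normality (the sample sizes and parameters below are arbitrary choices of mine):

    # Sums of independent normals are exactly normal; sums of non-normal
    # variables drift toward normal.  Parameters here are illustrative only.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 100_000

    # Closure under addition: N(1, 2^2) + N(3, 4^2) has mean 4 and variance 20.
    s = rng.normal(1.0, 2.0, n) + rng.normal(3.0, 4.0, n)
    print(s.mean(), s.var())                   # close to 4 and 20

    # CLT: a sum of 50 uniform(0, 1) draws is already very close to normal.
    u = rng.uniform(0.0, 1.0, (n, 50)).sum(axis=1)
    print(stats.skew(u), stats.kurtosis(u))    # skew and excess kurtosis near 0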
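
And a rough illustration of the robustness trade-off in the last paragraph (the contaminated data set is invented for the example):

    # A few gross outliers blow up the standard deviation and nudge the mean,
    # while the median and mean absolute deviation move far less.
    import numpy as np

    rng = np.random.default_rng(1)
    clean = rng.normal(0.0, 1.0, 1_000)
    dirty = np.concatenate([clean, [50.0, 60.0, 70.0]])    # three gross outliers

    for name, x in [("clean", clean), ("dirty", dirty)]:
        med = np.median(x)
        mad = np.abs(x - med).mean()           # mean absolute deviation about the median
        print(f"{name}: mean={x.mean():.2f} std={x.std():.2f} "
              f"median={med:.2f} mean_abs_dev={mad:.2f}")

    # The robust pair plugs straight into a Laplace fit (loc=median,
    # scale=mean_abs_dev), just as (mean, variance) plugs into a normal fit.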