>Two, wikipedia refers to E(X) and E(Y) as the means, not the expected value. This gets even more confusing because, at the beginning of the Wikipedia article, it used brackets (E[X]), and now it's using parenthesis (E(X)). Is that the same value? Is it something completely different? Calling it the mean would be confusing because the average of a given data set isn't necessarily the same as finding what the average expected value of a probability distribution is, which is why we call it the expected value. But naturally, I quickly discovered that yes, the mean and the average and the expected value are all exactly the same thing! Also, I still don't know why Wikipedia suddenly switched to E(X) instead of E[X] because it stills means the exact same goddamn thing.<p>IIRC, the mean is the special case of the expected value where the transformation function is the identity function.<p>The general expression for the expected value, where f(x) is the probability density function and g(x) is the transformation function, is:<p>E[g(X)] = \int g(x) f(x) dx<p>Let g be the identity function (g(x) = x); then E[g(X)] is the mean.<p>For example, let's say you bet on the roll of a die such that you win $2 if the result is odd, lose $3 if the result is 2, and neither win nor lose otherwise. Here, the random variable X is the result of the roll: it has 6 possible outcomes.
Your transformation function is the gain you get for each outcome, so g(x) is defined by:<p>1 -> +2<p>2 -> -3<p>3 -> +2<p>4 -> 0<p>5 -> +2<p>6 -> 0<p>Then the expected value E[g(X)] is calculated by summing g(x) multiplied by the probability p(x) over all outcomes x. This is the discrete counterpart of the integral above, with p(x) playing the role of f(x) dx.<p>The first point of the article, in my opinion, is more about statistics than math in general, and I believe people find it weird because of the confusion between X and x, the former being a random variable associated with a distribution function, the latter being a particular value that X can take.<p>The second point of the article is just notational convenience. Strictly speaking, "the derivative of a function" doesn't exist on its own; a derivative is always taken with respect to a variable (df/dx). But if a function has only one variable, the derivative of the function is defined as the derivative with respect to that one variable, and we don't need to name the variable explicitly (f'(x)).<p>You could always write out all the variables of a function (df(x, y, z)/dy instead of df/dy), but that's just a waste of time, especially when you have to write it on every line of a proof.
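<p>Going back to the dice bet: if it helps, here is a minimal Python sketch of the sum of g(x) * p(x) over all outcomes. It assumes a fair die (p(x) = 1/6 for every face), which the example implies but never states, and the names payoff, prob, identity and expected_value are made up for illustration.

```python
from fractions import Fraction

# Payoff g(x) for each face of the die, as listed in the bet above.
payoff = {1: 2, 2: -3, 3: 2, 4: 0, 5: 2, 6: 0}

# Assumption: the die is fair, so p(x) = 1/6 for every face.
prob = {face: Fraction(1, 6) for face in payoff}


def expected_value(g, p):
    """Discrete E[g(X)]: sum of g(x) * p(x) over all outcomes x."""
    return sum(g[x] * p[x] for x in p)


# With g the identity function (g(x) = x), E[g(X)] is the mean of X itself.
identity = {face: face for face in payoff}

expected_gain = expected_value(payoff, prob)   # Fraction(1, 2)
mean_of_roll = expected_value(identity, prob)  # Fraction(7, 2)
```

<p>So under the fair-die assumption the bet is worth $0.50 per roll on average, while the mean of the roll itself is 3.5. Same formula both times, only g changes.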