Mathematical Notation Is Awful

69 points by blackhole almost 9 years ago

29 comments

chilie almost 9 years ago

These are just issues that someone unfamiliar with a field would face. None of them are problems for those of us in the field.

First, the expectation thing. He's using a special case, E(X), and complaining that the more general case doesn't follow the special case. It's like saying "Well the plural of mouse is mice but the plural of house isn't hice!". The general definition of expectation (for a discrete probability distribution) is

E(f(X)) = sum f(x_i) * p(x_i)

If you start with this general definition, both E(X) and E(X^2) are perfectly natural. The author's error of starting with the special case in no way implies an issue with notation.

And how is the fact that Wikipedia is inconsistent between E(X) and E[X] in any way mathematical notation's fault? If you read a novel that starts using ' for quotes and switches to ", that's an issue with the novel (assuming it's not stylistic) and not an issue with typography in general.
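
The general definition above can be sketched in a few lines of Python. This is only an illustration: the fair six-sided die and the helper name `expectation` are assumptions made for the example, not something taken from the article.

```python
from fractions import Fraction


def expectation(f, pmf):
    """E[f(X)] = sum of f(x_i) * p(x_i) over the support of X."""
    return sum(f(x) * p for x, p in pmf.items())


# Assumed example distribution: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = expectation(lambda x: x, die)                # E[X], the special case f(x) = x
second_moment = expectation(lambda x: x ** 2, die)  # E[X^2], the case f(x) = x^2
```

Both quantities come out of the same function; only `f` changes, while the probabilities `p(x_i)` are left untouched.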

jamessb almost 9 years ago

I think the fundamental source of confusion might be due to not understanding the concept of a random variable. In his example, X is a random variable; the expectation E[X] is a functional applied to its probability mass function. Given this, it should not be surprising if it seems to behave differently to "any other function, in math".

If you understand this, I think the notation is natural:

We have a random variable X, which takes value x_i with probability p(x_i). Thus, the random variable X^2 will take value (x_i)^2 with probability p(x_i).

Given that the expectation for the random variable X with p.m.f. p(x_i) is defined as E[X] = \sum x_i p(x_i), it should be clear that to obtain the expectation of any random variable we must sum over the product of (value) and (probability of that value). It should also be clear that this gives E[X^2] = \sum (x_i)^2 p(x_i)

I'm confused by his comment that:

> p(xi) isn't, because it doesn't make any sense in the first place. It should really be just PXi or something, because it's a discrete value, not a function!

The probability mass function is a function: for a given value, it gives the probability that the discrete random variable takes that value. To calculate the expectation we use the values obtained by evaluating the function at discrete points, but what else could we do?

markhkim almost 9 years ago

If you think probability theory is bad...

The old joke that "differential geometry is the study of properties that are invariant under change of notation" is funny primarily because it is alarmingly close to the truth. Every geometer has his or her favorite system of notation, and while the systems are all in some sense formally isomorphic, the transformations required to get from one to another are often not at all obvious to the student.

-- John M. Lee, "Introduction to Smooth Manifolds"

gnuvince almost 9 years ago

I started doing a lot better in calculus when I started using longer notation (e.g. f = x -> x³ instead of f(x) = x³) and making sure that things "type checked". For instance, this tendency to use f(x) to refer to a function, rather than just f, was very confusing to me, because f(x) is an element of the co-domain while f is a function (typically from real to real in my undergrad classes). I had to figure this out by myself because the textbook I was using and the prof both went with the notation that wouldn't type check. When I finally realized that dy/dx should instead be (d/dx)(f), things started being a lot clearer to me: differentiation takes a function and returns a function, and f is a function, so everything checks out.
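
The "type checking" view can be sketched in Python with type hints. This is a rough illustration only: the names `RealFn` and `derivative` are made up for the example, and the derivative is approximated numerically with a central difference rather than computed symbolically.

```python
from typing import Callable

# A real function: takes a real number, returns a real number.
RealFn = Callable[[float], float]


def derivative(f: RealFn, h: float = 1e-6) -> RealFn:
    """Take a function and return (an approximation of) its derivative function."""
    def df(x: float) -> float:
        # Central difference approximation of f'(x).
        return (f(x + h) - f(x - h)) / (2 * h)
    return df


cube: RealFn = lambda x: x ** 3   # f = x -> x³
d_cube = derivative(cube)         # (d/dx)(f): a function, not a number
slope_at_2 = d_cube(2.0)          # f'(2): an element of the co-domain, close to 12
```

Here `derivative(cube)` type checks because `cube` is a `RealFn`, whereas `derivative(cube(2.0))` would be flagged by a type checker, since `cube(2.0)` is a number and not a function.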

mrottenkolber almost 9 years ago

This a thousand times. I boycott math notation. My most memorable math moments are:

- giving a talk on notation in high school (I was not tasked to do this, I decided to do it on my own) because I was FREAKED OUT by how we were using tons of symbols nobody had ever explained or defined

- converting all math I encountered in university to Common Lisp programs to get rid of the shit notation

- bursting into crazy laughter when, after five algebra lectures, the prof noticed that the students parse his notation differently than he does

Give it names, use S-Expressions.

yorwba almost 9 years ago

I had an exam in probability theory yesterday, so these topics are still quite fresh in my mind. The confusion already starts when he uses the equation

<pre><code> E[X] = \sum_{i=1}^{\infty} x_i p(x_i) </code></pre>

for the expectation. In my class it was introduced as

<pre><code> E[X] = \sum_{\omega \in \Omega} X(\omega) P(\omega) = \sum_{ x \in X(\Omega)} x p_X(x) </code></pre>

which totally makes sense if you know that X is a function that assigns a value to each possible outcome. In most cases we don't actually care about the outcomes, so there is the second description using p_X.

The subscript X is important to highlight that p is not just some arbitrary function, it is p_X, the probability mass function of X.

Now when you want to compute the expectation of X^2, you use

<pre><code> E[X^2] = \sum_{\omega \in \Omega} X^2(\omega) P(\omega) = \sum_{x^2 \in X^2(\Omega)} x^2 p_{X^2}(x^2) </code></pre>

i.e. the substitution he wanted to do actually works when you make the dependency of p_X on X explicit.

Now p_{f(X)} is not that easy to compute from p_X in general, because you have to account for multiple possible ways to reach the same value, e.g. x^2 = (-x)^2. For f(x) = x^2 we have

<pre><code> p_{X^2}(x^2) = p_X(x) for x = 0 and = p_X(x) + p_X(-x) otherwise </code></pre>

If f is more complex, there is a third way using

<pre><code> E[f(X)] = \sum_{x \in X(\Omega)} f(x) p_X(x) </code></pre>

which amounts to the same thing, but is usually easier to calculate.
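
A small Python check of the last two formulas, as a sketch under assumptions: the distribution (X uniform on {-1, 0, 1, 2}) and the helper names `pushforward` and `expectation` are invented for the example.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(f, pmf):
    """p_{f(X)}: add up p_X(x) over every x that f sends to the same value."""
    out = defaultdict(Fraction)
    for x, p in pmf.items():
        out[f(x)] += p
    return dict(out)


def expectation(pmf):
    """E[Y] = sum of y * p_Y(y) over the values y that Y can take."""
    return sum(y * p for y, p in pmf.items())


# Assumed example: X is uniform on {-1, 0, 1, 2}.
p_X = {x: Fraction(1, 4) for x in (-1, 0, 1, 2)}
square = lambda x: x * x

p_X2 = pushforward(square, p_X)  # p_{X^2}; the value 1 collects p_X(1) + p_X(-1)
via_p_X2 = expectation(p_X2)     # sum of x^2 * p_{X^2}(x^2)
via_p_X = sum(square(x) * p for x, p in p_X.items())  # sum of f(x) * p_X(x)
```

Both routes give 3/2 here. The second one never has to build p_{X^2}, which is why it is usually the easier calculation.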

bjornsing almost 9 years ago

One interesting observation that popped into my mind when I read the OP: Mathematicians don't write mathematical notation in papers/books by hand anymore, they use a far more verbose language called LaTeX.

Wouldn't it be great if every time you saw a mathematical formula there was a little widget to push that would show you the "source code" in LaTeX++, and LaTeX++ was like LaTeX but made up of stringently defined mathematical operations (like '\element_wise_multiplication' instead of '\plus_sign_with_circle_around_it')? :D

bsaul almost 9 years ago

I remember being so outraged when I was first introduced to derivatives in high school... Seeing that they didn't use the same notation nor the exact same definition in my physics class and my maths class, in the same year, made me absolutely furious...

billconan almost 9 years ago

I have similar feelings about music notation too. We never applied our "user experience" standards to mathematical notation and music notation. If these things were invented today, we might come up with better ideas.

One painful experience I have reading math is telling which variables are "vectors" and which are "scalars". Another is the similar look of certain Greek and English characters, such as alpha and a.

johnhenry almost 9 years ago

Somewhat related: <a href="https://aeon.co/videos/maths-notation-is-needlessly-complex-it-can-and-should-be-better" rel="nofollow">https://aeon.co/videos/maths-notation-is-needlessly-complex-...</a>

zimbatm almost 9 years ago

If I could fix one thing in Maths, it would be to introduce an explicit import statement. Right now it's very hard to work back what the symbols mean in a specific context unless you're familiar with the field.

<pre><code> from url/to/geometric-algebra.pdf import X; </code></pre>

I don't mind the overloading too much and it would always be possible to alias symbols in case two or more fields are used together.

gerbilly almost 9 years ago

Math notation is a language that evolved over centuries.

The English language isn't consistent but we all seem to be able to use it to communicate here.

Same thing with maths. Some notations are holdovers from earlier eras, but we still introduce them to students in case they run into them in an older book (dx/dy for example).

And maths isn't just about computation. It's also about expressing ideas, and sometimes that is easier when the notation isn't rigidly 'executable' as some posters here would have it.

This article also reminds me of: <a href="http://knowyourmeme.com/photos/582861-reaction-images" rel="nofollow">http://knowyourmeme.com/photos/582861-reaction-images</a>

paulpauper almost 9 years ago

Leibniz's notation is useful for the chain rule and for change of variables in parametric equations.

hgibbs almost 9 years ago

The post is a pretty childish rant. One of the great facets of mathematical writing (and not learning, unfortunately) is that you can explicitly define your own notation, and then use said notation whenever you want. The author even notes this in his final paragraph, but doesn't seem to see it as an advantage of mathematical writing.

Prior to modern notation, mathematics was written out in English in full. What we have now is significantly better than what existed before modern mathematical shorthand.

Also, the expectation is an operator and not a 'function' (in the sense that it does not take values in one of the canonical scalar fields e.g. R, C). The notation makes perfect sense in this setting. For example, the expression E[x^2] should be interpreted as E acting on the function x -> x^2 and not on a number x.

nshm almost 9 years ago

As a developer I'm not quite happy with single-letter names. Of course it's ok for minor local things to be named 'x', but for more important values and functions there is no problem these days with having readable names, just like in software. Then you could actually read a paper without guessing and searching for those epsilons, lambdas, kappas and cryptic symbols. Use expectation(x) instead of e(x), use mean(y) instead of \bar y and so on.

erdewit almost 9 years ago

In my experience people that are very verbose, that talk or write at great length and with ease, tend to have a dislike for terse formulations. Sometimes to the point of being offended by them. Math is just about the ultimate in terse notation and perhaps the author is one of these verbose people.

The same thing can be seen with programmers: The verbose programmer writes a lot of lines of code and is proud of it, while the terse programmer is proud to remove or simplify code.

htns almost 9 years ago

This is just really weakly argued. Firstly, the notation does use capital X and lowercase x. You need to realize these are different. It's not substitution to go from X^2 to x^2. Capital letters are not real numbers, and thus it's a type error too. Secondly, if you are familiar with differentiation and integration, you are familiar with straight substitution not always being correct.

amelius almost 9 years ago

One of the cool things of Mathematica is that it solves these problems. But the result is, unfortunately, somewhat more verbose.

k__ almost 9 years ago

Their naming isn't that good either.

All these words that are already used for other things.

Magma, ring, group, body, lens, optic...

jokoon almost 9 years ago

Funny, I said the same about Andrew Ng's ML course.

Ultimately, if what you're teaching is going to end up in software, why use math at all? Use code or pseudo code. I don't think it's bad to just give the working algorithm without having to prove the math.

Really, how many students will end up being computer scientists anyway, and research and write about new methods of doing AI, and do the actual math? So few. I guess that's a simple criticism of academics.

It's just easier to work with code than mathematical notation most of the time, in my view. You can't replace math, of course, but when things are simple enough, it could be avoided. It's a matter of making math accessible to the most people.

Code is amazing because a computer can check to see if it works. A computer doesn't understand math.

krapht almost 9 years ago

Domain-specific languages, notation, and vocabulary exist for many disciplines. They are for the benefit of experienced practitioners and allow them to express common concepts and ideas in a precise and concise fashion to those who have also been trained in the art.

Sometimes jargon, abbreviations, and cryptic notation exist to artificially make it harder for outsiders to understand what's going on, but the examples in this post don't convince me. The true issue here is that the appropriate Wikipedia article has no link to the "Simple English" version.

Most humans are capable of using context to distinguish usage. Operator overloading in programming languages and in mathematics can be useful sometimes.

panic almost 9 years ago

If you're doing math, you need to be able to manipulate the symbols easily. These notations develop because they're easy to work with: consistency is less important than brevity. Lambda calculus terms may be easier to read, but they're much slower to write, and they take up more space on the page or blackboard.

That said, once you've arrived at some kind of result, you could switch to a more consistent notation for explaining whatever you've found. It doesn't necessarily have to be textual, either (see <a href="http://worrydream.com/KillMath/" rel="nofollow">http://worrydream.com/KillMath/</a>).

everyone almost 9 years ago

I just thought.. It would be great to have a website where mathematical formulas are expressed as pseudocode.

Longwelwind almost 9 years ago

>Two, wikipedia refers to E(X) and E(Y) as the means, not the expected value. This gets even more confusing because, at the beginning of the Wikipedia article, it used brackets (E[X]), and now it's using parenthesis (E(X)). Is that the same value? Is it something completely different? Calling it the mean would be confusing because the average of a given data set isn't necessarily the same as finding what the average expected value of a probability distribution is, which is why we call it the expected value. But naturally, I quickly discovered that yes, the mean and the average and the expected value are all exactly the same thing! Also, I still don't know why Wikipedia suddenly switched to E(X) instead of E[X] because it stills means the exact same goddamn thing.

IIRC, the mean is the particular case of the expected value where the transformation function is the identity function.

The general expression of the expected value, where f(x) is the probability density function and g(x) is the transformation function, is:

E[g(X)] = \int{g(x) * f(x) * dx}

Let g be the identity function (g(x) = x), then E[g(X)] is the mean.

For example, let's say that you bet on the throw of a die such that you win 2$ if the result is odd, lose 3$ if the result is 2, and win or lose nothing otherwise. Here, the event X is the throw of the die: you have 6 possible outcomes. Your transformation function is the gain you get for each outcome, therefore g(x) is defined by:

1 -> +2
2 -> -3
3 -> +2
4 -> 0
5 -> +2
6 -> 0

Then, the expected value E[g(X)] can be calculated by summing all the g(x) multiplied by the probability p(x) for all x (all outcomes).

The first point of the article, in my opinion, is more about statistics than math in general, and I believe people find it weird because of the confusion between X and x, the former being a random variable associated with a distribution function, the latter being a value in the domain of X.

The second point of the article is just ease of notation. The derivative of a function doesn't exist; a derivative is always made with respect to a variable (df/dx), but if a function has only one variable, then the derivative of the function is defined as the derivative of the function with respect to the one and only variable and we don't need to write this only variable (f'(x)).

You could always write the variables of a function (df(x, y, z)/dy instead of df/dy), but that's just a waste of time, especially when you have to write it for every line of the demonstration.
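
The dice bet above can be worked through in Python. This is a sketch that assumes the die is fair (the comment does not say so explicitly); the variable names are illustrative.

```python
from fractions import Fraction

# The payoff g from the comment: win $2 on odd, lose $3 on a 2, nothing otherwise.
g = {1: 2, 2: -3, 3: 2, 4: 0, 5: 2, 6: 0}

# Assumption: the die is fair, so p(x) = 1/6 for every face.
p = {face: Fraction(1, 6) for face in g}

# With g the identity function, E[g(X)] is just the mean of X.
mean_of_X = sum(x * p[x] for x in p)

# With the betting payoff, E[g(X)] is the expected gain per throw.
expected_gain = sum(g[x] * p[x] for x in p)
```

Under the fair-die assumption the mean is 7/2 and the expected gain is (2 - 3 + 2 + 0 + 2 + 0) / 6 = 1/2, i.e. 50 cents per throw.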

daniel-levin almost 9 years ago

The problem here is not the notation. Mathematical notation is not perfect, and can sometimes be confusing. Let me say this in a minimally offensive manner, without being obfuscatory: the author of this piece does not understand the mathematics underpinning the notation he is using. The root cause seems to be the use of probability theory in a cookbook manner. It can hardly be surprising that confusion results.

The first example is a formula. An instance of magic, in the sense that you use it to compute, without knowing what it does. The $x_i$'s are not quantified. What are they? Are they real numbers? Matrices? Elements of some semi-group? How can you expect to understand the "formula" if the summand is not explained? At best, I can say that it is a formal sum of something. We can forget discussing convergence or it being well-defined. You can cook up arbitrarily 'nice' notation. It won't help. This notation is absolutely fine for someone who can infer that the support of the distribution of X is some denumerable set {x_i}, equipped with p.m.f. p.

A suitable definition of the expected value (as an operator) would have cleared up all the confusion with the variance and E[X^2] vs (E[X])^2. This confusion is not the notation's fault. It is the user's fault for not knowing what E[f(X)] means (for some appropriate meaning of the symbol f).

>> Only the first xi is squared. p(xi) isn't, because it doesn't make any sense in the first place. It should really be just PXi or something, because it's a discrete value, not a function!

Functions are not algebraic expressions by which we associate one real number with another. In fact, we call p(x_i) the probability mass function. It seems to be a common flaw in many undergrad programs. Formulas and functions are never made distinct. The vast majority of functions f : R -> R do not admit an expression in a formula.

The example with the different notation for "derivatives" is a good non-example. The so-called Leibniz notation is used because it allows people to make statements with differential forms, without needing to invoke exterior algebra. If this is done correctly, statements such as "dy = f'(x)dx" can be made fully rigorous, if need be. Students are told that dy/dx is not a fraction, and yet it is used exactly as though it were. This confuses people - because they don't know what is going on. The dot-notation for derivatives is extremely useful in classical mechanics.

Notation is a crutch for succinct and meaningful writing amongst the initiated. One cannot expect to be able to use these tools without knowing what is going on, or by suspending a great deal of questions.

>> There must be other ways we can explain math without having to explain the extraordinarily dense, outdated notation that we use.

My final gripe with this post. We typically use clean and modern notation. It could be so much worse! Also, if we didn't re-use symbols, then we would run out, very quickly. Mathematics exists independently of the symbols we use to communicate it.
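
To make the E[X^2] versus (E[X])^2 distinction concrete, here is a short worked example. It assumes X is the result of a fair six-sided die, which is not an example from the article, and uses the standard identity Var(X) = E[X^2] - (E[X])^2.

```latex
\begin{align*}
E[X]     &= \sum_{i=1}^{6} i \cdot \tfrac{1}{6} = \tfrac{7}{2}, \\
E[X^2]   &= \sum_{i=1}^{6} i^2 \cdot \tfrac{1}{6} = \tfrac{91}{6}, \\
(E[X])^2 &= \tfrac{49}{4}, \\
\operatorname{Var}(X) &= E[X^2] - (E[X])^2
          = \tfrac{91}{6} - \tfrac{49}{4} = \tfrac{35}{12}.
\end{align*}
```

The two quantities differ (91/6 against 49/4), and their difference is exactly the variance.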

arekkas almost 9 years ago

When you post this on Facebook, your account will be blocked and you're forced to download anti-malware from Kaspersky. Outlier detection at its worst.

donorman almost 9 years ago

Seems like you are talking more about statistical notation than mathematical notation. Imo the topic is a tad off.

donorman almost 9 years ago

Seems like you are talking more about statistical notation than mathematical notation. Please correct me if I'm wrong, I just think the topic is a tad off.

efz1005 almost 9 years ago

Real maths is done either in books or papers, where the authors explain every bit of notation they will use. This is not the case in online resources, which is indeed a bad practice... But hey, don't blame maths for that.

And regarding the "understanding maths in terms of computer programs", sure that's possible to do with _some_ topics, but you just can't expect a computer to represent _every_ concept you'd like.
