Big-O notation explained by a self-taught programmer

169 points by abrahms, almost 12 years ago

19 comments

YZF almost 12 years ago
The math of Big-O isn't that hard, and the article, while having good intentions, misses the point. Big-O is about asymptotic behaviour, and the graphs are misleading in that regard (well, they're simply wrong, not misleading). There are algorithms where, if you just look at the Big-O, you'd think one has a faster run time than the other, but because of how the constants fall out, that wouldn't be the case for any practical problem size.

What O(N) means is that there is some large enough number where the run-time (edit: or any function, really) is bounded by a constant times the input size for any input size larger than that number (see, math, not hard). That constant may be so large that an O(N^2) algorithm may be a better solution for any practical purpose.

EDIT: As an example of this we can look at two multiplication algorithms, Karatsuba and Schönhage–Strassen, the latter having better asymptotic performance, but that only really kicks in once you have large enough numbers (100,000 digits or so). (http://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strassen_algorithm)
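
As a rough illustration of this point about constants, here is a minimal Python sketch with two invented cost functions (the specific constants are arbitrary, chosen only to make the crossover visible):

    # Two made-up cost models: the "better" Big-O loses until n is enormous.
    def cost_linear_big_constant(n):
        return 1_000_000 * n          # O(n), but with a huge constant factor

    def cost_quadratic_small_constant(n):
        return 0.001 * n * n          # O(n^2), tiny constant factor

    for n in (10, 1_000, 100_000, 1_000_000_000, 10_000_000_000):
        a = cost_linear_big_constant(n)
        b = cost_quadratic_small_constant(n)
        cheaper = "O(n)" if a < b else "O(n^2)"
        print(f"n={n:>14,}: O(n)={a:.3g}, O(n^2)={b:.3g} -> cheaper: {cheaper}")

    # The quadratic model stays cheaper until n reaches roughly 10^9.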
sillysaurus almost 12 years ago
This sort of contributes to giving self-taught programmers a rather bad name. The writeup is good, but the idea that Big O is "scary" is just absurd. It's an elementary concept that every working programmer should be familiar with, regardless of whether they're self-taught. Algorithms are not "scary". If you can't reason about algorithms, you may not be a very good programmer yet.

To be clear, I really appreciate the writeup. I just wish it had been framed better. It should be clear that this is for beginner programmers, regardless of whether they have a degree or whether they're self-taught.
yid almost 12 years ago
Unfortunately, there are some misconceptions propagated in this article. Kudos on the effort, but some statements are just flat-out wrong, such as this one: "Big-O is all about the approximate worst-case performance". Big-O has nothing to do with worst case; it is a bounding function. An O(n) algorithm is also O(n^2), O(2^n), etc. Those are valid bounds on the O(n) algorithm, just not the smallest.
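
A quick numeric sanity check of that claim, as a throwaway Python sketch (f is an arbitrary example function, not anything from the article):

    # f(n) = 3n + 5 is O(n); the same function also satisfies the looser
    # bounds O(n^2) and O(2^n), just with different witnesses.
    def f(n):
        return 3 * n + 5

    for n in range(5, 50):
        assert f(n) <= 4 * n       # O(n):   K = 4 works for every n >= 5
        assert f(n) <= n * n       # O(n^2): K = 1 also works here
        assert f(n) <= 2 ** n      # O(2^n): an even looser, still valid bound
    print("all three bounds hold on [5, 50)")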
tbrownaw almost 12 years ago
> O(N) is read "Order of N" because the O function is also known as the Order function. I think this is because we're doing approximation, which deals in "orders of magnitude".

It's a different meaning of "order" that has to do with the shape of the size-vs-time curve. It's the same meaning as the order of a polynomial ("x" is linear or 1st order, "x^2" is quadratic or 2nd order, etc).

> but Big-O is all about the approximate worst-case performance of doing something.

It can be, but I think it's more commonly taken to mean the average case. One example is quicksort, which is O(n*log(n)) on average but can be quadratic (n^2) in the worst case.
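
A small Python sketch of that quicksort behaviour, counting comparisons with a deliberately naive first-element pivot (illustrative only; real library sorts choose pivots more carefully):

    import random

    def quicksort_comparisons(items):
        """Return the number of element comparisons a naive quicksort performs."""
        comparisons = 0

        def sort(xs):
            nonlocal comparisons
            if len(xs) <= 1:
                return xs
            pivot, rest = xs[0], xs[1:]
            comparisons += len(rest)
            smaller = [x for x in rest if x < pivot]
            larger = [x for x in rest if x >= pivot]
            return sort(smaller) + [pivot] + sort(larger)

        sort(list(items))
        return comparisons

    n = 500
    shuffled = random.sample(range(n), n)
    print("shuffled input:      ", quicksort_comparisons(shuffled), "comparisons")
    print("already-sorted input:", quicksort_comparisons(range(n)), "comparisons")
    # Typical output: a few thousand comparisons for the shuffled case,
    # but n*(n-1)/2 = 124750 for the already-sorted case.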
alanctgardner2 almost 12 years ago
The post is kind of misleading because it confuses "order of magnitude" with the order of a polynomial. To make this very clear:

You have two functions, x^4 and x^3. You can multiply x^3 by any constant multiple a (any number, regardless of size), and as x gets sufficiently large, x^4 will be bigger than a * x^3. This is the point of evaluating asymptotic performance: as your input data approaches an infinite size, only the highest-order term in a polynomial really matters. For example, x^2 + x + 1 will approach x^2 for sufficiently large x - the lower-order terms (x^1 and x^0) don't really matter for a big x.

Technically, big-O refers to a bounding function: x^2 is O(x^2) AND O(x^3), etc., because x^2 is less than or equal to x^2 and x^3 as you approach infinity. For convenience we're usually only interested in the best fit; it's useless to say your algorithm is faster than O(x^5), but it's interesting if it's O(n).

Finally, we also have small-oh notation, which is a lower bound. If your algorithm is never faster, on an ideal input set, than x^2, it's also o(x^2). Note that the same algorithm is also o(n) and o(1), because any algorithm which is slower than (or equal to) x^2 must necessarily be slower than x or 1 (constant-time).

edit: It's worth pointing out I only talked about polynomials because it's pretty intuitive. You can extend this notation to any class of functions - exponentials, logarithms, etc. - but you just have to know that log(x) is O(x) and o(1), etc.
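
Both of the claims about polynomials are easy to check numerically; here is a short Python sketch (the constant 1,000,000 is arbitrary):

    # (1) x^4 eventually dominates a * x^3 no matter how large the constant a is;
    # (2) for large x, x^2 + x + 1 is essentially just its highest-order term x^2.
    a = 1_000_000  # deliberately huge constant on the lower-order function

    for x in (10, 1_000, 1_000_000, 10_000_000):
        dominated = x**4 > a * x**3
        ratio = (x**2 + x + 1) / x**2
        print(f"x={x:>10,}: x^4 > {a}*x^3? {dominated};  (x^2+x+1)/x^2 = {ratio:.6f}")

    # x^4 overtakes 1,000,000 * x^3 once x exceeds 1,000,000, and the ratio
    # (x^2+x+1)/x^2 approaches 1, showing the lower-order terms stop mattering.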
ambiate almost 12 years ago
I find Big-O is mostly applicable to SQL queries. A hard drive/memory bottleneck exists in all database queries. Programmers will easily ignore bad SQL and chalk it up to 'database bottleneck etc'. The truth usually goes along the lines of 'I do not understand temporary tables or views. I just SELECT. I opened a 30,000 row cursor, then a 1,000,000 row cursor, and finally another 500,000 row cursor and got the data (while storing all of the fetches in leaking arrays)!' Usually on keys without indices or tables without primary keys.

    SELECT a, b, c FROM abc  (30,000 rows)
    BEGIN
        SELECT e, d, f FROM edf WHERE e = a  (30k * 1m)
        BEGIN
            SELECT x, y, z FROM xyz WHERE d = x  (30k * 1m * 500k scanned)
            BEGIN
                process()
            END
        END
    END

A customer or manager can respect lowering the growth rate of a query. No mathematics required. Simply, "Your report now runs in 30 seconds instead of 20 minutes." Anyone can compute that!

The mathematics of Big-O gets annoying for average-case scenarios. Instances of:

    def fun(abc):
        l = []
        for x in abc:
            if x % 2 == 0:
                for y in abc:
                    l.append(y)
        return l
shoyer almost 12 years ago
This is a nice explanation, but I couldn't help but notice that the estimate of the number of gumballs in the pictured machine seems closer to 1000 than 100 (contrary to the claim in the article).

You can actually see about 100 gumballs in the picture, so there must be far more hidden behind them -- my guess is closer to 500, which is about twice as close (in order of magnitude) to 1000 as to 100.
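
For what it's worth, the "twice as close" claim checks out if distance is measured in orders of magnitude, i.e. differences of base-10 logarithms; a quick Python check:

    from math import log10

    estimate = 500
    to_100 = abs(log10(estimate) - log10(100))     # ~0.70 orders of magnitude
    to_1000 = abs(log10(estimate) - log10(1000))   # ~0.30 orders of magnitude
    print(f"distance to 100: {to_100:.2f}, distance to 1000: {to_1000:.2f}")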
cliveowen almost 12 years ago
I don't know the first thing about math, but even I find that the classical definition (i.e. the one found in the CLRS book) is pretty straightforward: given a big enough input, the running time will be at most a constant multiple of g(n) if f(n) = O(g(n)), where f is the function that describes the algorithm's running time.
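
Written out, the CLRS-style definition being paraphrased looks like this (standard notation, reproduced here for reference):

    f(n) = O(g(n)) \iff \exists\, c > 0,\ n_0 > 0 \ \text{such that} \ 0 \le f(n) \le c \cdot g(n) \ \text{for all } n \ge n_0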
fmax30 almost 12 years ago
I don't really know why Big-O notation is so common, even though Big O is only an upper bound. For me it is more practical and logical to use the Big Θ (Theta) notation, as it provides a tighter bound which is more understandable. Big O is also very misleading to newcomers, who are usually confused when they see something like O(n) = O(n^2), which is perfectly valid, as the Big O notation is only an upper bound, albeit a loose one in that case.

For all we care, we can write the Big O of

    for (i = 0; i < n; i++) {
        cout << "I am constant time operation";
    }

as O(n!); it won't be mathematically wrong, but again it would be very misleading and loose :). So my advice to everyone is to use the Big Θ notation.

f(x) = Θ(g(x)) when f(x) = O(g(x)) and f(x) = Ω(g(x)).

For those that don't know what Ω(g(x)) (read Big Omega) is: it is a lower bound. In English, that would be that your loop will execute/iterate at least g(x) times.

Before people get any more confused: Θ(g(x)) is a tight bound, which means that your code/loop will run at least C1 * g(x) and at most C2 * g(x) times, where C1 and C2 are two constants.

If anyone is interested, they should really read CLRS. It has an excellent chapter on calculating and explaining time complexities.
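
In the same spirit, the tight bound described above can be written compactly as (standard definition, stated here for reference):

    f(x) = \Theta(g(x)) \iff \exists\, c_1, c_2 > 0,\ x_0 \ \text{such that} \ c_1\, g(x) \le f(x) \le c_2\, g(x) \ \text{for all } x \ge x_0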
bps4484 almost 12 years ago
This is definitely a good starting point. If you wanted to expand this, I would recommend diving into the relationship between the real time an algorithm takes to execute and the order of the function, and into how constants and lower-order terms don't matter in order notation (this can best be shown with graphs, which have already been introduced).

One thing I would stay away from is the talk about "orders of magnitude", because the order of a function and an order of magnitude are very different topics, and it could cause a reader to draw bad conclusions like "one algorithm takes 10 seconds to run, one takes 100 to run, so they must have different big-O notation". I understand why the analogy is made (you're estimating things from a high level), but I think it could cause confusion about the fundamentals.
Hannan almost 12 years ago
Nice writeup!

That said, I got ~500 gumballs in the machine ((container diameter / gumball diameter)^3 * 0.64), so both guesses of 100 and 1000 should be within an order of magnitude. ;)
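
Plugging hypothetical measurements into that formula shows the arithmetic (the comment does not give the actual diameters, so the numbers below are made up purely for illustration):

    # (container diameter / gumball diameter)^3 * 0.64, with invented sizes
    container_diameter_cm = 23.0   # hypothetical measurement
    gumball_diameter_cm = 2.5      # hypothetical measurement
    packing_fraction = 0.64        # typical random close packing of spheres

    estimate = (container_diameter_cm / gumball_diameter_cm) ** 3 * packing_fraction
    print(f"estimated gumballs: {estimate:.0f}")   # ~498 with these numbers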
calhoun137 almost 12 years ago
One of the nice things about big-O notation from a mathematical point of view is that it allows you to avoid dealing with limits.

As everyone knows, calculus is based on limits, but when handled rigorously, limits have a lot of subtle pitfalls that take a lot of work to deal with. For example, given a function f(x,y), if we want to take one limit after another, the result might depend on the order in which the limits are taken. In other words, taking limits is not commutative unless certain conditions are satisfied.

None other than the great Donald Knuth went so far as to claim that all of calculus could be taught without limits! Quoting from [1]:

"Students will be motivated to use O notation for two important reasons. First, it significantly simplifies calculations because it allows us to be sloppy — but in a satisfactorily controlled way. Second, it appears in the power series calculations of symbolic algebra systems like Maple and Mathematica, which today's students will surely be using.

For more than 20 years I have dreamed of writing a calculus text entitled O Calculus, in which the subject would be taught along the lines sketched above."

[1] http://micromath.wordpress.com/2008/04/14/donald-knuth-calculus-via-o-notation/
nhamann almost 12 years ago
This article has way too many words and not enough math. There is, in fact, nothing scary about big O notation once you dissect it, and it's a shame that so many people seem to think otherwise.

Here's the definition: if f and g are functions (let's say real-valued functions defined on the positive reals), then we say that f is big O of g, written f = O(g), if there exist a real number y and a real number K, K > 0, such that

    f(x) <= K * g(x)

for every x > y.

If that makes sense, then done.

Else:

The first thing to do when meeting any mathematical definition that you don't understand is to throw away parts of the definition until you do understand it, then add them back in one by one. In this case, let's forget about the constant.

New definition: For functions f, g, f is blorb of g, denoted f = blorb(g), if there is a y such that

    f(x) <= g(x)

for every x > y.

"f is blorb of g" actually just means that there comes a point after which g is never smaller than f. This gives us the first ingredient of big O: we are concerned only with asymptotic behavior. f could take on immense values for small x and still be O(g) as long as f eventually becomes always smaller than g.

The reason for caring about asymptotic behavior is that we often don't care about the time complexity of an algorithm for very small problem sizes. Even the traveling salesman problem is solvable on a Raspberry Pi for very small problem sizes.

Okay, I hope we understand the above definition. Now we add the constant back into the fold and see if we can make sense of it. From what I can see, the constant is there for computer scientists who want to paint with broader strokes. There can be a huge practical difference between f1(n) = 2n and f2(n) = 2000n (the difference between a computation taking a day and taking 3 years), but they're both O(n) because complexity theorists are more concerned with O(n^2) versus O(2^n) than they are with O(n) versus O(5n). (It could also be because, in practice, algorithms with wildly varying constant factors out in front are rarely seen?)

For an alternative to big O notation, you should check out Sedgewick and Wayne's Algorithms, 4th ed. They use something they call "tilde notation", which preserves the leading constant factor. (See: http://introcs.cs.princeton.edu/java/41analysis/)
oskarkv almost 12 years ago
First of all, O(g(n)) is a set. It is the set of functions f(n) such that there exist positive constants n0 and C with C*g(n) > f(n) when n > n0.

Second, talking about O(g(n)) does not imply that the time complexity being discussed is the worst-case (or any other case) time complexity. One could, for example, say that algorithm A's best-case time complexity is in O(n), and its worst-case time complexity is in O(n^2).
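
In set-builder form, that reads (standard notation, matching the comment's strict inequality):

    O(g(n)) = \{\, f(n) : \exists\, C > 0,\ n_0 > 0 \ \text{such that} \ f(n) < C \cdot g(n) \ \text{for all } n > n_0 \,\}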
stormbrew almost 12 years ago
I think the difficult part for someone with no math background isn't so much the cases he outlines here, which have fairly obvious causes that follow directly from the code, but the various logarithmic complexities, which require a bit more reasoning. Certainly that's what always tripped me up before I put some effort into understanding it more.
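
For anyone in the same spot, a small Python sketch of where one common logarithmic bound comes from: binary search halves the remaining range each step, so the step count grows like log2(n) rather than n.

    from math import log2

    def binary_search_steps(sorted_items, target):
        """Return (found, steps) for a standard binary search."""
        lo, hi, steps = 0, len(sorted_items) - 1, 0
        while lo <= hi:
            steps += 1
            mid = (lo + hi) // 2
            if sorted_items[mid] == target:
                return True, steps
            if sorted_items[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return False, steps

    for n in (1_000, 1_000_000, 1_000_000_000):
        data = range(n)                              # sorted by construction
        _, steps = binary_search_steps(data, n - 1)  # search for the last element
        print(f"n={n:>13,}: {steps} steps (log2(n) ~ {log2(n):.1f})")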
justinlilly almost 12 years ago
Not sure if others are interested, but I'm considering writing a book on CS topics with self-taught programmers in mind. If that's interesting to you, check out https://leanpub.com/computer-science-for-self-taught-programmers/
tryitnow almost 12 years ago
I'm a newbie, and I enjoy articles like this as a starting point for framing the overall concept.

Like most simplifications, it's not entirely correct.

Which is why I like the comments section of HN, where I can learn more about the concept by observing how a wide variety of people explain it (and, in the process, correct the errors of the original article).
datalus almost 12 years ago
I always get big and little oh mixed up.
tomrod almost 12 years ago
I don&#x27;t know if I could explain it much better myself. Kudos.