
O-notation considered harmful (use Analytic Combinatorics instead)

37 points by jgrant27 over 12 years ago

13 comments

rcoh over 12 years ago

I became irritated when quick sort was claimed to be O(n^2). Not only is that not true in practice, it's just wrong for all but the simplest implementations of the algorithm.

With minor tweaks to the selection of partitions, quick sort can be made O(n lg n). Specifically, if combined with linear-time median finding, we can pick perfect partitions every time. (Of course, this algorithm is painfully slow in practice.)

All this said, the idea of analytic combinatorics is interesting for a deeper understanding of algorithm performance. It quantifies behavior that good engineers understand is swept under the rug by Big-O notation.

Of course, if you really want good performance, you'll have to understand the algorithm beyond just its combinatorial properties -- cache locality and coherence, branch mispredictions, function call overhead, etc. play an enormous role when it comes time to really make code fast.

To write fast code, you need the whole picture.
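The worst-case O(n lg n) quicksort rcoh describes can be sketched as follows, using the textbook groups-of-five median-of-medians trick for pivot selection (a sketch for illustration, not a tuned implementation -- and, as the comment says, slow in practice):

```python
def median_of_medians(xs):
    """Return a pivot guaranteed to be 'central enough' for balanced
    partitions, computed in O(n) time via the groups-of-5 trick."""
    if len(xs) <= 5:
        return sorted(xs)[len(xs) // 2]
    groups = [xs[i:i + 5] for i in range(0, len(xs), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    return median_of_medians(medians)

def quicksort(xs):
    """Quicksort with a guaranteed O(n lg n) worst case, thanks to
    linear-time pivot selection (large constant factors in practice)."""
    if len(xs) <= 1:
        return xs
    pivot = median_of_medians(xs)
    less = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([5, 3, 8, 1, 9, 2, 7]))  # [1, 2, 3, 5, 7, 8, 9]
```

Even an already-sorted or reverse-sorted input, the classic quadratic trap for naive pivot choices, stays O(n lg n) here.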
guard-of-terra over 12 years ago

They can't be serious if they don't provide a cheat sheet -- a page's worth of notations for common algorithms, demonstrating how to use the thing and how it's superior.

Here in the real world we don't have time to read an 800+ page book on a subject promising to bring marginal productivity benefits. Me, I'd better read some fiction instead. It's so much better for your soul than any CS topic.
jules over 12 years ago

Quick sort being O(n^2) and merge sort O(n log n) has *nothing* to do with O-notation (asymptotic analysis) vs analytic combinatorics (exact step counting). With analytic combinatorics you may get the exact number of steps the algorithm takes, yes, but the leading term in quicksort's step count would still be n^2 and the leading term in mergesort's step count would still be n log n. The problem this post is about has *everything* to do with worst case vs average case. You can analyze the average case and the worst case both asymptotically and exactly.

As far as the practical utility of the techniques goes, for analyzing time, O-notation is clearly much better. Analyzing the exact number of steps taken is in most cases impossible, even with the powerful tools of analytic combinatorics. Analyzing the asymptotic time complexity is trivial in comparison. Even if you *do* manage to stumble on a case where it's possible to determine the exact number of steps, that still doesn't tell you anything about the constant factors involved in the algorithms, since you just have the number of steps for some definition of a step (those objecting that you could determine exactly how many CPU cycles an algorithm will take are living in the past -- with modern processors this is no longer feasible at all). The only thing an exact step count can get you is the non-leading terms that O-notation hides. For example, an algorithm might be O(n^2) when in fact the number of steps is n^2 + n. Knowing that extra +n is (almost) never of any practical importance, especially given that we're counting steps, not time in seconds.

Don't get me wrong, analytic combinatorics is a beautiful subject and may even come in handy in *some* practical cases, but this post is vastly over-hyping it.

By the way, even if you do want to count combinatorial structures exactly, instead of going the analytic combinatorics route it often makes more sense in practice to just define the exact count recursively and memoize. You don't get a closed-form solution this way, but it is much quicker to do and can handle far more cases.
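The recurse-and-memoize approach can be sketched with a classic example: the exact expected comparison count of quicksort under a uniformly random pivot, C(n) = (n - 1) + (2/n) * sum of C(k) for k < n. Exact rational arithmetic via the standard library's fractions sidesteps any need for a closed form:

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def avg_comparisons(n):
    """Exact expected number of comparisons made by quicksort with a
    uniformly random pivot: C(n) = (n - 1) + (2/n) * sum(C(k), k < n)."""
    if n <= 1:
        return Fraction(0)
    return (n - 1) + Fraction(2, n) * sum(avg_comparisons(k) for k in range(n))

print(avg_comparisons(3))           # 8/3
print(float(avg_comparisons(100)))  # ~647.85 -- far below the n^2 = 10000 worst case
```

No closed form is derived, yet the count is exact for every n the cache can reach; this is the trade-off jules describes.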
revelation over 12 years ago

Of course O-notation is mostly misguided, but then there are places where it comes back to bite you. Remember that DoS that worked on basically every web framework out there, because everything is a clever hash table nowadays? O(n^2) means very slow, very quickly.

And I certainly wouldn't want to use a database with O(n) lookup or worse.
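The hash-flooding failure mode revelation alludes to is easy to reproduce with a toy chained hash table (a sketch; the real attacks exploited predictable string-hash functions in language runtimes, but the asymptotics are the same):

```python
class ChainedTable:
    """Toy hash table with separate chaining and a fixed bucket count."""
    def __init__(self, nbuckets=64):
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, key, value):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(chain):  # O(chain length) scan
            if k == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))

class EvilKey:
    """Adversarial key whose hash is constant: every insert lands in the
    same bucket, so n inserts cost 0 + 1 + ... + (n-1) = O(n^2) scans."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 0
    def __eq__(self, other):
        return isinstance(other, EvilKey) and self.n == other.n

t = ChainedTable()
for i in range(1000):
    t.insert(EvilKey(i), i)            # average case O(1), adversarial O(n)
print(max(len(b) for b in t.buckets))  # 1000 -- everything in one chain
```

The "average case O(1)" claim silently assumes non-adversarial inputs; the DoS worked precisely because attackers chose the input distribution.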
JD557 over 12 years ago

I can't say I completely agree with this. Even though big-O notation is far from perfect, I assume analytic combinatorics has its problems as well: since it's based on the scientific method, it should have some errors (although I have not read about it, so I might be wrong).

Also, I've never seen anyone claim that quicksort is O(n^2). Usually only the average case is considered, where it's O(n log n). This is where I believe analytic combinatorics should come in: when you want to compare two algorithms with the same order of growth in the average case. Otherwise, I think it's better to use big-O (using analytic combinatorics to compare linear vs. exponential growth algorithms seems a little bit of an "overkill").
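Distinguishing two algorithms with the same order of growth, as JD557 suggests, can also be done empirically by instrumenting the comparison operator. A sketch contrasting Python's built-in sort with a heapq-based heapsort, both O(n log n) but with different constants:

```python
import heapq
import random

class Counted:
    """Wraps a value and counts how often it is compared."""
    count = 0
    def __init__(self, v):
        self.v = v
    def __lt__(self, other):
        Counted.count += 1
        return self.v < other.v

def comparisons(sort_fn, data):
    """Number of element comparisons sort_fn performs on a copy of data."""
    Counted.count = 0
    sort_fn([Counted(x) for x in data])
    return Counted.count

def heapsort(xs):
    heapq.heapify(xs)
    return [heapq.heappop(xs) for _ in range(len(xs))]

data = [random.random() for _ in range(10_000)]
print("built-in sort:", comparisons(sorted, data))
print("heapsort:     ", comparisons(heapsort, data))  # typically noticeably more
```

Both counts grow like n log n; the gap between them is exactly the constant-factor information that big-O discards and step counting (analytic or empirical) recovers.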
rck over 12 years ago
Using complex analysis to study generating functions that describe time complexity is definitely interesting, and there are some cases where using generating functions can make a complicated analysis a bit easier to deal with, but I don't see any reason to think that analytic combinatorics is more practical on a day-to-day basis than big-O notation. Analytic combinatorics is much closer in spirit to the techniques that Knuth uses to analyze performance in The Art of Computer Programming, and I don't know many programmers who prefer Knuth's style of analysis to big-O.
patrickg over 12 years ago

Isn't that why we look at best/average _and_ worst case, not only the worst case alone?
jiggy2011 over 12 years ago

IMO, understanding big-O helps you avoid doing stupid stuff like O(n!) where the problem can grow faster than your ability to purchase hardware to keep up. But you probably know intuitively when you are writing code like this.

If you have very performance-critical code you *probably* have some sort of range of inputs in mind, in which case it's more practical to just do benchmarking and statistics-based evaluation.
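Benchmarking over an expected input range, as suggested above, takes only a few lines with the standard library's timeit (a sketch: swap in your own candidate functions and realistic sizes; best-of-N is used to damp scheduler noise):

```python
import random
import timeit

def bench(fn, sizes, repeats=5):
    """Return {n: best-of-N wall time in seconds} for fn on random input."""
    results = {}
    for n in sizes:
        data = [random.random() for _ in range(n)]
        # fresh copy per run so in-place mutation can't skew later runs
        results[n] = min(timeit.repeat(lambda: fn(list(data)),
                                       number=1, repeat=repeats))
    return results

for n, t in bench(sorted, sizes=[1_000, 10_000, 100_000]).items():
    print(f"n={n:>7}  {t * 1e3:8.2f} ms")
```

Plotting these times against n over your actual input range usually answers the practical question faster than any asymptotic argument.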
danek over 12 years ago

What is the origin of these articles that take the form "X Considered Harmful"?

What's wrong with adding the word "is" between "X" and "considered"?

Who is doing the considering?

What is X harmful towards?

Can anyone explain why this is a thing?
freework over 12 years ago

Most of the time, if you're thinking about big-O, you're practicing premature optimization. Just write the code using your language's sort() function. If by profiling you determine that the call to sort() is a bottleneck, then *AND ONLY THEN* should you consider analyzing the big-O implications of the different sort algorithms.
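The profile-first workflow freework describes might look like this with the standard library's cProfile (a sketch; process_records is a hypothetical stand-in for your own hot path):

```python
import cProfile
import io
import pstats
import random

def process_records(records):
    """Hypothetical workload: is the sort really the bottleneck?"""
    records = sorted(records)         # candidate bottleneck
    return [r * 2 for r in records]   # the rest of the pipeline

profiler = cProfile.Profile()
profiler.enable()
process_records([random.random() for _ in range(100_000)])
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())  # only now decide whether sort() is worth optimizing
```

If the sort doesn't dominate the cumulative-time column, reaching for a cleverer algorithm is wasted effort.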
amit_m over 12 years ago

I don't get this post.

O-notation specifies the asymptotic behavior of (mathematical) functions, e.g. sqrt(n) = O(n/log(n)).

In order to use O-notation to describe the performance of an algorithm, one must specify: 1. what is being measured, and 2. what the class of inputs is.
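amit_m's example claim, sqrt(n) = O(n/log(n)), is easy to sanity-check numerically: the ratio sqrt(n) / (n/log(n)) simplifies to log(n)/sqrt(n), which shrinks toward 0 as n grows (a quick sketch, not a proof):

```python
import math

for n in (10**k for k in range(1, 7)):
    # sqrt(n) / (n / log n)  ==  log(n) / sqrt(n)
    ratio = math.sqrt(n) / (n / math.log(n))
    print(f"n=10^{round(math.log10(n))}  ratio={ratio:.4f}")
# the ratio is eventually bounded (indeed tends to 0),
# which is exactly what sqrt(n) = O(n/log(n)) asserts
```

A decreasing, bounded ratio is precisely the witness the O-notation definition asks for: some constant C with sqrt(n) <= C * n/log(n) for all large n.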
gnosis over 12 years ago

Cached version: http://jng.imagine27.com.nyud.net/index.php/2013-02-10-121226_analytic-combinatorics-is-better-o-nation-considered-harmful.html
tbirdz over 12 years ago

Just in case you missed it, the book is free off the author's website: http://algo.inria.fr/flajolet/Publications/book.pdf