Faster than C? Parsing binary data in JavaScript.

219 points by tuxychandru over 12 years ago

25 comments

tabbyjabby over 12 years ago
I am <i>extremely</i> doubtful that optimized C is going to be equally performant as optimized JavaScript. There is innate overhead when using an interpreted language, no matter how advanced the interpreter. JavaScript is also garbage collected, while C is not, adding an additional level of overhead.<p>At no point in this article are we shown the code of this new parser. We are also told that it is incomplete. So we have a parser which we can't see and which is not finished, but apparently dominates its C counterpart in performance. This leads me to believe one of two things:<p>1. The parser isn't complete, and its unimplemented functionality is going to be more expensive in terms of performance than the author anticipated, thus rendering his preliminary results void.<p>2. The implementation of the C extension he is comparing against is not very well optimized. As said above, I find it <i>very</i> hard to believe that well optimized C is going to be beaten by well optimized JavaScript.
snprbob86 over 12 years ago
I love the "eval is awesome" section.<p>It's taken me <i>years</i> of study, but I finally "get" Lisp. Let me summarize pg's On Lisp:<p>"Lisp is a language for writing <i>programs fast</i>"<p>1) Write a function<p>2) Write a function that does something similar to #1<p>3) Begin to write a function that does something similar to #1 and #2<p>4) Abstract out the commonalities between #1 through #3<p>5) Encounter all kinds of new use cases for your abstraction in #4<p>6) Wait for #5 to become unwieldy<p>7) Abstract #4 even further, until you have something that resembles an interpreter<p>8) Wait for #7 to become a performance bottleneck<p>9) Write a compiler that takes inputs to #7 and turns them into #1 through #5<p>"Lisp is a language for writing <i>fast programs</i>"
mkaufmann over 12 years ago
For me it is often very motivating (sometimes demotivating) if somebody comes up and implements the same thing I want to do - but they accomplish 10x faster performance (even better if it is written in a cleaner style). The author also seems to rather enjoy those challenges ;)<p>In my interpretation the question "Faster than C?" is relevant because it is very easy to think that certain problems (like parsing binary data) are just not a good fit for language Y and thus the performance can't be good. Even if C performance is not attainable, sometimes one can get closer than one would think.<p>My last question is regarding his optimization example. I don't program JavaScript and thus have difficulty understanding how the optimized variant could ever be faster than the original one. String concatenation and evaluation should not be faster (from my layman's perspective) than setting an indexed value in an array? (OK, actually it probably is not an array but a hash map, but are those so inefficient in JavaScript? Or is it rather behaving like a sorted list where every value is inserted via binary search?) I would be very happy to get some insight.
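
One way to picture the variant being asked about: instead of looping over the column names for every row, the source of a row parser is built by string concatenation once and then compiled. The sketch below is a minimal illustration and not the article's actual code; `parseRowGeneric`, `compileRowParser`, and the column names are hypothetical. The usual explanation for a speedup is that the generated function returns an object literal with a fixed set of keys, which engines such as V8 can specialize more readily than a loop of dynamic property writes. Whether that holds for a given workload is something to measure.

```javascript
// Generic version: one dynamic property write per column, for every row.
function parseRowGeneric(columns, values) {
  const row = {};
  for (let i = 0; i < columns.length; i++) {
    row[columns[i]] = values[i];
  }
  return row;
}

// Generated version: build the source of a specialised function once per
// result set, compile it, and reuse it for every row of that result set.
function compileRowParser(columns) {
  const fields = columns
    // JSON.stringify turns each column name into a safely quoted key.
    .map((name, i) => JSON.stringify(name) + ": values[" + i + "]")
    .join(", ");
  // For ["id", "title"] the body is:
  //   return {"id": values[0], "title": values[1]};
  return new Function("values", "return {" + fields + "};");
}

const exampleColumns = ["id", "title"]; // hypothetical column names
const parseRow = compileRowParser(exampleColumns);
const exampleRow = parseRow([1, "hello"]);
// exampleRow.id === 1 and exampleRow.title === "hello"
```

The string concatenation happens once per query, not once per row, so its cost is amortized over the whole result set.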
PommeDeTerre over 12 years ago
The excessive use of "fuck" and "shit" throughout the entire article really takes away from its message. It just seems very immature. It's hard for me to take it seriously when it sounds like it was written by an angsty teenager who's trying to look tough in front of his friends, or something like that.
eric_bullington over 12 years ago
This is an outstanding overview of how to build fast programs in JavaScript. I really appreciate the fact that Felix took the time to write it. I learned a lot, and will definitely integrate this into my work whenever performance-critical code is involved. Particularly since I enjoy dataviz so much.<p>I'm a little surprised that someone so into JavaScript is using R and ggplot2 to visualize his data, and not d3 along with CoffeeScript for munging data (or Python if you can't stand CoffeeScript). Don't get me wrong, R is a powerful tool, but with d3 you can skip the whole ImageMagick/Skitch step since you're making visualizations directly for the web. Plus, once you've grasped d3's declarative approach (took me a while), it's so easy to quickly make powerful visualizations with it. In fact, it was partially inspired by ggplot2, if I'm not mistaken.<p>And to quickly munge/clean data, I've found CoffeeScript does very well here, similar to how easy it is to use Python for this task. I wouldn't want to write out JavaScript when rapidly trying to get data in the right format for visualizations, but with CoffeeScript and a functionally-oriented library like underscore, it's pretty easy.<p>That said, I'm sure once you've mastered such an intricate tool as R, it's hard to give up that power. But if you're a node devotee and you're looking for a good tool in JavaScript to visualize data, you can't get much better than d3. I know a lot of dataviz folks are learning JavaScript just so they can have access to d3.
masklinn over 12 years ago
&#62; But, life is never this easy. Your code may not be very profilable. This was the case with my mysql 0.9.6 library. My parser was basically one big function with a huge switch statement. I thought this was a good idea because this is how Ryan's http parser for node looks like and being a user of node, I have a strong track record for joining cargo cults. Anyway, the big switch statement meant that the profiler output was useless for me, "Parser#parse" was all it told me about : (.<p>FWIW this is also a case of "get better tools". Line profilers exist, and they can handle that kind of cases (though the instrumentation costs go up likewise).
kghose over 12 years ago
I write code primarily to analyze data from my experiments, and I do it primarily in Python. I have been told (and have read) repeatedly that optimizing is to be done right at the end, and only to optimize the innermost loop, etc.<p>From my reading, this guy is advocating the exact opposite: optimize as you code. It sounds sensible, but what do other people who have to optimize for a living have to say?
latchkey over 12 years ago
I really like the style of this writing. Thanks, I learned a bit. I just wish the presenter dug deeper into the final mystery to round out the entire presentation.
willvarfar over 12 years ago
I think that the drivers can do a lot more to improve DB performance. <a href="http://williamedwardscoder.tumblr.com/post/16516763725/how-i-got-massively-faster-db-with-async-batching" rel="nofollow">http://williamedwardscoder.tumblr.com/post/16516763725/how-i...</a><p>At low load, the DB worker is idle when a request arrives and it can be dispatched immediately. But under load, the DB worker is busy when requests arrive and they build up as a backlog.<p>When a backlog builds, the DB worker can examine the pending requests and combine those that it can to reduce the total number of requests.<p>Not all requests are combinable and there can be subtle rules and side-effects. It is likely that a driver with just a modicum of combining ability would be very conservative e.g. simply combining single-key selects that are queued adjacently or such.<p>Even still, the gains can be massive and performance never worse.<p>(Oh, equally applicable to NoSQL too.)
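
The conservative combining rule described above (merge only single-key selects that sit next to each other in the backlog) can be sketched as follows. This is an illustration under assumptions, not code from any real driver: the request shape (`kind`, `table`, `key`, `sql`) and the function name `combinePending` are hypothetical.

```javascript
// Each pending request is either a single-key select
//   { kind: "get", table, key }
// or something opaque that must never be merged
//   { kind: "other", sql }
function combinePending(pending) {
  const batches = [];
  for (const req of pending) {
    const last = batches[batches.length - 1];
    if (req.kind === "get" && last && last.kind === "get" && last.table === req.table) {
      // Adjacent single-key select on the same table: fold into the previous batch.
      last.keys.push(req.key);
    } else if (req.kind === "get") {
      batches.push({ kind: "get", table: req.table, keys: [req.key] });
    } else {
      // Anything else is passed through unchanged and acts as a barrier,
      // so selects are never reordered across a write.
      batches.push(req);
    }
  }
  return batches;
}

const backlog = [
  { kind: "get", table: "users", key: 1 },
  { kind: "get", table: "users", key: 2 },
  { kind: "other", sql: "UPDATE users SET name = 'x' WHERE id = 2" },
  { kind: "get", table: "users", key: 3 },
];
const batches = combinePending(backlog);
// Three batches: keys [1, 2], then the UPDATE, then keys [3].
```

Each "get" batch could then be sent as one multi-key query, with the rows fanned back out to the original callers. When the backlog is empty or nothing is adjacent, every request becomes its own batch, which matches the claim that performance is never worse than dispatching one by one.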
faragon over 12 years ago
No way. Even with equally efficient code, there is an additional point that is often ignored: data structures are way simpler in C, with less overhead and a higher cache-hit ratio, because more data fits in the CPU data cache.<p>As an example, gcj is as fast as g++ for code generation (both generate machine code, with no virtual machine involved); however, in benchmarks it appears to be slower, just because of the data structures. If you tweak the Java code to use simpler data structures, e.g. integer or byte arrays, data overhead gets reduced, so the memory cache hit rate increases, thus keeping performance similar to C/C++.<p>Other cases, like Java bytecode running on a VM, even with a great JIT like HotSpot, also suffer <i>a lot</i> because of data structures, so even when the generated code is quite decent (runtime profiling, etc.), the penalty is there: code will suffer a great penalty unless running with a huge L3 cache (e.g. 32-48MB), and it is noticeable anyway, no matter how much cache you add, when many more memory indirections are needed.<p>And of course, when comparing, you have to compare equivalent things, e.g. Java &#60;-&#62; C/C++ ports, and not completely different software with different implementations (e.g. optimized built-in string handling vs non-optimized C string handling; ASCIIz string handling is slow because of the stupid C string function implementation, not because of the C language itself, which is the reason C strings are not used for high-performance code, even when writing in C).
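
The "simpler data structures" point carries over to JavaScript, where typed arrays play the role of the integer and byte arrays mentioned above. The sketch below only contrasts the two layouts; it makes no measurement, and the names (`points`, `flat`, `sumYObjects`, `sumYFlat`) are hypothetical.

```javascript
const N = 4;

// Array of objects: each element is a separate heap object,
// reached through a reference stored in the array.
const points = Array.from({ length: N }, (_, i) => ({ x: i, y: i * 2 }));

// Flat typed array: x and y interleaved in one contiguous block
// of 32-bit integers, with no per-element object.
const flat = new Int32Array(N * 2);
for (let i = 0; i < N; i++) {
  flat[2 * i] = i;         // x
  flat[2 * i + 1] = i * 2; // y
}

function sumYObjects(ps) {
  let total = 0;
  for (let i = 0; i < ps.length; i++) total += ps[i].y;
  return total;
}

function sumYFlat(buf) {
  let total = 0;
  for (let i = 1; i < buf.length; i += 2) total += buf[i];
  return total;
}

// Both layouts hold the same data, so both sums are 0 + 2 + 4 + 6 = 12.
```

The flat layout trades readability for density: more values fit in each cache line, which is the effect the comment attributes to plain arrays in C and Java.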
Groxx over 12 years ago
<i>Excellent</i> read. Written well, convincing examples of why it's good advice, and doesn't waste space in making its point.
agentultra over 12 years ago
I guess the hope of high-level languages is that you can build a more interesting vocabulary from complex concepts.<p><a href="http://www.pvk.ca/Blog/2012/08/27/tabasco-sort-super-optimal-merge-sort/" rel="nofollow">http://www.pvk.ca/Blog/2012/08/27/tabasco-sort-super-optimal...</a><p>It seems that there's a lot of effort going into making languages, "faster than C." Perhaps we would all be better off working with languages that just give us better abstractions -- user vocabularies. Layers on layers.<p>Don't get me wrong, C is the right language and a good tool for many situations. It's just that if you're going to extend the vocabulary then why do you have to change the implementation? If you want to optimize why optimize the compiler at such a low level?<p>Either way I just want to say that it's an interesting conversation and I'm looking forward to seeing more.
tantalor over 12 years ago
My only problem with this article was that it didn't once mention parsing binary data, except in the title.
sil3ntmac over 12 years ago
Very worthwhile and well-designed writeup! I do wish that he had expanded the last argument to show what/how he fixed the performance problem. Also, does anyone know <i>why</i> eval is faster than just defining the function? I mean wtf, why doesn't v8 make this optimization?
pmiller2 over 12 years ago
I'll just leave this here: <a href="http://c2.com/cgi/wiki?FasterThanCee" rel="nofollow">http://c2.com/cgi/wiki?FasterThanCee</a>
username_taken over 12 years ago
This is only part of the picture. This version of the driver is much slower than the driver based on libmysqlclient. See the benchmarks at the bottom. It's a more real world test combining both reads and writes.<p><a href="https://github.com/mgutz/mapper" rel="nofollow">https://github.com/mgutz/mapper</a>
overbroad over 12 years ago
"Faster than C" seems to imply that the author is not seeing the true value of C. It is more than just speed.<p>You can only build so much with Javascript.<p>With C, the possibilities are limitless.<p>For a specific task like parsing, use whatever language you want. But please do not believe that by knowing Javascript you can both dismiss C as an optimal language and that you can build anything. You can't, as to either. As it stands, by relying on Javascript you're restricted to a browser or Node.js and whoever controls the browser and Node.js effectively has final control over your opportunities. What's the browser written in? Javascript? What is Node.js written in?<p>Who cares if your parser is faster than C? If it's "fast enough", that's all that matters. Users of big, complex, graphical browsers or MySQL databases are well accustomed to slow speeds. They have learned how to wait.
philhippus over 12 years ago
An equivalent claim: Faster than Assembly? Parsing binary data in JavaScript. Pah.
chmike over 12 years ago
Check the code : <a href="https://github.com/felixge/node-mysql" rel="nofollow">https://github.com/felixge/node-mysql</a><p>Note: it uses node_buffer.cc which is some C++ code in node. So it is not exactly pure Javascript.
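
For readers who want to see what "parsing binary data in JavaScript" looks like at the byte level, here is a minimal sketch of reading a MySQL packet header: a 3-byte little-endian payload length followed by a 1-byte sequence id. It is not taken from node-mysql; the function name `readPacketHeader` and the sample bytes are hypothetical. It accepts any `Uint8Array`, and since a Node `Buffer` is a `Uint8Array` subclass, a `Buffer` can be passed in as well.

```javascript
// Reads the 4-byte MySQL packet header starting at `offset`.
// Returns null when fewer than 4 bytes are available, so a caller
// can wait for more data to arrive from the socket.
function readPacketHeader(bytes, offset = 0) {
  if (bytes.length - offset < 4) return null;
  // Payload length: 3 bytes, least significant byte first.
  const length =
    bytes[offset] |
    (bytes[offset + 1] << 8) |
    (bytes[offset + 2] << 16);
  // Sequence id: 1 byte.
  const sequenceId = bytes[offset + 3];
  return { length, sequenceId };
}

// Hypothetical sample: payload length 5, sequence id 1.
const sampleHeader = readPacketHeader(Uint8Array.of(0x05, 0x00, 0x00, 0x01));
// sampleHeader.length === 5, sampleHeader.sequenceId === 1
```

The indexed reads and bit shifts here are ordinary JavaScript; the C++ mentioned in the comment sits underneath `Buffer` itself, in how the memory is allocated and exposed to the script.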
afhof over 12 years ago
Why does everyone keep saying Doing X in Language Y is faster than X in Language Z?<p>Really they are comparing compilers, not languages. Everyone seems to just keep making potshots at Language Z.
goggles99 over 12 years ago
Please stop with the X higher level language can be faster than C unless you can challenge the limitations imposed by the laws of physics. Not which compiler is better. I can always find a better or worse C or JS compiler/JITter so that proves nothing.<p>Doing more (Which all more highly abstracted languages currently do) in less time with all else being the same is not possible as we understand quantum physics today.<p>This title is clearly linkbait...
camus over 12 years ago
INFLAMMATORY headline for self-branding and publicity, doesn't go much further. Yes, JavaScript can be fast, but in the end, having to write evals to gain performance demonstrates that there is something really wrong with that language...
goggles99 over 12 years ago
Um every fast and decent JS engine IS WRITTEN IN C!!!<p>I refuse to click on anything as ridiculous sounding as "Faster than C? Parsing binary data in JavaScript".<p><i>JavaScript will NEVER be faster than C.</i> If you personally test it and it is - all that you have really proven is that the C compiler you are using was horribly written and is probably 20+ years old. The level of ignorance that programmers have about low level languages and the performance cost of abstractions amazes me sometimes.
andoriyu over 12 years ago
And not a single word about DTrace?
sneak over 12 years ago
Thousands of words about optimizing the reinvention of a wheel.<p>I thought we were hackers? We use libmysqlclient and stfu because we don't give a shit what language a client library is in because it WORKS and isn't unacceptably slow and LETS US SHIP.<p>I really dislike this whole crowd and attitude.