I use Nim instead of Python for data processing (2021)

95 points by archargelod 8 months ago

11 comments

jillesvangurp 8 months ago
Python's popularity is self-reinforcing. It's popular because it's popular, not because it is fast or particularly good at anything it does. It arguably isn't that great for a lot of things people use it for. A lot of that popularity just stems from people already knowing it and the existence of lots of libraries and tools. It's a bit like the Visual Basic (if you are old enough to remember that being popular) of the data science world. Lots of people use it who aren't necessarily very good software engineers or even focused on that.

It's not my first choice for a lot of things, but I can work with it when needed and it gets the job done. I'm currently doing a side project in Python for one of my clients and I'm getting familiar with some new things like async. It's fine.

Nim, Mojo, and a few other Python dialects kind of point to the future for Python itself as well. The recent decision to get rid of the GIL is a step in the right direction, and there has been incremental progress on interpreter and JIT internals in recent releases. It might never get quite as fast, but it can get a lot closer to those.

And having some new options, other than C, for implementing stuff that needs to be fast is probably a good thing. A lot of popular Python libraries don't necessarily contain a lot of Python code. You see the same in the JavaScript world, where a lot of WASM has been creeping into the space lately.
in_ab 8 months ago
Nim is fast, powerful and has a lightweight syntax. I used it a lot for hobby projects. I wrote a program in Nim that started bringing in some money. But as soon as feature requests from customers started coming in, I had to rewrite it.

The tooling and library ecosystem for Nim is just not there. It's so much more productive to work with Python, .NET or the JVM. I decided to rewrite it in Kotlin because the JVM gave me similar performance.
big-green-man 8 months ago
I like Nim, though I agree that the toolset isn't there yet. However, I have encountered an interesting quirk with Nim that kind of drove me crazy.

In a lot of languages, string types are syntactic sugar for arrays or lists of chars. And usually, when you try to parse a string as a list, because it's just syntactic sugar, it works flawlessly. In Nim it is also true that strings are just lists of chars, but for some reason the compiler will not allow you to treat them as such! It seems to have all kinds of special corner cases where doing one thing or another behaves differently. It doesn't really seem to have a holistic fundamental design or form; it feels to me like just a bunch of stuff slapped together.

But if you can get to know the quirks, it's incredibly powerful, if for no other reason than that tooling exists to transpile Nim into anything. Well, almost. A core part of the design philosophy is to leverage as much existing tooling as possible, so it inherits this property and enables you to compile Nim for just about any architecture and into any language. This to me is incredibly powerful.
dang 8 months ago
Discussed at the time:

"Why I Use Nim instead of Python for Data Processing" - https://news.ycombinator.com/item?id=28626947 - Sept 2021 (179 comments)
bionsystem 8 months ago
> import bioseq # my library, has k-mer iterator and FASTA parsing

I'm not a developer, but isn't that a bit unfair? If one is going to rewrite all the libraries instead of just importing them from Python, one is trading convenience for speed.

That said, it's also mentioned that Nim is compatible with Python. Does that mean it can import Python libraries directly?
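
For readers unfamiliar with the two helpers named in the quoted import, here is a rough Python sketch of what a FASTA parser and a k-mer iterator typically do. This is not the author's `bioseq` library; the function names `parse_fasta` and `kmers` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines.

    Hypothetical helper: sequence lines that appear before the first
    '>' header are ignored.
    """
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping substring of length k (hypothetical helper)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]
```

Both helpers are plain generators, so they can be fed an open file handle and chained without loading the whole file into memory.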
elashri 8 months ago
> so it’s usually not worth spending a ton of time optimizing single-threaded performance for a single experiment when I can just perform a big MapReduce.

Is this the scientific version of "rich people problems"?

But I have a problem with talking about a real-life application and using that to claim

> it will be impossible for pure Python to beat pure Nim at raw performance

Because it may be true. I didn't try it, but there are many things that can be optimized in the Python code example. That aside, in a real-life scientific computing application I don't think anyone wouldn't use numpy for this, which would make things much better. Also, the power of Python in data analysis and scientific computing is the ecosystem and community. This will be very hard to beat. And there are more mature alternatives like Julia.

Edit: The author's code for reading the data creates a new file object for each iteration. I would guess that in Nim this would be a similar problem, but I am not sure how it actually works or if it has the same effect. But anyway, you don't do this in a real-life application with Python. Also, it would be nice to use a list comprehension to count the occurrences of 'C' and 'G' in each line.
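
A minimal sketch of the comprehension-based counting suggested in the edit above, assuming the task is computing the fraction of 'C' and 'G' bases in a FASTA file. The function name `gc_fraction` and the file path in the usage comment are hypothetical; this is not the article's code, and a generator expression stands in for the list comprehension to avoid building a temporary list.

```python
from typing import Iterable


def gc_fraction(lines: Iterable[str]) -> float:
    """Return the fraction of C/G bases across all sequence lines."""
    gc = 0
    total = 0
    for line in lines:
        if line.startswith(">"):  # FASTA header line, not sequence data
            continue
        seq = line.strip()
        # Count the 'C' and 'G' characters on this line.
        gc += sum(1 for base in seq if base in "CG")
        total += len(seq)
    return gc / total if total else 0.0


# Usage with a single file handle opened once (hypothetical path):
# with open("sequences.fasta") as handle:
#     print(gc_fraction(handle))
```

Accepting any iterable of lines keeps the function easy to test with an in-memory list and leaves file handling to the caller.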
mg 8 months ago
The optimizations the compiler can do because of the "var" the author added in the Nim version of the first example should also be possible without it, because Python defines variables as local or global at compile time.

The code looks like they avoided that by putting the code into the global scope?

If the code of the first example is what the author really ran, then he got a speed penalty for running the loop in the global scope.

One might consider it a bit of a quirk that Python code runs slower in the global scope. But in practice it rarely matters, as a script with just a loop and no functions (not even a main function) is so rare.
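
The scope difference described above can be observed in CPython's bytecode. The sketch below, which assumes CPython and uses a made-up loop rather than the article's code, compiles the same loop once as module-level code and once inside a function, then collects the opcode names each version uses. Module-level variables go through name lookups (`STORE_NAME`, `LOAD_NAME`), while function locals use the indexed "fast" slots (`STORE_FAST` and related opcodes).

```python
import dis

# The same loop, written as top-level (module scope) source code.
MODULE_SOURCE = """
total = 0
for i in range(1000):
    total += i
"""


def summed() -> int:
    """The same loop inside a function, where variables are locals."""
    total = 0
    for i in range(1000):
        total += i
    return total


def opnames(code) -> set:
    """Collect the names of all bytecode instructions in a code object."""
    return {ins.opname for ins in dis.get_instructions(code)}


module_ops = opnames(compile(MODULE_SOURCE, "<module-level>", "exec"))
function_ops = opnames(summed.__code__)

print(sorted(module_ops))
print(sorted(function_ops))
```

Exact opcode names vary between CPython versions, so the printed sets are best read for the NAME versus FAST distinction rather than compared literally.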
julianeon 8 months ago
I don't think the advantage here is so much that Nim is fast as that Python is slow. If you're willing to dump Python, you have many compiled language options, but I'll pick two: C and Rust.

For the kind of tasks the author outlines, I'd use AI. It excels at this: these are really simple, well-defined tasks it won't screw up.

So what I would do is pick a faster language (I'd pick Rust), then ask AI to script it, and repeat for as many tasks as you need.
fithisux 8 months ago
Does Nim support vectorization? Does Nim have x64 intrinsics?
nick__m 8 months ago
While unexplored in this article, the Nim type system is pretty nice too, particularly the subrange types, enums, and object variants.
mfld 8 months ago
This example would benefit greatly from JIT compilation, which is already possible in CPython with the @jit decorator of numba. I assume the speed advantage of Nim will eventually fade away, while you still have the benefit that every dev can understand Python.