The counterintuitive rise of Python in scientific computing (2020)

220 points by leonry, about 3 years ago

33 comments

photochemsyn, about 3 years ago
Python displaced a lot of very expensive proprietary software in the biosciences arena. Ease of use was also a major factor, as many bioscientists have relatively little background in programming, but the ability to escape the world of expensive, restrictive software licenses was very attractive to the scientific community, whose historical norms emphasize the open sharing of methods and results:

> "A program that performs a useful task can (and, arguably, should) be distributed to other scientists, who can then integrate it with their own code. Free software licenses facilitate this type of collaboration, and explicitly encourage individuals to enhance and share their programs. This flexibility and ease of collaborating allows scientists to develop software relatively quickly, so they can spend more time integrating and mining, rather than simply processing, their data."

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004867

Now there isn't any area of molecular biology and biochemistry that doesn't have a host of Python libraries available to assist researchers with tasks ranging from designing PCR strategies or searching for nearest matches up to x-ray crystallography of proteins.
derbOac, about 3 years ago
C, C++, and Fortran are still used; most Python users just don't see them because they're hidden away underneath the calling function.

I've been surprised by the rise of Python in some ways although not at all in others. Languages like C, C++, Fortran, and dare I say it Rust are too low-level in their raw state for numerical computing. You had the US federal government funding language competitions because of this (see: Chapel). Languages like Python and R (and before that things like Lisp) came along and gave people a taste of something different, and it's obvious what people migrated to.

Part of it is timing: multivariate computational statistics (ML/data science/DL/whatever you want to call it) just sort of started taking off in computer science communities before LLVM-based languages like Julia or Nim could get a foothold. OCaml might have fit that niche but never got there because of a desire to take a different path, or take the path more slowly.

So people looked for a nice expressive language, found it in Python, and buried all the messy stuff behind wrapper functions and called it a day. It was helped along by Matlab being the comparison on the other side -- Python looks kludgy compared to modern Fortran or C, but not compared to Matlab.

All that wrapper time in Python has its costs, so I suspect as limits get pushed further we'll eventually see a migration to something else like Julia or Nim, or something else not on anyone's radar.

One moral to this story is that expressiveness matters. People will go out of their way to avoid talking directly to machines at a low level.
scythe, about 3 years ago
Speaking as a physicist who spent eight years in academia: Python did not win by beating Fortran. Nor did it beat C++. It didn't really compete with Ruby or Lisp, although Lua (Torch) was briefly a serious competitor before everyone realized that a language developed by four people, one of whom doesn't get along with the others, couldn't be responsive to users' needs.

Python defeated Matlab. I know because I cheered it on. I was there. I watched my roommates and friends struggle with introductory scientific computing in Matlab and I joined the chorus that was practically begging for Python, even though I didn't really like it. I can't even begin to explain how awful it is to try to teach programming concepts in Matlab. But something like Python or Matlab had to be the choice because the schools wanted to teach programming through a language where you could just call "graph" and the computer would display a graph.

Python's team, unlike Lua's, aggressively courted educational institutions by offering scientific, numerical and graphical libraries within a programming language that works like a programming language, not a glorified computer algebra system. They even added a dedicated operator for matrix multiplication. It's a great example of finding a niche and filling it: I still don't like *using* Python, but I can't dispute that no other language/ecosystem comes close to offering what we need to teach programming to physics students.

You want to beat Python? Build a type system that can capture dimensional analysis. Warning: it won't be easy.
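
The dedicated matrix-multiplication operator mentioned above is `@`, added in Python 3.5 (PEP 465). A minimal sketch of how a class opts into it follows. The `Matrix` class is a hypothetical stand-in written for this illustration, not NumPy's implementation, and it assumes rectangular, non-empty lists of numbers.

```python
class Matrix:
    """Tiny dense matrix, for illustration only (hypothetical class)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # The interpreter calls this method for `self @ other`.
        if len(self.rows[0]) != len(other.rows):
            raise ValueError("inner dimensions do not match")
        cols = list(zip(*other.rows))  # columns of the right-hand matrix
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols]
             for row in self.rows]
        )

    def __repr__(self):
        return f"Matrix({self.rows})"


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[0, 1], [1, 0]])  # swaps the two columns of `a`
product = a @ b
print(product)  # Matrix([[2, 1], [4, 3]])
```

Array libraries implement the same hook, which is why `a @ b` reads the same whether the operands are toy objects like these or library arrays.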
elil17, about 3 years ago
It’s amazing how often the author’s point of “agility” arises in real-world circumstances. I’m not a programmer, but I use Python a lot in my engineering job. There have been 3 times in the past month where I got an order-of-magnitude speedup because SciPy implements a very complex but highly efficient algorithm which I would never have had time to deploy.
iainctduncan, about 3 years ago
If you know actual scientists, this isn't counterintuitive at all. My partner is a scientist, so now I know tons of them, and I have done a bunch of Python coding and support for scientists, and have been a Python programmer (as well as other languages) since 2005-ish. I saw this coming (as did many) 15 years ago.

Most scientists, and their grad students, are trying to do a whole bunch of things in their research, and programming is just one of them. Field work, experiments, data wrangling, writing papers, defending papers, teaching, etc. And most of them do not have access to budgets for programmers, or when they do, it's for a limited amount of time and work, meaning they need to be able to pick up and run with whatever the programmer did. So the fact that with Python they and their grad students (who might be there for only 2 years) can be working productively, and figure out what the hell the code did when they come back to it months later, is HUGE. As in, literally blows every other consideration to smithereens. This has meant that over the last 20 years the scientific libraries in Python matured faster than in any other language, and this in turn has had a snowball effect. And when speed is necessary, C++ extensions can be written. But honestly, most of the time speed is not the main factor.

The downside of Python in my experience is that junior teams can make heinous atrocities when a project gets really big (I have had to step in as CTO to one of those messes, so much as I love Python, I must admit this is true!) But the stuff the scientists are doing is very rarely that big. It's tools programming, scripting, making utilities, data analysis and so on.

Readability counts. In some fields, it counts more than anything. I've worked in about 10 languages now over the last 20 years, and Python is still the easiest to read when you come back to some old code or have to pick up code for a small job, or hand it to a beginner to extend without having them create an unreadable mess. This is what scientists need to do all the time.

Re other people's comments on Python packaging and setup being hard, well honestly I've had just as much pain with Ruby or Node. The shining exception there is R, which is giving Python a run for its money in many scientific areas. R Studio has the best "hit the ground running" experience out there and is really slick for data programming.
jrochkind1, about 3 years ago
As a rubyist, it makes me sad that Python ended up here rather than Ruby. And I sometimes wonder why.

> As the name suggests, numeric data is manipulated through this package, not in plain Python, and behind the scenes all the heavy lifting is done by C/C++ or Fortran compiled routines.

So I wonder, was it easier to write C/C++ or Fortran compiled extensions in Python than it was in Ruby?
socialdemocrat, about 3 years ago
The performance of Python is a real problem, but Python has succeeded because scientific computing really needs interactive and dynamic programming languages. You need something which lets you easily experiment with data, plot, and change code in rapid iterations without constant recompilation and reloading of data.

This has been recognized for some time. The compromise had been to build performance-sensitive parts in C/C++ and do the experimental/iteration part in Python.

But today you don’t really have to compromise anymore. We got Julia. It solves the whole problem. You get the interactivity you need combined with the performance.

Of course, in this industry momentum matters. Python has built up the momentum of an oil tanker. Even if you shut off the engines it is going to keep going for many years.

But Julia is the obvious end station. It does all the things HPC and scientific computing need. But building mindshare, documentation, community, polished tools etc. will of course take time.
sega_sai, about 3 years ago
As many have already noticed, the rise of Python is not counterintuitive at all. (I'm a scientist myself.)

Basically, modern Python offers you a spectrum from easy-to-understand and quick-to-write Python programs (those will be slow), to pure glue code that connects a lot of high-performance C/C++/Fortran code. And many scientists will start from pure Python code with the help of numpy. In many cases it will be good enough. But if needed you can always interface with other libraries, or write high-performance C/C++/Fortran code yourself for the most performance-critical bit, and use Python to glue it together. That flexibility, where you can trade speed of writing the code against speed of execution, is very valuable.
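
The two ends of that spectrum can be sketched with the standard library alone. This is an illustration under assumptions, not a benchmark: `dot_loop` and `dot_builtin` are hypothetical names, and the built-ins `sum` and `map` stand in for the compiled routines that a library such as NumPy would provide.

```python
import operator
import timeit

# Sample data; the products are whole numbers, so both versions agree exactly.
xs = [float(i) for i in range(100_000)]
ys = [float(i % 7) for i in range(100_000)]


def dot_loop(a, b):
    # Pure Python: every multiply and add goes through the bytecode interpreter.
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_builtin(a, b):
    # Glue style: Python only wires the pieces together, while the iteration
    # itself runs inside the C implementations of map(), operator.mul and sum().
    return sum(map(operator.mul, a, b))


assert dot_loop(xs, ys) == dot_builtin(xs, ys)

# Timings depend on the machine and interpreter, so none are quoted here.
print("loop   :", timeit.timeit(lambda: dot_loop(xs, ys), number=5))
print("builtin:", timeit.timeit(lambda: dot_builtin(xs, ys), number=5))
```

Both functions compute the same dot product; the difference is only in how much of the work the interpreter does itself, which is the trade-off described above.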
gaze, about 3 years ago
This article has been written a hundred times. "We abandoned a fast language for a language that is slow but can use fast libraries, and so the result is fast. It's faster because the programmer discovered existing libraries that do a better job of what they were doing already."

There are so many convolved factors here I don't even know where to begin, so I guess I'll just say that I'm glad Julia exists. The author glosses over many decades of programming language and compiler research -- which makes sense, because this is not their specialty. However, what I see is the field of scientific computing migrating from a dinosaur language (Fortran isn't, actually. It just is used this way) and dinosaur practices of writing everything oneself, to one of the slowest interpreted languages that happens to be the most difficult to JIT, and saying this or that about how interpreted languages are slow but library calls are fast to justify this. At the same time they're learning to build a functioning library ecosystem.

Basically, grad students are learning proper programming practices and collaboration after switching to a more expressive language; they just managed to pick the slowest and most difficult to optimize one. Maybe they just managed to wipe some of the slate clean by switching away from Fortran and its culture (the culture being the bad part), and the culture of Python filled the space, creating a net positive but somewhat unfortunate situation.

Just one more time -- the idea that you can call a "faster" language to do the heavy lifting is true of every language and does not justify the choice of Python in particular. The justification for Python is the momentum, and this is in my opinion the only one.
whatever1, about 3 years ago
Julia is the next big thing. I am always blown away by its readability and speed.

But it will take years to build a library ecosystem that can rival the Python one.
dekhn, about 3 years ago
Counterintuitive? I picked it because it was the closest scripting language to C (see the select and socket APIs for good examples). And it had numeric array support early on (making it an attractive replacement for Matlab).
uoaei, about 3 years ago
Python is an API to efficient scientific computing code. It's good for that, assuming you're using old and more verbose languages.

Look into Julia as a promising alternative -- the language itself is superbly fast (aside from initial compilation) and there's an impressive scicomp ecosystem to say the least, all written in native Julia. This allows for program rewriting / metaprogramming more broadly and is insanely powerful once you get a feel for it.
fancyfredbot, about 3 years ago
I feel like Python acts like a kind of bus in scientific computing, connecting various high-performance libraries and DSLs together.

That said, this article's story of someone using the wrong algorithm is a bad example in my view. Python hasn't succeeded because people are more likely to use more efficient algorithms due to easier experimentation; it has succeeded because of the size of the ecosystem and the fact that such algorithms are easily available.
jackjackk0, about 3 years ago
I recommend one of the recent videos by Dave Beazley [1]. He lived through and contributed to the rise of Python in scientific computing first-hand in the 90s, and offers some interesting insights. Plus he's always quite an entertainer.

[1] https://youtu.be/4RSht_aV7AU
bernulli, about 3 years ago
For those unfamiliar, CERFACS (Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique, i.e. European center for research and advanced training in scientific computing) is a leading research institution, with two main branches: meteorology, and engineering computational fluid dynamics. I am not affiliated and can only evaluate the engineering part; their combustion modeling group is one of the best in the world.
Sugimot0, about 3 years ago
I feel like Nim [0] should replace Python as it matures, but I would be curious to hear others' perspectives, since mine is mainly based on reading.

[0]: https://nim-lang.org/
sytelus, about 3 years ago
A lot of thanks should go to Oracle. Back in the day, Java was the go-to language for everything. After Oracle acquired it in 2009, the only respectable languages with good numerical libraries were Python, Julia and R. Unfortunately, Julia’s marketing wasn’t strong enough and R was decidedly an ugly thing to work with.
wheelerof4te, about 3 years ago
One thing that I don't like about Python's scientific libraries is how they change the overall Python syntax.

There are so many ways to slice an array or a dataframe, and only a few of them are valid Python code.

Keeping the language API should have been a priority, but that is a consequence of operator overloading features.
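
As a point of reference for the slicing discussion, a small probe can show what the interpreter hands to an object for different bracket expressions. `ShowIndex` is a hypothetical class written for this illustration. It suggests that the expressions themselves parse as ordinary Python, and that libraries change what they mean through `__getitem__`.

```python
class ShowIndex:
    """Hypothetical helper: returns whatever the interpreter passes to []."""

    def __getitem__(self, key):
        return key


probe = ShowIndex()

print(probe[1:3])          # slice(1, 3, None)
print(probe[1:3, ::2])     # (slice(1, 3, None), slice(None, None, 2))
print(probe[..., 0])       # (Ellipsis, 0)
print(probe[[0, 2], "a"])  # ([0, 2], 'a')
```

A plain `list` rejects most of these keys with a `TypeError`, while array and dataframe libraries interpret them, which may be the source of the feeling that the syntax itself has changed.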
efxhoy, about 3 years ago
I wrote scientific Python for several years at a university research project, coming from a statistics background. I wrote a forecasting tool and related plotting, simulation, ML, evaluation etc. tools.

The reason for Python’s success is obviously the ecosystem. Numpy is the foundation. On top we have sklearn, statsmodels, pandas, matplotlib. Before our project most work in the department was done in Stata, a proprietary language/tool that works well for some classical regression and stats work but falls apart as soon as things get complicated. Moving to Python allowed us, a group of social scientists, to work on some really hard problems.

Now we have boosted tree models and other tools that just can’t be used in the old tools like Stata.

Python and R run the show in social science.
guidorice, about 3 years ago
I am really curious how Zig eventually does in scientific computing. Its already speedy compiler, language server (zls), and upcoming hot code reloading feature make me think that reactive coding and visualization notebooks in Zig should be feasible. However, Zig has no operator overloading and no dynamic dispatch, making it fundamentally pretty different from, say, Julia. Just as an aside: for my day job, I write Python in scientific computing (geospatial and ML).
teleforce, about 3 years ago
I'm genuinely surprised that no one here is mentioning the D language in addition to Nim or Julia for replacing Python. More than 5 years ago, D already beat Fortran, the legendary scientific programming language mentioned in the article, in speed [1]. The Fortran-based libraries that D overtook are apparently still being used by Python, Nim and Julia for most of their high-speed processing today. As they always say, the proof is in the pudding, and compared to all the alternatives, D is designed to have a similar feel to Python. By default it supports GC for easier and more manageable scientific programming, which is very attractive for the type A data scientists who mainly deal with analysis and exploratory programming [2]. The latest D now also natively supports the C language (the lingua franca of scientific programming) in its compiler, and thus can import and compile C files directly [3].

[1] Numeric age for D: Mir GLAS is faster than OpenBLAS and Eigen:
http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/glas-gemm-benchmark.html

[2] There are two types of data scientists - and two types of problems to solve:
https://medium.com/@jamesdensmore/there-are-two-types-of-data-scientists-and-two-types-of-problems-to-solve-a149a0148e64

[3] Adding ANSI C11 C compiler to D so it can import and compile C files directly:
https://news.ycombinator.com/item?id=27102584
smitty1e, about 3 years ago
> Of course, If the best algorithm is known beforehand or the manpower is not a problem, a lower level-language is probably faster, but this is seldom the case in real life.

One is wary of one-dimensional analysis of anything in a software context.

Who cares if the Fortran library runs like the blue blazes, if it cannot be readily maintained?
oh_my_goodness, about 3 years ago
This article expresses the ancient Python(/Matlab) v Fortran argument beautifully ... but it's kind of shocking that the argument is still going on at all. My generation came out of school happy to use FORTRAN indirectly, via a scripting language, for rapid prototyping. That was 30 years ago.
keskival, about 3 years ago
I don't think Python displaced Fortran in HPC as much as it displaced Matlab (and Octave) and R in scientific computing.

Displacing Fortran was a side effect of that trend, as now it wasn't about productionizing Matlab code into Fortran; Python could do general-purpose computing adequately as well.
hulitu, about 3 years ago
I try to love micropython. However, its UI is at ed level. It only says "Syntax error".
musicale, about 3 years ago
I gave up Matlab and never looked back.

As the article notes, various numerical kernels have been wrapped as Python compiled modules/libraries, and numpy and other systems seem to work OK for many applications.
jurschreuder, about 3 years ago
People always give the argument that Python calls C++ libraries, but I use both Python and C++ a lot, and writing C++ directly, calling the same libraries, is way faster.
nanochad, about 3 years ago
Python is what has been popular for the last 15 years. Scientists are not programming language geeks; they just use whatever is popular, viable, and established.
amelius, about 3 years ago
New languages should always provide bindings to call into Python modules, so you get the immediate benefit of the largest ecosystem on the planet.
StreamBright, about 3 years ago
Python is the common scripting language of C, C++, Fortran.
jrm4, about 3 years ago
Yet another hardcore programmy type discovers that usability is infinitely more important than how many clock cycles you save.

Programming languages aren't for computers, they're for people.
d--b, about 3 years ago
Yeah, it’s counterintuitive, and it’s because it does not make much sense.

Slowness is one thing, but the tooling is also clearly subpar compared to languages of the same popularity, the dynamic typing makes things difficult to maintain, the 2.7 vs 3 shit show, etc. etc.

The very fact that many smart people have been saying for years that Python is a fairly bad tool for data analysis should at least raise some people’s eyebrows. But no, the entire field of data science has decided that it knows better…

Good for them.
blunte, about 3 years ago
Python won because people who knew math/science domains only knew Python (or it was the best they knew). And so they made libraries for Python. And it propagated like many other bad ideas based on ignorance.

Python is a miserably bad language for modern times. If you know any of half a dozen other languages, then you understand.

There was a good essay, from Paul Graham?, about the ladder of awareness of programming languages. Unfortunately I can't find it now.

The point is, Python has won and is frankly terrible. It has inconsistent features and an awkward OOP approach (in a time when OOP is finally being recognized as bad itself), and it seriously lacks basic language features, which are only appearing as of 3.9 and 3.10.

Frameworks like Django and Django Rest Framework expand on these bad ideas, creating monstrosities which make the PHP code of yore look arguably decent.

Sadly, I don't think there's any way to kill this. The only option is to vastly outperform the Python people and produce reliable, readable, performant solutions in half the time and beat them to market. Perhaps someday they will die off.