As someone who almost exclusively uses Julia for their day-to-day work (and side projects), I think most of the author's thoughts about Julia are correct. I think the language is great, and using it makes my life better. There are some packages that are actually better than any of their equivalents in other languages, in my opinion.<p>On the other hand, I've also got a higher tolerance for things not being perfect, I can figure things out for myself (and luckily have the time to do so), and I'm willing to code it up if it doesn't already exist (to a point). Naturally, that is not true for most people, and that's fine.<p>The author isn't willing to take the risk that Julia won't "survive", which is fair. It's definitely not complete yet, but it's getting there. I am confident that it will survive (and thrive) though, and continue growing the not-insubstantial community. I have a feeling the author will find their way to Julia-land eventually, in a couple of years or so.
And there is nothing wrong with C++. For linear algebra I use the Armadillo library and it's really a nice wrapper around LAPACK and BLAS (and fast!). For some reason scientists are somewhat afraid of C++. For some reason you "have to" prototype in an "easier" language. Sure, unlike an interpreted language, you can't use C++ as a calculator, but I see people getting stuck with their computations in the prototyping language and never porting them to a faster platform.<p>Point being: C++ is not hard for scientific calculations.
I switched from mostly using R to Python about a year ago for gluing together my data pipeline (from data source all the way to production models and frontends/visualizations). It hasn't really impacted what I'm capable of doing or my productivity, except the standard extra googling that comes in the first couple years I use any language.<p>The main reason I went for Python is purely practical: it's a language people outside my team will respect and deal with. It makes it easier for me to collaborate in many different ways: share tools with other teams, transfer ownership of my code, get help when I need it, etc. Data science at some companies has the reputation of "hack something together and throw it over the wall for someone else to deal with". In my experience R only furthers this reputation. Which is too bad, it's really great at what it does.
Octave/Matlab are "great" but good luck trying to integrate them into a production web application. Since you can't really do that, avoid using them unless you are fine with implementing the same algorithm twice. Matlab licenses also cost money, and the toolboxes cost additional money.<p>R is useful because there are a lot of resources, as it has been around for so long and is used by a large portion of the stats community. It also has a lot of useful libraries that have not been ported over to other languages yet (ggmap!!!). But you still run into the same problem that you cannot integrate R into a production web application.<p>I am pretty sure Hadoop streaming does not support R, Octave, or Matlab either.
I just completed the Coursera data science track which took me from a complete R newbie to being at least somewhat proficient. Having previously used Python for quite a bit of web programming, I disliked R at first except for its power in statistical programming. But I've since discovered a number of great R packages that make it a pleasure to use for things I would normally turn to Python for. Like I recently discovered the rvest package for webscraping.<p>Data visualizations with R seem vastly superior, unless I am missing something with Python (highly likely). And putting up a slick statistics app is easy with shiny or RStudio Presenter. But R can't really scale to a large production app, isn't that right?<p>So I feel I need to keep working with both Python and R.<p>Added: That's a nice list Lofkin. Thanks. Also, in the article he says that Python syntax feels more natural, which I also felt. But then I started to use things like the magrittr and dplyr packages in R, which give you nice things like pipes, and that feeling starts to ebb.
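For anyone who hasn't seen the pipe style mentioned above, here is a rough Python analogue. This is only a sketch of the idea, not how magrittr or dplyr work internally; the `pipe` helper and the sample `rows` are made up for illustration.

```python
from functools import reduce


def pipe(value, *steps):
    """Pass value through each function in steps, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Hypothetical data: (name, score) pairs.
rows = [("a", 3), ("b", 10), ("c", 7)]

result = pipe(
    rows,
    lambda rs: [r for r in rs if r[1] > 5],      # keep rows with score > 5
    lambda rs: sorted(rs, key=lambda r: r[1]),   # order by score, ascending
    lambda rs: [name for name, _ in rs],         # keep only the name column
)
print(result)
```

Each step reads top to bottom in the order it runs, which is most of the appeal of pipes over nested function calls.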
>I think it [Perl] is still quite common in the bioinformatics field though!?<p>That's true - many day-to-day tasks in bioinformatics are more or less plain-text parsing [1], and Perl excels in parsing text and quickly using regular expressions. "My" generation of bioinformaticians doing data cleanup and analysis (20-30) uses Python, sometimes because plotting is nicer, the language is easier to get into, it's more commonly taught in universities, or other reasons - people older than that normally use Perl.<p>Both BioPython and BioPerl are extremely useful.<p>[1] Relevant quote from Robert Edgar: "Biology = strcomp()"
from <a href="https://robertedgar.wordpress.com/2010/05/04/an-unemployed-gentleman-scholar/" rel="nofollow">https://robertedgar.wordpress.com/2010/05/04/an-unemployed-g...</a>
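To make the "plain-text parsing" point concrete, here is a minimal Python sketch of the kind of task meant: reading FASTA-style records with a regular expression and computing GC content. The sample data and the function names (`parse_fasta`, `gc_content`) are invented for illustration; for real work BioPython or BioPerl would be the better choice.

```python
import re

# Hypothetical sample data; a real FASTA file would be read from disk.
FASTA_TEXT = """>seq1 Homo sapiens example
ACGTACGT
ACGT
>seq2 Mus musculus example
GGGCCC
"""

# A header line starts with '>' followed by an ID and an optional description.
HEADER_RE = re.compile(r"^>(\S+)\s*(.*)$")


def parse_fasta(text):
    """Return a dict mapping sequence ID to the concatenated sequence."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER_RE.match(line)
        if match:
            current_id = match.group(1)
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may wrap, so append to the current record.
            records[current_id] += line
    return records


def gc_content(sequence):
    """Fraction of G and C bases in a sequence (0.0 for an empty one)."""
    if not sequence:
        return 0.0
    return len(re.findall(r"[GC]", sequence.upper())) / len(sequence)


records = parse_fasta(FASTA_TEXT)
for seq_id, seq in records.items():
    print(seq_id, len(seq), round(gc_content(seq), 2))
```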
Andrew Ng said in the Coursera Machine learning class that according to his experience, students implement the course homework faster in Octave/Matlab than in Python.<p>But yes, the point of that course is to implement and play around with small numerical algorithms, whereas the linked blog is about someone who mainly calls existing machine learning libraries from Python.<p>Ref. <a href="https://news.ycombinator.com/item?id=4485877" rel="nofollow">https://news.ycombinator.com/item?id=4485877</a>
Quite interesting post. I feel that a lot of the numerical Pythonistas are in the same spot:<p>They tolerate most languages, but find R's syntax a bit unnatural, Matlab lacking when trying to go beyond pure matrix stuff, and are waiting to see if Julia picks up (which it seems to be doing, from what I can tell).
From the perspective of a student, most of the good online analytics/data analysis/stats courses use R, so it is hard to get away from it while learning the material. Once you get the base concepts down, switching to Python shouldn't be hard. I think most people still prefer ggplot2 for visualization though. Whenever I use R I feel like a statistician; I can feel that 'cold rigor' emanating from the language. But in the end I think it is advantageous to wield both languages.<p>Also I really see Jupyter as a new standard for communication. Your narrative and supporting code all in one place, ready for sharing.
Personally I'm tempted to make the switch to Julia, but slow higher-order functions, high churn in the core data infrastructure and no PyMC3 are keeping me on pydata for a bit longer. I have numba to hold me over.
One thing missing here: Matlab syntax is actually very close to modern Fortran. At least twice I've written Fortran code (for Monte Carlo simulations; different contexts) by overwriting Matlab code adding types / general verbosity / fixing the syntax of do-loops / etc.
I love the hacking approach in the post: a tool is only a tool to do something valuable and not the goal itself. The Python ecosystem is the right tool at the right time, nowadays, because of the data science explosion and the need to interact very quickly with non-specialists.