What's Next for R?

189 points by carlosgg, over 5 years ago

12 comments

meztez, over 5 years ago

I would highly recommend the data.table package over tibble or the base data.frame if you are doing any kind of modeling in R with larger datasets. Yes, R has many data structures, but knowing how to use data.table will blow your mind in terms of efficiency. Matt and the other contributors have built something extremely fast and flexible.

I get that R is not for everyone, but used correctly it is a beast.

Now this is anecdotal, but in the insurance industry we have what we call on-level premium calculators. Basically, it is a program that rerates all policies with the current set of rates.

Our current R program can rate 41,000 policies a second, fully vectorized, on a user laptop with an i5 from 2015.

In contrast, the previous SAS program could do 231 policies a minute on a 64-core Xeon processor from 2017.

For our workload and type of work, R has been a godsend.

Bonus: we can put what our data scientists develop in R directly into production (after peer review, testing, etc., no different from any other production code).

Back when I started in 2005, we modeled in proprietary software like Emblem, used Excel to build a first-draft premium calculator, rebuilt the computation in SAS for the on-level program, and sent specs to IT to rebuild the program again for production. All three had to produce the same results.

I've tried Python, Go, Rust, Julia. I'd say Python could be a good alternative, but the speed of data.table, the RStudio IDE, and the ease of package management in R make R an obvious choice for us. I believe Julia to be the future, but so far the adoption rate in house has been low.
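For readers unfamiliar with what "fully vectorized" rating means, here is a minimal sketch in Python with NumPy. It is not the commenter's program: the rate table, the two rating variables, and the function names are all hypothetical, chosen only to contrast a per-policy loop with a single array lookup.

```python
import numpy as np

# Hypothetical rate tables (assumptions for illustration only).
base_rate = np.array([500.0, 650.0, 800.0])   # base premium by territory code
age_factor = np.array([1.5, 1.0, 0.75])       # multiplier by age band

# Hypothetical policies, one entry per policy.
territory = np.array([0, 2, 1, 0])
age_band = np.array([1, 0, 2, 2])

def rerate_loop(territory, age_band):
    """Rate one policy at a time (the slow, row-by-row style)."""
    premiums = []
    for t, a in zip(territory, age_band):
        premiums.append(float(base_rate[t] * age_factor[a]))
    return premiums

def rerate_vectorized(territory, age_band):
    """Rate every policy at once with array lookups and one multiply."""
    return base_rate[territory] * age_factor[age_band]

print(rerate_vectorized(territory, age_band))
```

Both functions produce the same premiums; the vectorized form pushes the per-policy work into compiled array operations, which is the general idea behind the kind of speedup described above.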

RA_Fisher, over 5 years ago

I'm so thankful for R, its community and their great libraries! I've built an eight-year (so far) career in data science using R to model data and perform experiments. I love R's functional programming style and dplyr, which make manipulating data a delight. ggplot2 is such a great plotting library, well worth the investment to learn. Then there are all the stats tools, from glm and MASS through to brms for advanced Bayesian analysis (https://github.com/paul-buerkner/brms#brms). With R and Python, it's a great time to be a statistician-programmer!

I recommend folks looking to start with R check out: https://r4ds.had.co.nz/

latte, over 5 years ago

I cannot comment from personal experience, as I have almost zero knowledge of R, compared to several years of using Python for writing apps and working with data. I like R's focus on functional programming, though.

However, a couple of years ago, my wife tried to transition from business consulting to a data analytics / data science role. She started by taking an R course. She was put off by R's complexity and the course's early focus on the details of R syntax, function definitions, closures, etc., and abandoned it.

The year after, she decided to try again and enrolled in a course that used Python (with numpy + pandas + scipy as the data science stack). She reported it to be much simpler, more intuitive and easier to learn than her previous experience with R. She has now successfully completed the program and is employed as a data analyst.

glofish, over 5 years ago

What's Next for R?

Doing the exact same thing we did before!

We have a new library called "dtplyr" (no, seriously!). It is designed to save users from the arcane and obtuse sides of R by combining the power of "dplyr" and "data.table", the two libraries that were designed to save users from the arcane and obtuse sides of packages such as "data.frame" and ....

I wish I were kidding. There is the absurd contention in the R world that by introducing yet another weirdly named package, people can avoid having to learn and suffer through the "real" R.

sammm, over 5 years ago

I started at a company using Shiny for their applications and R as part of their data pipelines.

A huge pain point for us is the packaging system. It is absolutely awful. Packages constantly get overridden, so we have to install packages in a specific order. Whenever I have reached out to the community (including prominent members who have written R books), I have always been told to just use the latest version of all packages and get on with it, which, as anybody knows, isn't always possible, especially as there are constantly breaking API changes.

I understand R's history and that, in general, it is a lot better than it used to be, but I would only recommend that R be used for notebook-style work and kept well away from production.

We have migrated to Python, which isn't perfect, but the difference in logging and packaging has been night and day.

bransonf, over 5 years ago

Disappointed in the lack of discussion of R-Shiny or Plumber.

R-Shiny is a full-stack platform for web apps, and it's how I leveraged my data science background to get into web development. It's incredibly powerful in my opinion, with the only obvious limitation being the speed of R itself.

And Plumber. It's become the de facto method for deploying R code in a REST API. It too is still maturing, but I see it eventually becoming the Flask of R.

Truth be told, however, after developing quite a few projects on the Shiny/Plumber stack, I wouldn't recommend anyone do it.

If for some reason you can only have an R interpreter, go for it. But learning multiple languages really is the best solution if you want to manage efficient applications. I say this, however, realizing that none of my colleagues writing R have engineering backgrounds.

I can't help but feel like R is like JavaScript in many ways. Ease of use and the ease of publishing packages very quickly clutter the repository.

R will always have a special place in my heart; after all, it's the language that made me discover programming. However, I can't help but feel that my thirst for efficiency is making me outgrow it quickly as a language.

thegginthesky, over 5 years ago

When I used R at university (I majored in Applied Mathematics and Statistics), I was always awestruck at how every sort of novel modeling technique, from GLM to Beta Regressions to GARCH, is easily accessible for free, with proper academic papers and documentation, and with cohesive standard support.

It was really useful to be able to apply most of the theory I was learning to actual research datasets. This is what I miss the most since moving to Python.

What I don't miss is R's terrible packaging system and how it made collaborating with colleagues nearly impossible. I can't count the number of times I had to debug dependencies in others' scripts just to be able to move forward with some team project.

luhego, over 5 years ago

I used R when I took an online course on data analysis. I didn't like it at all. Its syntax is weird and painful to read. The only nice things about R are the Tidyverse and ggplot. I found Python to be a better alternative. You can use Pandas for data analysis and EDA, Matplotlib and Seaborn for plotting, and Scikit-learn for training your models. An additional benefit is that Python is a general-purpose language that you can use to build a complete application.

xvilka, over 5 years ago

What is currently missing: better LSP (Language Server Protocol) support[1] (it supports only some of the LSP features), a better linter[2] and static analysis, better integration with GitHub[3], and so on. More on the tooling side, I believe.

[1] https://cran.r-project.org/web/packages/languageserver/readme/README.html

[2] https://github.com/jimhester/lintr

[3] https://github.com/github/semantic/issues/382

tzabal, over 5 years ago

I also got excited when I found out about R Markdown and how well it is integrated with RStudio. I believe that it is a decent alternative to Jupyter Notebook.

roel_v, over 5 years ago

I hope a hospice. Ugh, that language has damaged me worse than Perl.

pickdenis, over 5 years ago

I know this is a dead horse, but I think R seriously shot itself in the foot with its data structures[1]. I don't really see a solution for this, as fixing it would never be backward compatible. I'll always pick Python over R because the data structures actually make sense to me as a programmer (objects that look like lists, dicts, matrices, etc., or any combination of the above, and they all behave in very predictable ways). I think this puts off a lot of other people like me.

[1]: https://jamesmccaffrey.wordpress.com/2016/05/02/r-language-vectors-vs-arrays-vs-lists-vs-matrices-vs-data-frames/
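As a small illustration of the "lists, dicts, or any combination" point, here is a sketch using only Python built-ins. The record and its field names are hypothetical and not taken from the comment; the example only shows nested containers being indexed and sliced.

```python
# Hypothetical record combining a dict, a list, and a nested dict.
policy = {
    "id": 101,
    "coverages": ["fire", "theft"],
    "limits": {"fire": 250_000, "theft": 10_000},
}

# Indexing a list returns the element; slicing a list returns a list.
first_coverage = policy["coverages"][0]     # the string "fire"
one_item_slice = policy["coverages"][:1]    # the list ["fire"]

def total_limit(record):
    """Sum the limits for the coverages listed on the record."""
    return sum(record["limits"][name] for name in record["coverages"])

print(first_coverage, one_item_slice, total_limit(policy))
```

The same indexing rules apply at every level of nesting, which is the predictability the comment is describing.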