科技回声

5 comments

Wilduck, over 13 years ago

I'm fairly familiar with Python, somewhat familiar with R, and fairly familiar with Stata. The scenarios where you're doing OLS analysis can generally be broken into two categories:

1) Exploration of a given data set.
2) Automatic collection and analysis of data.

In most academic uses of data, (1) is the case. Here, if the data is already rectangular and in a nice format, I would actually prefer Stata. However, as soon as any sort of manipulation is required, or I need to graph something more difficult than a scatterplot, I shift to R.

R really shines when you need to do complicated analysis on a fixed data set. I've found that Hadley Wickham's `reshape` and `ggplot` [1] packages are invaluable. They easily produce graphics that are more informative and better looking than any other graphics package I've seen. Additionally, R has packages for essentially any statistical analysis that you could want to do.

While R is able to pull data from a database or other places, as soon as you have more dynamic data, you enter into case (2). This is when it might make sense to start using Python. But even then I've found Python is mostly useful for curating the data so that it can be used by R.

[1] http://had.co.nz/ggplot/
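To make the "curating the data so that it can be used by R" step concrete, here is a minimal sketch using only the Python standard library. The record layout, the dotted field names, and the helper names `flatten_records` and `to_csv` are all hypothetical and chosen for illustration; the one fact relied on is that R's `read.csv` treats the string `NA` as missing by default.

```python
import csv
import io


def flatten_records(records, fields):
    """Turn nested dicts into flat rows; missing values become 'NA'.

    `fields` uses dotted paths (a hypothetical convention), e.g. "sensor.temp".
    """
    rows = []
    for rec in records:
        row = {}
        for field in fields:
            value = rec
            for key in field.split("."):
                # Walk down the nesting; give up with None if a level is missing.
                value = value.get(key) if isinstance(value, dict) else None
            row[field] = "NA" if value is None else value
        rows.append(row)
    return rows


def to_csv(rows, fields):
    """Serialise flat rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up example records, not real data.
    records = [
        {"id": 1, "sensor": {"temp": 20.5}},
        {"id": 2, "sensor": {}},
    ]
    fields = ["id", "sensor.temp"]
    print(to_csv(flatten_records(records, fields), fields))
```

The resulting text is rectangular, so it could be written to a file and loaded on the R side with `read.csv`.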


dlan1000, over 13 years ago

The advantages and disadvantages the author cites seem more pertinent to his own idiosyncratic preferences than to more general features one might look for when doing interactive data analysis. They also seem easy to address. For example, an hour spent building a few quick functions would address most of his complaints about Python. I've personally used Python, Matlab, R and Stata in my research and view the first three as about equally capable. In my opinion Stata is less comparable to the others, as it is more a WYSIWYG collection of tools and functions. Matlab has good support for large data sets via memory mapping, has mex extensibility for building your own fast functions, and is very good for interactive plotting, but doesn't produce publication-quality finals. Python is great for getting from idea to functioning execution quickly, with no niggling, and can push data into Matlab via mlabraw. R has well-developed stats packages and a huge user base. I disagree with the author regarding documentation for R: maybe he is right for the core, but depending on the package you may have trouble finding documentation beyond a man page. Ggplot is excellent but eccentric.

sudont, over 13 years ago

I’m more interested in Python for its real-time capabilities. With Python running statistics, I can do more user-facing things with the data, whereas R can do more statistical things with the data. Plus, an Arduino-based sensor array interfacing with R sounds shaky.

And since Python does web easily: http://rapache.net/

jasondavies, over 13 years ago

LOESS is fairly simple to do in Python, or you can find an implementation via Google, e.g. http://www.koders.com/python/fid5A91A606E15507B6823DEC7A059488A6624C4832.aspx?s=sort

I'd be curious to see an updated comparison with LOESS added to the Python code!
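As a rough illustration of how little code a basic version needs, here is a minimal pure-Python sketch of locally weighted linear regression with tricube weights. It is not the implementation linked above, and it leaves out the robustness iterations that fuller LOESS implementations offer; the function name `loess` and the `frac` parameter are assumptions made for this example.

```python
def loess(xs, ys, frac=0.5):
    """Smooth ys over xs with locally weighted linear fits (tricube weights).

    frac is the fraction of points used for each local fit (an assumed
    parameter name for this sketch). Returns one fitted value per input x.
    """
    n = len(xs)
    k = max(2, min(n, int(round(frac * n))))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights; points at or beyond the bandwidth get weight 0.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)  # always > 0 because x0 itself has weight 1
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        # Weighted least-squares line through the neighbourhood.
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


if __name__ == "__main__":
    xs = list(range(10))
    ys = [2 * x + 1 for x in xs]
    print(loess(xs, ys))
```

Because each local fit is a straight line, exactly linear input comes back unchanged (up to floating-point error), which makes for a quick sanity check.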


guffwhitehill, over 13 years ago

The non-parametric stats functions in R are better than Py

R vs Python for simple interactive data analysis