Statistics with Julia [pdf]

470 points by aapeli almost 6 years ago

16 comments

superdimwit almost 6 years ago

I'd really recommend anyone doing mildly numerical / data-ey work in Python to give Julia a patient and fair try.

I think the language is really solidly designed, and gives you ridiculously more power AND productivity than Python for a whole range of workloads. There are of course issues, but even in the short time I've been following & using the language these are being rapidly addressed. In particular: a generally less rich system of libraries (but some Julia libraries are state of the art across all languages, mainly due to easy metaprogramming and multiple dispatch) + generally slow compile times (but this is improving rapidly with caching etc). I would also note that you often don't really need as many "libraries" as you do in Python or R, since you can typically just write down the code you want to write, rather than being forced to find a library that wraps a C/C++ implementation like in Python/R.


jointpdf almost 6 years ago

This looks like a good reference for the fundamentals of both statistics and Julia, as claimed. I have a small critique, since the authors asked for suggestions.

The format for the code samples goes like (code chunk -> output/plots -> bullet points explaining the code line-by-line). This creates a bit of a readability issue. The reader will likely follow a pattern like: (Skim past the code chunk to the explanation -> Read first bullet, referencing line X -> Go back to code to find line X, keeping the explanation in mental memory -> Read second bullet point -> ...). In other words, too much switching/scrolling between sections that can be pages apart. Look at the example on pages 185-187 to see what I mean.

I’m not sure what the optimal solution is. Adding comments in the code chunks themselves adds clutter and is probably worse (not to mention creates formatting nightmares). I think my favorite format is two columns, with the code on the left side and the explanations on the right.

Here’s what I have in mind (doesn’t work on mobile): https://allennlp.org/tutorials. Does anyone know of a solution for formatting something like this?


xvilka almost 6 years ago

Note that Julia 1.2[1] is on the verge[2] of being released. Also, it is interesting to see the list[3] of GSoC and JSoC (Julia's own Summer of Code) projects. A lot of them target ML/AI applications. Personally, I am waiting for proper GNN support[4] in FluxML, but there seems to be little interest in it.

[1] https://github.com/JuliaLang/julia/milestone/30
[2] https://discourse.julialang.org/t/julia-v1-2-0-rc2-is-now-available/26170
[3] https://julialang.org/blog/2019/05/jsoc19
[4] https://github.com/FluxML/Flux.jl/issues/625

caiocaiocaio almost 6 years ago

Julia looked interesting to me, so I tried 1.0 after it came out. I have an oldish laptop (fine for my needs), and every time I tried to do seemingly anything, it spent ~5 minutes recompiling libraries or something. So I've been waiting for newer versions that hopefully stop doing that, or until I buy a better computer.


ChrisRackauckas almost 6 years ago

This is a very good resource. The one thing I would ask is that I would like to see examples of using DifferentialEquations.jl when you get to the section on dynamical systems, especially when doing discrete event simulation and stochastic differential equations. I opened an issue in the repo and we can continue discussing there (I'll help write the code, I want to use this in my own class :P)!


adamnemecek almost 6 years ago

I invite everyone to check out Julia. The language is pleasant and gets out of the way. The interop is nuts. To call, say, numpy's fft, you just do

    using PyCall
    np = pyimport("numpy")
    np.fft.fft(rand(ComplexF64, 10))

That's it. You call it with a Julia native array, and the result is in a Julia native array as well.

Same with cpp: https://github.com/JuliaInterop/Cxx.jl

Or matlab: https://github.com/JuliaInterop/MATLAB.JL

It's legit magic


bdod6 almost 6 years ago

Can someone explain how this is more powerful than using a Python/R-based workflow? E.g., I currently use a combination of .ipynb notebooks, Python scripts, and RStudio, and this feels like it covers everything I need for any data science project.


aapeli almost 6 years ago

Accompanying code here: https://github.com/h-Klok/StatsWithJuliaBook

Merrill almost 6 years ago

In section "1.2 Setup and Interface" there is a very short description of the REPL and how it can be downloaded from julialang.org, as well as a much longer description of JuliaBox and how Jupyter notebooks can be run from juliabox.com for free.

Although JuliaBox has been provided for free by Julia Computing, there has been discussion that this may not be possible in the future. However, Julia Computing does provide a distribution of Julia, the Juno IDE, and supported packages known as JuliaPro for free.

For new users, would the free JuliaPro distribution be a good alternative to JuliaBox and/or downloading the REPL and kernel from julialang.org?


cwyers almost 6 years ago

For people who have more Julia experience -- is this (thinking mainly of chapter 4) representative of how most Julia users do plotting? It looks like a lot of calling out to matplotlib via PyPlot. I know Julia has a ggplot-inspired library called Gadfly.jl, is PyPlot more commonly used?


dlphn___xyz almost 6 years ago

What's the selling point of Julia? Why would I use it over something like R?


jbee618 almost 6 years ago

Would love to see chapter exercises to test comprehension and reinforce learning objectives.

chakerb almost 6 years ago

I was going to ask whether there is a Kindle version of this, but then I skimmed the book, and I don't think it would be readable on a Kindle. Even if it were, the reading experience would definitely be inferior.


mruts almost 6 years ago

Julia is everything Python could have been, and much more. I'm stuck with Python right now as a lot of people in the data science/ML community are, but it's becoming increasingly viable to use Julia for "real" work. The Python-Julia interop story is pretty strong as well, which allows you to (somewhat) easily convert pandas/pytorch/sklearn code into Julia using Python wrappers. Julia has some unconventional things in it but they are all growing on me:

1. Indices by default start with 1. This honestly makes a ton of sense and off by one errors are less likely to happen. You have nice symmetry between the length of a collection and the last element, and in general just have to do less "+ 1" or "- 1" things in your code.

2. Native syntax for creation of matrices. Nicer and easier to use than ndarray in Python.

3. Easy one-line mathematical function definitions: f(x) = 2*x. Also being able to omit the multiplication sign (f(x) = 2x) is super nice and makes things more readable.

4. Real and powerful macros a la Lisp.

5. Optional static typing. Sometimes when doing data science work static typing can get in your way (more so than for other kinds of programs), but it's useful to use most of the time.

6. A simple and easy to understand polymorphism system. Might not be structured enough for big programs, but more than suitable for Julia's niche.

Really the only thing I don't like about the language is the begin/end block syntax, but I've mentioned that before on HN and don't need to get into it again.
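For anyone who hasn't seen the syntax, here is a minimal sketch of points 1, 2, 3 and 6 from the list above. All names (`v`, `A`, `f`, `g`, `describe`) are made up for illustration, and reading the "polymorphism system" in point 6 as multiple dispatch on argument types is an assumption.

```julia
# Illustrative sketch only; every name here is hypothetical.

# 1. Indexing starts at 1, so the last index equals the length.
v = [10, 20, 30]
first_elem = v[1]
last_elem = v[length(v)]   # same element as v[end]

# 2. Native matrix literal: spaces separate columns, semicolons separate rows.
A = [1 2; 3 4]

# 3. One-line function definitions; the multiplication sign may be
#    omitted between a numeric literal and a variable.
f(x) = 2 * x
g(x) = 2x

# 6. Methods are selected by argument type (assumed reading of point 6).
#    The type annotations are optional (point 5).
describe(x::Integer) = "integer"
describe(x::AbstractFloat) = "float"

println(first_elem, " ", last_elem)
println(size(A))
println(f(3) == g(3))
println(describe(2), " ", describe(2.0))
```

Both `f` and `g` define the same function; the second form just drops the `*`.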


abakus almost 6 years ago

I find Julia's .>, .==, .*, ./ (dots for element-by-element ufunc)... really ugly. Numpy's design is cleaner and better.
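For context, a minimal sketch of the dot syntax being discussed. The arrays and the helper `square_plus_one` are made up for illustration. The dot marks an operator or function call as element-by-element, and the same notation also applies to user-defined functions.

```julia
# Illustrative sketch only; a, b and square_plus_one are hypothetical.
a = [1, 2, 3]
b = [3, 2, 1]

greater   = a .> b    # elementwise comparison, one Bool per element
equal     = a .== b   # elementwise equality
products  = a .* b    # elementwise multiplication
quotients = a ./ b    # elementwise division, yields floating-point values

# A dot after a function name applies it to each element.
square_plus_one(x) = x^2 + 1
applied = square_plus_one.(a)

println(greater)
println(products)
println(applied)
```

Without the dot, `a == b` compares the two arrays as a whole and returns a single `Bool`.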


plouffy almost 6 years ago

Commenting to find later.
