This post highlights that there are indeed some significant untapped opportunities in mining GitHub user and repository data. As I was working on the 2nd Edition of Mining the Social Web last year, I observed the very same thing and introduced an entire chapter that models GitHub as an interest graph. (Think: users are interested in projects and, by extension, programming languages.) The IPython Notebook with all of the sample code is available alongside the rest of the source [1], but it really only begins to scratch the surface with some rudimentary centrality techniques. As with any other interest graph, the possibilities are fairly endless.<p>[1] <a href="http://nbviewer.ipython.org/github/ptwobrussell/Mining-the-Social-Web-2nd-Edition/blob/master/ipynb/Chapter%207%20-%20Mining%20GitHub.ipynb" rel="nofollow">http://nbviewer.ipython.org/github/ptwobrussell/Mining-the-S...</a>
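To make the interest-graph idea concrete, here is a minimal sketch of degree centrality over user-to-repository "star" edges. It is not taken from the linked notebook: the users and edges are made up for illustration, and the `degree_centrality` helper is a hypothetical plain-Python stand-in for what a graph library would normally provide.

```python
from collections import defaultdict

# Hypothetical (user, repository) "star" edges; the users are invented
# and the edges do not reflect real GitHub data.
stars = [
    ("alice", "JuliaLang/julia"),
    ("bob", "JuliaLang/julia"),
    ("carol", "JuliaLang/julia"),
    ("alice", "JuliaOpt/Optim.jl"),
    ("bob", "ipython/ipython"),
]


def degree_centrality(edges):
    """Return degree / (n - 1) for every node, treating edges as undirected."""
    neighbors = defaultdict(set)
    for user, repo in edges:
        neighbors[user].add(repo)
        neighbors[repo].add(user)
    n = len(neighbors)
    if n < 2:
        # A graph with fewer than two nodes has no meaningful normalization.
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


centrality = degree_centrality(stars)

# Rank nodes from most to least connected.
ranked = sorted(centrality.items(), key=lambda kv: kv[1], reverse=True)
for node, score in ranked:
    print(f"{node}: {score:.2f}")
```

In this toy graph the repository starred by the most users ranks first. Real data would need user nodes and repository nodes kept distinct, and probably a centrality measure less crude than raw degree.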
Tried with /JuliaLang/julia and got garbage results - my guess is that the build instructions in the README dominate. Trying something like /JuliaOpt/Optim.jl, which has a very on-topic README, fared slightly better but still returned some bizarre things like /sergiotapia/go-style-guide
I like the idea of your project, but isn't it essentially an algorithm-database version of Wikipedia that you plan to profiteer from?<p>Words like marketplace, crowdsourced, and open platform played well in 2005, but now they kinda smell like a scam.