The GitHub Archive dataset was updated as well. Here is an example BigQuery query that returns the top repositories for 2015 through 2016 YTD, ranked by the number of stars given during that period:<p><pre><code> SELECT repo.id, repo.name, COUNT(*) as num_stars
FROM TABLE_DATE_RANGE([githubarchive:day.], TIMESTAMP('2015-01-01'), TIMESTAMP('2016-12-31'))
WHERE type = "WatchEvent"
GROUP BY repo.id, repo.name
ORDER BY num_stars DESC
LIMIT 1000
</code></pre>
This produces the following output: <a href="https://docs.google.com/spreadsheets/d/16yDS2wDdDOTxjVsjGvWmpHVsOIU65wLEjXFHDtDeKU4/edit?usp=sharing" rel="nofollow">https://docs.google.com/spreadsheets/d/16yDS2wDdDOTxjVsjGvWm...</a><p>Since the query only touches 3 columns, it only scans 15.4GB of data (out of a 1TB allowance).<p>More information on the GitHub Archive changes: <a href="https://medium.com/@hoffa/github-archive-fully-updated-notice-some-breaking-changes-64e7e7cd0967" rel="nofollow">https://medium.com/@hoffa/github-archive-fully-updated-notic...</a>
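<p>To put the 15.4GB figure in perspective, here is a rough sketch of the arithmetic in Python. The helper names are made up for illustration, and it assumes 1TB is counted as 1024GB, which may not match how the allowance is actually metered:

```python
# Assumption: 1 TB is treated as 1024 GB; the real accounting may differ.
GB_PER_TB = 1024


def quota_fraction(scanned_gb: float, allowance_tb: float = 1.0) -> float:
    """Share of the allowance consumed by a single query."""
    return scanned_gb / (allowance_tb * GB_PER_TB)


def queries_within_allowance(scanned_gb: float, allowance_tb: float = 1.0) -> int:
    """How many queries of this size fit entirely inside the allowance."""
    return int((allowance_tb * GB_PER_TB) // scanned_gb)


if __name__ == "__main__":
    # 15.4 GB is the scan size reported for the query above.
    print(f"{quota_fraction(15.4):.1%} of the allowance per run")
    print(f"{queries_within_allowance(15.4)} runs fit in the allowance")
```

Under that assumption, one run uses about 1.5% of the allowance, so a query like this can be repeated dozens of times before running out.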