Tell HN: The Hacker News frontpage effect on a project GitHub stars

16 points by fhoffa, over 9 years ago
How much attention does a Hacker News frontpage post drive to a GitHub project?

For this visualization I combined 2 datasets: GitHub Archive (http://www.githubarchive.org/) and Hacker News (https://news.ycombinator.com/item?id=10440502), both living in BigQuery (https://cloud.google.com/bigquery/what-is-bigquery, https://reddit.com/r/bigquery).

The visualizations were built with Google Cloud Datalab (https://cloud.google.com/datalab/, Jupyter/IPython notebooks on the cloud).

With one SQL query you can extract the daily number of stars a project gets, and with another one the GitHub URLs that were submitted to Hacker News. Or you can combine both queries in one:

    SELECT repo_name, created_at date, COUNT(*) c,
      GROUP_CONCAT_UNQUOTED(UNIQUE(hndate+':'+STRING(hnscore))) hndates,
      SUM(UNIQUE(hnscore)) hnscore,
      SUM(c) OVER(PARTITION BY repo_name) monthstars
    FROM (
      SELECT repo_name, actor_login, DATE(MAX(created_at)) created_at,
        date hndate, score hnscore
      FROM [githubarchive:month.201509] a
      JOIN (
        SELECT REGEXP_EXTRACT(url, r'github.com/([a-zA-Z0-9\-\.]+.[a-zA-Z0-9\-\.]*)') mention,
          DATE(time_ts) date, score
        FROM [fh-bigquery:hackernews.stories]
        WHERE REGEXP_MATCH(url, r'github.com/[a-zA-Z0-9\-\.]+')
          AND score>10
          AND YEAR(time_ts)=2015 AND MONTH(time_ts)=9
        HAVING NOT (mention CONTAINS '.com/search?' OR mention CONTAINS '.com/blog/')
      ) b
      ON a.repo_name=b.mention
      WHERE type="WatchEvent"
      GROUP BY 1,2, hndate, hnscore
    )
    GROUP BY 1,2
    HAVING hnscore>300
    ORDER BY 1,2,4
    LIMIT 1000

The visualization: https://i.imgur.com/B5awmAL.png

(Correlation is not causation, but there is indeed a correlation between the two.)

--@felipehoffa
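For readers without BigQuery access, the core of the query can be sketched in plain Python. This is a rough illustration under stated assumptions, not the author's method: the regular expression is copied from the query, but the function names (`extract_mention`, `daily_stars`), the single `min_score` threshold (the query uses `score>10` plus a later `hnscore>300` filter), and all sample rows are hypothetical.

```python
import re
from collections import Counter

# Same pattern as the REGEXP_EXTRACT call in the query. The unescaped "."
# between the two character classes matches any character, including "/".
MENTION_RE = re.compile(r'github.com/([a-zA-Z0-9\-\.]+.[a-zA-Z0-9\-\.]*)')


def extract_mention(url):
    """Return the 'owner/repo' part of a GitHub URL, or None if absent."""
    m = MENTION_RE.search(url)
    return m.group(1) if m else None


def daily_stars(watch_events, stories, min_score=300):
    """Count stargazers per (repo, day) for repos in high-scoring stories.

    watch_events: iterable of (repo_name, actor_login, iso_date) tuples
    stories:      iterable of (url, score) tuples
    """
    # Repos mentioned by stories above the (simplified) score threshold.
    mentioned = {extract_mention(url) for url, score in stories if score > min_score}
    mentioned.discard(None)

    # Like the inner SELECT: one row per (repo, user), keeping the latest
    # star date, so a user who stars twice is counted only once.
    last_star = {}
    for repo, actor, date in watch_events:
        if repo in mentioned:
            key = (repo, actor)
            if key not in last_star or date > last_star[key]:
                last_star[key] = date

    # Like the outer SELECT: number of stars per repo per day.
    return Counter((repo, date) for (repo, _actor), date in last_star.items())


# Hypothetical sample data, for illustration only.
events = [
    ("alice/widget", "u1", "2015-09-01"),
    ("alice/widget", "u2", "2015-09-02"),
    ("alice/widget", "u2", "2015-09-03"),  # same user starred again later
    ("bob/other", "u3", "2015-09-02"),
]
stories = [
    ("https://github.com/alice/widget", 450),
    ("https://github.com/bob/other", 20),
]
counts = daily_stars(events, stories)
```

With the sample rows, only `alice/widget` passes the score threshold, and user `u2` is counted once, on the later date.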

2 comments

fhoffa, over 9 years ago
Links:

- Visualization: https://i.imgur.com/B5awmAL.png
- GitHub Archive: http://www.githubarchive.org/
- Hacker News on BigQuery dataset: https://news.ycombinator.com/item?id=10440502
- Other BigQuery projects: https://reddit.com/r/bigquery

veddox, over 9 years ago
An effect like that is to be expected, but it is interesting to see some real data.