Hacker News on BigQuery: Now with daily updates. Top domains and time to post?

52 points by fhoffa about 8 years ago

2 comments

minimaxir about 8 years ago
A few comments on Hacker News data (i.e., why I haven't played with the data in a while):

1. The algorithm changed recently. This post uses >40pts as a proxy for making the front page. That's too conservative; even my 10pt threshold back then was conservative. With recent algorithm changes to Hacker News (within the last year), I've seen posts with 3pts get into the Top 10 for whatever reason, which breaks predictive analysis.

2. The dataset/this submission only includes submissions and submission scores; comment scores were removed from the API, which is disappointing.

3. Given that HN titles/links can be edited by moderators (and they do a good job), it's harder to judge initial submissions from the final result.

4. Slight edge case in the article, but link shorteners are auto-killed, which is why youtu.be/goo.gl links are not prominent.
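To make point 1 concrete, here is a minimal sketch of how much the "front page" label depends on the score cutoff. The scores are made up for illustration (not real Hacker News data), and `front_page_share` is a hypothetical helper, not something taken from the article or the BigQuery dataset.

```python
from typing import Iterable


def front_page_share(scores: Iterable[int], threshold: int) -> float:
    """Fraction of submissions whose score strictly exceeds the threshold."""
    scores = list(scores)
    if not scores:
        return 0.0
    return sum(1 for s in scores if s > threshold) / len(scores)


# Hypothetical submission scores, invented for this example.
sample_scores = [1, 2, 3, 3, 5, 8, 12, 41, 57, 150]

# Compare the article's cutoff (>40), the older 10pt cutoff, and a very low one.
for threshold in (40, 10, 2):
    share = front_page_share(sample_scores, threshold)
    print(f"> {threshold} pts: {share:.0%} labeled as front page")
```

On this toy list the labeled share moves from 30% to 40% to 80% as the cutoff drops, so any analysis built on a score proxy inherits that choice.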
koolba about 8 years ago
How does the data get to BigQuery? Anything special/fun, or just repeatedly polling the API endpoint?