Ask HN: Data Archive of Hacker News

2 points by unwantedLetters almost 15 years ago
I was wondering whether anyone has prepared some sort of data archive for Hacker News. Something fairly simple, with attributes such as id, title, url, user, score, and the last time the score changed.

I think it would be an extremely interesting and valuable dataset.

I have a couple of sample ideas that I would like to try:

1. Is there a best time to post on HN? (I know this is very SEO, but I think it's an interesting question nonetheless.)

2. It might be fun to cluster the data (perhaps all articles with score > 5) and see the top X articles in every cluster. I think that would give you a wide variety of extremely good articles to read.

I know that this isn't the most enlightening or groundbreaking work, but I'm sure that if we had the dataset, we could come up with some interesting ways to analyze it and get some nice results. (In fact, if anyone can think of other interesting ways to analyze the dataset, please post them anyway; I'd like to hear them.)

I was actually putting together a little script that scrapes HN and puts the data into a MySQL database, but that doesn't seem like a good idea, since it would hit the servers unnecessarily. Also, I'm not sure people would like me doing that.
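To make the "best time to post" idea a bit more concrete, here is a minimal sketch of what such an archive and a first analysis might look like. Everything in it is an assumption for illustration: the table and column names are hypothetical (loosely following the attributes listed in the post), SQLite stands in for MySQL so the example runs with only the standard library, and the three rows are made-up placeholders, not real HN data.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema loosely based on the attributes named in the post.
# Column names and the choice of SQLite (not MySQL) are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS stories (
    id               INTEGER PRIMARY KEY,
    title            TEXT NOT NULL,
    url              TEXT,
    username         TEXT NOT NULL,
    score            INTEGER NOT NULL,
    posted_at        INTEGER NOT NULL,  -- Unix time, UTC
    score_changed_at INTEGER            -- Unix time, UTC, may be unknown
)
"""


def average_score_by_hour(conn):
    """Return {hour_of_day_utc: mean score} over all stored stories."""
    totals = {}
    query = "SELECT posted_at, score FROM stories"
    for posted_at, score in conn.execute(query):
        hour = datetime.fromtimestamp(posted_at, tz=timezone.utc).hour
        count, total = totals.get(hour, (0, 0))
        totals[hour] = (count + 1, total + score)
    return {h: total / count for h, (count, total) in sorted(totals.items())}


def ts(hour):
    """Unix timestamp for an arbitrary placeholder day at the given UTC hour."""
    return int(datetime(2010, 1, 1, hour, tzinfo=timezone.utc).timestamp())


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Made-up placeholder rows, only here so the sketch runs end to end.
rows = [
    (1, "Example A", "http://example.com/a", "alice", 10, ts(9), None),
    (2, "Example B", "http://example.com/b", "bob", 4, ts(9), None),
    (3, "Example C", None, "carol", 3, ts(17), None),
]
conn.executemany("INSERT INTO stories VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

by_hour = average_score_by_hour(conn)
best_hour = max(by_hour, key=by_hour.get)
print(by_hour, best_hour)
```

A plain per-hour mean is only a starting point; with a real archive one would probably also want to account for how many stories are posted in each hour and for the day of the week.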

No comments yet.