科技回声

I just finished crawling 5.19B web pages, Ask Me Anything

19 points by dor_jack about 8 years ago
I WAS JUST RATE LIMITED BY HN, SO I'M GOING TO ANSWER YOUR QUESTIONS UNDER A NEW ACCOUNT: dor_jack_2

7 comments

grzm about 8 years ago
If you're rate-limited, you can contact the mods via the Contact link in the footer.
dm_i386 about 8 years ago
What tools did you use? What had to be custom-written and why?
maurtinshkreli about 8 years ago
How much did it cost?
tlack about 8 years ago
What did you do to avoid winding up in endless GET URL loops? How deep did you get per site, and how did you schedule follow-up requests?
joshpen188 about 8 years ago
Why didn't you use Common Crawl instead?
savethefuture about 8 years ago
What did you discover?
itburnslikeice about 8 years ago
But why?