I just finished crawling 5.19B web pages, Ask Me Anything
19 points by dor_jack, about 8 years ago
I was just rate-limited by HN, so I'm going to answer your questions under a new account: dor_jack_2
7 comments
grzm
about 8 years ago
If you're rate-limited, you can contact the mods via the Contact link in the footer.
dm_i386
about 8 years ago
What tools did you use? What had to be custom-written and why?
Comment #14152751 not loaded
maurtinshkreli
about 8 years ago
How much did it cost?
Comment #14153104 not loaded
tlack
about 8 years ago
What did you do to avoid winding up in endless GET URL loops? How deep did you get per site, and how did you schedule follow-up requests?
Comment #14152778 not loaded
joshpen188
about 8 years ago
Why didn't you use Common Crawl instead?
Comment #14152761 not loaded
savethefuture
about 8 years ago
What did you discover?
Comment #14152580 not loaded
Comment #14152619 not loaded
Comment #14152559 not loaded
itburnslikeice
about 8 years ago
But why?
Comment #14152573 not loaded