66 points by foob about 8 years ago

3 comments

hartator about 8 years ago

Really cool, congrats!

I have built something similar, but to retrieve a backup for one of my dead websites. It was a fun project.

Shameless plug: https://github.com/hartator/wayback-machine-downloader/

natch about 8 years ago

Do they no longer have a program like they used to where researchers can apply for direct access to the crawl data?

deferredposts about 8 years ago

So what is the policy of The Internet Archive on this level of scraping? Do they have a rate limit in place?

Internet Archaeology: Scraping time series data from Archive.org
