Ask HN: What Happened to the Cuil Crawl Data?

4 points by arayh over 6 years ago

I was trying to look through the Cuil crawl data on archive.org but nothing comes up from the collection tab under any search query. I also can't seem to find anything that suggests that the Cuil crawl data was removed from archive.org. Any idea what happened here? Has anyone been able to access it recently? Maybe it's just temporarily down without any notice?

https://archive.org/details/cuilcrawl

2 comments

soult over 6 years ago

The items from the Cuil collection were "made dark" (i.e. you can not directly view/download them). I do not know the specifics, but as far as I know most web crawl data on the Internet Archive is not directly downloadable, but you can use the Wayback Machine if you are looking for a copy of a specific website.

Depending on what you need the data for, Common Crawl[1] might be an alternative.

1: http://commoncrawl.org/
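
For anyone who wants to script the Wayback Machine lookup mentioned above, here is a minimal sketch in Python. It assumes the public Wayback availability endpoint at `https://archive.org/wayback/available` and its `archived_snapshots` / `closest` JSON layout. The function names (`build_query`, `closest_snapshot`, `lookup`) are made up for illustration. It only finds a capture of one specific URL and does not give access to the dark Cuil items.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine availability API.
API = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD-style string."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{API}?{urlencode(params)}"


def closest_snapshot(payload):
    """Return the closest snapshot URL from a decoded response, or None."""
    snap = payload.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return snap["url"]
    return None


def lookup(url, timestamp=None):
    """Query the API over the network (hypothetical helper, not run here)."""
    with urlopen(build_query(url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup("cuil.com", "20090101")` would ask for the capture nearest that date. Whether one exists depends on what the Wayback Machine holds, so check for a `None` result.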
sp332 over 6 years ago

If you don't get an answer here, try asking Jason Scott @textfiles on Twitter. He's kind of the "face" of the Internet Archive and he's pretty good at directing queries like this to the right people.