TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

© 2025 TechEcho. All rights reserved.

How to Download a List of All Registered Domain Names

170 points by jwcrux over 9 years ago

11 comments

jtwaleson over 9 years ago
A couple of months ago I processed all the metadata from the Common Crawl project to extract every indexed domain name. That was about 10 TB of metadata and yielded 26 million domain names. EC2 costs were only about $10 to process it. If anyone is interested, let me know.

Edit: available as a torrent here: https://all-certificates.s3.amazonaws.com/domainnames.gz?torrent
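
For readers who want to try something similar, here is a minimal sketch of the deduplication step only. It assumes the crawl metadata has already been reduced to an iterable of URL strings; the function name `unique_hostnames` is hypothetical, and this is not the pipeline the commenter actually ran. It collects hostnames rather than registrable domains, since reducing a hostname to its registrable domain would additionally require the Public Suffix List.

```python
from typing import Iterable, Set
from urllib.parse import urlsplit


def unique_hostnames(urls: Iterable[str]) -> Set[str]:
    """Return the set of distinct hostnames found in an iterable of URLs.

    Hypothetical helper: input is assumed to be one absolute URL per item.
    Items without a parseable host are silently skipped.
    """
    hosts: Set[str] = set()
    for url in urls:
        try:
            host = urlsplit(url.strip()).hostname
        except ValueError:
            # urlsplit rejects some malformed inputs (e.g. bad IPv6 brackets)
            continue
        if host:
            # hostname is already lower-cased; drop a trailing root dot
            hosts.add(host.rstrip(".").lower())
    return hosts
```

Because the result is a plain set, partial results from separate input files can be merged with a set union before writing out the final list.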
nly over 9 years ago
A warning about parsing zone files... the grammar is deceptively tricky.

While TLD registries will probably provide you with files in a sane subset[0] of the format specified in RFC 1035, there are a number of things that will NOT work in general:

- Splitting the file into lines (paren-blocks and quoted strings can span lines, strings can contain ';', etc.).
- Splitting the file on whitespace (it's significant in column 1 and inside strings).
- Applying a regex (you'll need lookahead for conditional matching and it'll get ugly fast).

Don't go down the road of assuming it's a simple delimited file.

A few references:

https://www.nlnetlabs.nl/projects/nsd/documentation.html

http://www.verycomputer.com/96_5ad11cc47053d8b0_1.htm

[0] See page 9 of https://archive.icann.org/en/topics/new-gtlds/zfa-strategy-paper-12may10-en.pdf
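
To make the pitfalls above concrete, here is a minimal sketch of a character-level tokenizer that groups a zone file into logical records. It is an illustration under simplifying assumptions, not a complete RFC 1035 parser: the name `zone_records` is hypothetical, backslash escapes are kept undecoded, and directives such as `$ORIGIN` and `$TTL` come out as ordinary token lists for the caller to interpret.

```python
from typing import Iterator, List, Tuple


def zone_records(text: str) -> Iterator[Tuple[bool, List[str]]]:
    """Yield (inherits_owner, tokens) for each logical record in a zone file.

    inherits_owner is True when the record begins with whitespace in
    column 1, i.e. the owner name carries over from the previous record.
    """
    tokens: List[str] = []
    buf: List[str] = []    # characters of the token being built
    in_token = False       # lets an empty quoted string still count as a token
    in_quotes = False
    depth = 0              # parenthesis depth; newlines are ignored while > 0
    inherits = False
    at_record_start = True
    i, n = 0, len(text)

    def flush() -> None:
        nonlocal in_token
        if in_token:
            tokens.append("".join(buf))
            buf.clear()
            in_token = False

    while i < n:
        c = text[i]
        if at_record_start:
            at_record_start = False
            inherits = c in " \t"
        if c == "\\" and i + 1 < n:
            # Keep the escape sequence raw (no \DDD decoding in this sketch).
            buf.append(c + text[i + 1])
            in_token = True
            i += 1
        elif in_quotes:
            if c == '"':
                in_quotes = False
            else:
                buf.append(c)  # may be ';', whitespace, or a newline
        elif c == '"':
            in_quotes = True
            in_token = True
        elif c == ";":
            # Comment: skip up to, but not including, the end of the line.
            while i + 1 < n and text[i + 1] != "\n":
                i += 1
        elif c == "(":
            flush()
            depth += 1
        elif c == ")":
            flush()
            depth = max(0, depth - 1)
        elif c in " \t\r":
            flush()
        elif c == "\n":
            flush()
            if depth == 0:
                if tokens:
                    yield inherits, tokens
                    tokens = []
                at_record_start = True
        else:
            buf.append(c)
            in_token = True
        i += 1

    flush()
    if tokens:
        yield inherits, tokens
```

The point of the design is that a single pass tracks three pieces of state (quoting, parenthesis depth, and column-1 whitespace), which is exactly what line splitting, whitespace splitting, and a single regex each fail to do. For production use, an established parser such as the ones in NSD or BIND is the safer choice.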
irfan over 9 years ago
I had been downloading the zone file for .PK domains on a daily basis until they blocked zone transfers. By comparing these daily zone files I managed to publish the statistics [1] and also broke the news about hacked .PK domains [2], which was picked up by all the leading tech blogs and news agencies.

Currently, I cannot find a way to get the zone file, even by officially requesting it from the registry manager.

[1]: https://www.i.com.pk/pknic-domain-registration-statistics/

[2]: https://www.i.com.pk/110-pk-domains-managed-by-markmonitor-got-hacked-by-turkish-hackers/
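
A day-over-day comparison like the one described above can be sketched as a set difference. This is only an illustration, assuming each snapshot has already been reduced to one domain name per line; the helper names `load_names` and `diff_snapshots` are hypothetical and this is not the commenter's actual tooling.

```python
from typing import Iterable, List, Set, Tuple


def load_names(lines: Iterable[str]) -> Set[str]:
    """Normalize one-name-per-line input into a set of lower-case names."""
    names: Set[str] = set()
    for line in lines:
        name = line.strip().rstrip(".").lower()
        if name and not name.startswith(";"):  # skip blanks and comments
            names.add(name)
    return names


def diff_snapshots(
    old: Iterable[str], new: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (added, removed) names between two daily snapshots, sorted."""
    old_names, new_names = load_names(old), load_names(new)
    added = sorted(new_names - old_names)
    removed = sorted(old_names - new_names)
    return added, removed
```

Both arguments can be open file objects, since iterating a text file yields its lines. Detecting changed records for a name that exists in both snapshots (for example, altered nameservers) would need the record data as well, which this sketch does not keep.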
vortico over 9 years ago
What if someone were to maintain an unofficial list with one domain per line, freely available as a daily torrent or served directly? Would there be a rights problem with mirroring and filtering ICANN data?
axaxs over 9 years ago
FWIW, a TLD zone file does not contain every registered domain name, just those with DNS records. There are typically a good number of domain names that are registered but have no records, for reasons such as reserved names, malicious content takedowns, etc.
zamalek over 9 years ago
Now someone just needs to train an NN to recognize botnets and spam domains.
jlgaddis over 9 years ago
Interesting, thanks for the pointer.

I've wondered about this previously, as I run my own blacklists for $work's mail servers. I've been thinking about how I could slightly "penalize" brand-new domain names, correlate "spammy" domains with certain nameservers, and so on.
ben_utzer over 9 years ago
This list would be useful for my attempt at building a list of parked/squatted domains.
canow over 9 years ago
Would it be easier to download a list of available domains?
mike-cardwell over 9 years ago
What about ccTLDs?
ps4fanboy over 9 years ago
I think it's sad how closed this data is.