How to Download a List of All Registered Domain Names

170 points by jwcrux, over 9 years ago

11 comments

jtwaleson, over 9 years ago
A couple of months ago I processed all metadata from the Common Crawl project for all indexed domain names. This was about 10 TB of metadata and resulted in 26 million domain names. EC2 costs were only about $10 to process this. If anyone is interested, let me know.

Edit: available as a torrent here: https://all-certificates.s3.amazonaws.com/domainnames.gz?torrent
nly, over 9 years ago
A warning about parsing zone files: the grammar is deceptively tricky.

While TLD registries will *probably* provide you with files in a sane subset[0] of that specified in RFC 1035, there are a number of things that will *NOT* work in general:

- Splitting the file into lines (paren-blocks and quoted strings can span lines, strings can contain ';', etc.).
- Splitting the file on whitespace (it's significant in column 1 and inside strings).
- Applying a regex (you'll need lookahead for conditional matching and it'll get ugly fast).

Don't go down the road of assuming it's a simple delimited file.

A few references:

https://www.nlnetlabs.nl/projects/nsd/documentation.html

http://www.verycomputer.com/96_5ad11cc47053d8b0_1.htm

[0] See page 9 of https://archive.icann.org/en/topics/new-gtlds/zfa-strategy-paper-12may10-en.pdf
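To make the pitfalls above concrete, here is a minimal sketch of a character-level tokenizer that groups a zone file into logical entries. It is only an illustration, not a full RFC 1035 parser: the function name `logical_records` and the sample data are made up for this example, escapes are kept verbatim rather than decoded, and directives such as `$ORIGIN` or `$TTL` are returned as ordinary tokens.

```python
from typing import Iterator, List, Tuple


def logical_records(text: str) -> Iterator[Tuple[bool, List[str]]]:
    """Yield (owner_is_blank, tokens) for each logical entry in a zone file.

    Hypothetical helper for illustration; it covers quoted strings,
    ';' comments, parenthesised blocks and leading whitespace only.
    """
    tokens: List[str] = []
    current: List[str] = []
    have_token = False       # True once a token has started (allows empty "")
    in_quote = False
    depth = 0                # > 0 while inside a parenthesised block
    owner_is_blank = False   # entry began with whitespace in column 1
    at_record_start = True
    i = 0

    def flush() -> None:
        nonlocal have_token
        if have_token:
            tokens.append("".join(current))
            current.clear()
            have_token = False

    while i < len(text):
        ch = text[i]
        if at_record_start:
            owner_is_blank = ch in " \t"
            at_record_start = False
        if ch == "\\" and i + 1 < len(text):
            # Keep escapes verbatim; decoding \DDD is left out of this sketch.
            current.append(text[i:i + 2])
            have_token = True
            i += 2
            continue
        if in_quote:
            if ch == '"':
                in_quote = False
            else:
                current.append(ch)   # ';', spaces and newlines are literal here
        elif ch == '"':
            in_quote = True
            have_token = True
        elif ch == ";":
            # Comment: skip to end of line, leaving the newline for the next pass.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        elif ch == "(":
            flush()
            depth += 1
        elif ch == ")":
            flush()
            depth = max(0, depth - 1)
        elif ch == "\n" and depth == 0:
            # A newline ends the entry only outside parentheses.
            flush()
            if tokens:
                yield owner_is_blank, tokens
                tokens = []
            at_record_start = True
        elif ch in " \t\r\n":
            flush()
        else:
            current.append(ch)
            have_token = True
        i += 1

    flush()
    if tokens:
        yield owner_is_blank, tokens


# Made-up sample exercising each pitfall from the list above.
sample = (
    'example.com. 3600 IN SOA ns1.example.com. admin.example.com. (\n'
    '        2015101201 ; serial\n'
    '        7200 3600 1209600 3600 )\n'
    '        IN TXT "v=spf1; not a comment"\n'
    'www     IN A 192.0.2.1 ; trailing comment\n'
)
records = list(logical_records(sample))
```

The `owner_is_blank` flag is what a line-splitting or whitespace-splitting approach loses: an entry that starts with whitespace reuses the owner name of the previous entry, so the caller has to track that state.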
irfan, over 9 years ago
I had been downloading the zone file for .PK domains on a daily basis until they blocked zone transfers. By comparing these daily zone files I managed to publish the statistics [1] and also broke the news about hacked .PK domains [2], which was picked up by all the leading tech blogs and news agencies.

Currently, I cannot find a way to get the zone file, even by officially requesting it from the registry manager.

[1]: https://www.i.com.pk/pknic-domain-registration-statistics/

[2]: https://www.i.com.pk/110-pk-domains-managed-by-markmonitor-got-hacked-by-turkish-hackers/
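The day-over-day comparison described in this comment can be sketched with plain set differences. This is an assumption about the general approach, not the commenter's actual tooling: it presumes each daily snapshot has already been reduced to one domain name per line, and the names `normalize` and `diff_snapshots` as well as the sample domains are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def normalize(name: str) -> str:
    """Lower-case a name and drop surrounding whitespace and the trailing root dot."""
    return name.strip().rstrip(".").lower()


def diff_snapshots(
    old: Iterable[str], new: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (added, removed) domain names between two daily snapshots."""
    old_set: Set[str] = {normalize(n) for n in old if n.strip()}
    new_set: Set[str] = {normalize(n) for n in new if n.strip()}
    return sorted(new_set - old_set), sorted(old_set - new_set)


# Made-up snapshots; real input would be read from the two daily files.
yesterday = ["example.pk.", "Foo.com.pk", ""]
today = ["example.pk", "bar.net.pk"]
added, removed = diff_snapshots(yesterday, today)
```

Normalizing before comparing avoids reporting a domain as both added and removed just because one snapshot wrote it with a trailing dot or different letter case.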
vortico, over 9 years ago
What if someone were to maintain an unofficial list with one domain per line, freely available as a daily torrent or served directly? Would there be a rights problem with mirroring and filtering ICANN data?
axaxs, over 9 years ago
FWIW, a TLD zone file does not contain every registered domain name, just those with DNS records. There is typically a good amount of domain names registered but without records, for reasons such as reserved names, malicious content takedowns, etc.
zamalek, over 9 years ago
Now someone just needs to train an NN to recognize botnets and spam domains.
jlgaddis, over 9 years ago
Interesting, thanks for the pointer.

I've wondered about this previously, as I run my own blacklists for $work's mail servers. I've been thinking about how I could slightly "penalize" brand-new domain names, and about correlating "spammy" domains with certain nameservers.
ben_utzer, over 9 years ago
This list would be useful for my attempt to build a list of parked/squatted domains.
canow, over 9 years ago
Would it be easier to download a list of available domains?
mike-cardwell, over 9 years ago
What about ccTLDs?
ps4fanboy, over 9 years ago
I think it's sad how closed this data is.