The Public Suffix List

79 points by dan1234 almost 9 years ago

9 comments

profmonocle almost 9 years ago
If you use this for something, please be sure to keep the data fresh! Especially now that new TLDs are being added at a steady clip.

I once had an issue where a mobile user's browser would do a Google search every time they typed one of our domains into the URL bar. They weren't doing anything wrong - other domains worked fine. They were using some Android phone from 2013 which had been abandoned by the manufacturer, and it turned out the stock browser was using the public suffix list to decide if the input should be treated as a URL or as a search query. The list was as old as the browser, and since the domain ended in ".link" (added in 2014) it didn't think our domain was a domain. (The workaround was to type out http:// or https:// before the domain, but that's crappy UX.)

If you use this in a server-side app, please use cron or something to keep it updated. If you embed it in a client app, make sure to update it as part of your build process. And if you know your app won't be updated very often, consider having it update separately from the app itself. (Does anyone know if this is hosted on some public CDN somewhere? I assume having a client app fetch directly from publicsuffix.org isn't kosher.)

Edit: My mistake, on https://publicsuffix.org/list/ they do recommend having client apps pull directly from their site, and limiting updates to once per day.
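The "keep it fresh, but fetch at most once per day" advice above could be sketched roughly as follows. This is only an illustration: the helper names, the cache path handling, and the exact download URL in `PSL_URL` are assumptions rather than anything stated in the thread, so check https://publicsuffix.org/list/ for the current location before relying on it.

```python
import os
import time
import urllib.request

# Assumed download location (not given in the thread); verify on publicsuffix.org.
PSL_URL = "https://publicsuffix.org/list/public_suffix_list.dat"
MAX_AGE_SECONDS = 24 * 60 * 60  # refresh at most once per day


def needs_refresh(path, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if the cached copy is missing or older than max_age seconds."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        return True  # no cached copy yet
    return age > max_age


def refresh_if_stale(path, url=PSL_URL):
    """Download the list only when the cached copy is stale.

    Returns True if a download happened, False if the cache was fresh enough.
    """
    if not needs_refresh(path):
        return False
    tmp = path + ".tmp"
    with urllib.request.urlopen(url, timeout=30) as resp, open(tmp, "wb") as out:
        out.write(resp.read())
    os.replace(tmp, path)  # atomic swap so readers never see a partial file
    return True
```

A server-side app might call `refresh_if_stale("/var/cache/psl.dat")` (a hypothetical path) from a daily cron job. A client app could call it at startup and fall back to the bundled copy if the download fails.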
randomstring almost 9 years ago
This list is used to determine what domains and sub-domains "belong" together, in the sense that they are controlled and/or owned by the same entity. For instance *.google.com is all Google, so x.google.com and y.google.com can be trusted to share the same SSL key, safely share JavaScript (XSS), etc. However x.blogger.com and y.blogger.com are probably two completely separate blogs, people, domains, SSL keys, JavaScript domains, etc. And you wouldn't want to see x.blogger.com's web pages showing up in search results for y.blogger.com.

Secondly, maintaining this list is a pain. You have hundreds of two-letter top-level domains, one for each country. Each country has its own NIC in charge of sub-domains, and each NIC has the power to add or delete subdomains (check out https://en.wikipedia.org/wiki/.uk for just one of hundreds of examples). Some "countries" even sell off their top-level domain, like .nu. Then you have .us (https://en.wikipedia.org/wiki/.us), which has wildcard domains like http://vil.stockbridge.mi.us/ where the vil. is fixed and part of the domain and stockbridge.mi is the important domain information. Of course they're always adding more top-level domains: .ninja, .wtf, etc (maybe there is a .etc now?). Then you have all the blogging and hosting platforms that use personalized domains for hosting content. Many are listed in the publicsuffix list, but I'm guessing not all!

I ended up writing my own publicsuffix parser in Perl a few years back for the blekko search engine. The main purpose was to be able to group web pages together by site/owner. There is nothing quite like feeding every URL on the internet through your parser to find bugs and corner cases.
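To make the "which hosts belong together" idea concrete, here is a minimal sketch of public-suffix matching. The rule set in `SAMPLE_RULES` is a tiny hand-picked illustration, not the real list (`blogspot.com` stands in for the blogging-platform case), and all function names are hypothetical. It assumes plain, wildcard (`*.`), and exception (`!`) rules, with exceptions taking priority, the longest match winning otherwise, and an implicit `*` rule for unlisted TLDs. It does not handle internationalized names.

```python
# Illustrative sample only; a real application would load the full list.
SAMPLE_RULES = ["com", "uk", "co.uk", "blogspot.com", "*.ck", "!www.ck"]


def parse_rules(lines):
    """Split list lines into (rules, exceptions), skipping blanks and // comments."""
    rules, exceptions = set(), set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("//"):
            continue
        rule = line.split()[0].lower()
        if rule.startswith("!"):
            exceptions.add(rule[1:])
        else:
            rules.add(rule)
    return rules, exceptions


def public_suffix_length(labels, rules, exceptions):
    """Return how many trailing labels of the host form its public suffix."""
    n = len(labels)
    # Exception rules win: the suffix is the exception minus its leftmost label.
    for i in range(n):
        if ".".join(labels[i:]) in exceptions:
            return n - i - 1
    # Otherwise take the longest matching plain or wildcard rule.
    for i in range(n):
        if ".".join(labels[i:]) in rules:
            return n - i
        if ".".join(["*"] + labels[i + 1:]) in rules:
            return n - i
    return 1  # implicit "*" rule for TLDs the list does not know


def registrable_domain(host, rules, exceptions):
    """Return the public suffix plus one label, or None if host is itself a suffix."""
    labels = host.lower().rstrip(".").split(".")
    k = public_suffix_length(labels, rules, exceptions)
    if len(labels) <= k:
        return None
    return ".".join(labels[-(k + 1):])


def same_site(a, b, rules, exceptions):
    """True when both hosts share a registrable domain."""
    ra = registrable_domain(a, rules, exceptions)
    return ra is not None and ra == registrable_domain(b, rules, exceptions)


RULES, EXCEPTIONS = parse_rules(SAMPLE_RULES)
print(registrable_domain("x.google.com", RULES, EXCEPTIONS))    # google.com
print(registrable_domain("x.blogspot.com", RULES, EXCEPTIONS))  # x.blogspot.com
print(same_site("x.blogspot.com", "y.blogspot.com", RULES, EXCEPTIONS))  # False
```

With these sample rules, x.google.com and y.google.com group under google.com, while two blogs on the same hosting suffix stay separate, which is the distinction described above.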
treve almost 9 years ago
A great and important list.

It also feels a bit like a hack. Would there be a better way to do this, maybe in DNS itself, to denote ownership/isolation?
yrro almost 9 years ago
I have always wondered why this information is not stored in the DNS itself.
cdubzzz almost 9 years ago
I recently found this list and use it in a Drupal module to invoke hooks based on the domain of a URL. I didn't realize how many edge cases there are for these and originally tried to do it with regex.
jasonjei almost 9 years ago
I do hope they use the same care and due process as they do with root certificate inclusion in the public suffix list (as well as the ability to add/remove entries manually). I think this is a great move for better security, but with great power comes great responsibility.
derefr almost 9 years ago
@HN mods: this list could be used to make the domain in parens after the link more accurate, no?
DanielDent almost 9 years ago
Quick plug for my related project: https://github.com/QA2/public-suffix-metalist - pull requests welcome.
throwanem almost 9 years ago
A domain hacker's playground.