
Ask HN: On crawling email ids

3 points by amolgupta about 10 years ago
People use the format abc[at]companyname[dot]com to keep crawlers from harvesting their email ids when posting on public forums. How effective is that? Can't the crawlers just change their regex to parse these patterns as well?

4 comments

27182818284 about 10 years ago
Yes, and more to the point, it is trivially easy. A Stanford class had you handle the [at] case in an early assignment on writing a spam bot a few years ago. http://www.google.com/recaptcha/mailhide/apikey might be more effective, but the best thing is just to have a great spam filter and accept that someone will guess or find your email rather than trying to hide it.

Remember that an email address can spread in more ways than through crawlers, too. Sign up for a grocery store discount card with it? It is in some for-sale database somewhere.
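
To make the "trivially easy" point concrete, here is a minimal sketch of how a crawler might undo the [at]/[dot] convention. The pattern and the `deobfuscate` name are assumptions for illustration only, not taken from any real harvester, and the sketch covers just this one convention.

```python
import re

# Assumed pattern (illustrative only): "user[at]domain[dot]tld", with
# optional spaces around the bracketed tokens.
OBFUSCATED = re.compile(
    r"([A-Za-z0-9._%+-]+)\s*\[at\]\s*"
    r"([A-Za-z0-9-]+(?:\s*\[dot\]\s*[A-Za-z0-9-]+)+)",
    re.IGNORECASE,
)


def deobfuscate(text):
    """Return addresses recovered from [at]/[dot] obfuscation in text."""
    found = []
    for local, domain in OBFUSCATED.findall(text):
        # Turn every "[dot]" (and any spaces around it) back into ".".
        domain = re.sub(r"\s*\[dot\]\s*", ".", domain, flags=re.IGNORECASE)
        found.append(f"{local}@{domain}")
    return found


print(deobfuscate("reach me at abc[at]companyname[dot]com"))
```

One regex and one substitution are enough for this format, which is why a good spam filter is probably the more dependable defense.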
wglb about 10 years ago
Pure speculation here: crawlers go for bulk and likely don't care if they pick up garbage. There may be another level of email harvesting that goes as far as you suggest, but given all the conventions in use like the one you show, such code would have to cover lots of them. The return might be very low.
jrs235 about 10 years ago
I would assume some crawlers are configured to find and scrape email addresses using that (now) "de facto" form and similar ones. With that said, I recently saw someone use a form similar to:

Email me at abc shift+2 key companyname period com

which, until that becomes more common, might offer better protection from scrapers.
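
As a rough sketch of why a rarer form only helps until it becomes common, a scraper could keep a table of known conventions and normalize text before matching. Everything here is hypothetical (the `SUBSTITUTIONS` table, the `harvest` function, and the patterns themselves); it only illustrates that each new convention costs the scraper one more table entry.

```python
import re

# Hypothetical table: each entry maps one obfuscation convention to the
# character it stands for. A spelled-out form is invisible to the scraper
# until someone adds an entry for it.
SUBSTITUTIONS = [
    (r"\s*\[at\]\s*", "@"),
    (r"\s*\[dot\]\s*", "."),
    (r"\s+shift\+2(?:\s+key)?\s+", "@"),
    (r"\s+period\s+", "."),
]

# Deliberately simple address pattern, good enough for a sketch.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def harvest(text, substitutions=SUBSTITUTIONS):
    """Normalize known obfuscations, then extract plain addresses."""
    for pattern, replacement in substitutions:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return EMAIL.findall(text)


sample = "Email me at abc shift+2 key companyname period com"
print(harvest(sample, SUBSTITUTIONS[:2]))  # bracket rules only
print(harvest(sample))                     # with the spelled-out rules added
```

With only the bracket rules the sample yields nothing; once the two spelled-out rules are in the table, the address is recovered. Broad rules like the one for "period" also rewrite ordinary prose, so a real scraper would pick up more garbage as the table grows.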
HaseebR7 about 10 years ago
Just take a screenshot of the text, like this:

http://i.imgur.com/NF2nqbO.png