How do spammers harvest your e-mail address?

27 points by karangoeluw about 11 years ago

10 comments

minimaxir about 11 years ago
This is the third time you've posted this link in as many days. (Although it appears that your strategy worked.) Note for the future that deleting then resubmitting links is against HN rules.
birken about 11 years ago
A couple of notes that aren't related to the content of the post:

1) The graphs are not presented in a way that makes them easy to consume. The font is too small, the bars are too densely combined, the axis labels are not descriptive enough ("percentage of emails posted"), and there is no discernible ordering of the bars (alphabetical, by value, etc.). Presenting your data in a way that is easy to consume is just as important as having worthwhile data to present, because a general audience like this isn't going to struggle to parse those plots; they are just going to move on.

2) Considering you are doing data analysis with Python, you should check out pandas (http://pandas.pydata.org/). It will not only make the data easier to work with, but it will also do plotting for you with better defaults than you have chosen, and you will drastically cut down on having to write matplotlib code (a worthwhile benefit!).
slavik81 about 11 years ago
"Tech companies, like Google and Yahoo, use about 30 billion watts of electricity (1) - that's enough electricity to power 3 million houses for a year." Powering 3 million houses is a measure of power. Powering 3 million houses for a year is a measure of energy. 30 GW, however, is a measure of power.
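To put numbers on the power-versus-energy distinction, here is a small Python sketch. It uses only the two figures quoted in the comment (30 billion watts, 3 million houses) plus the assumption of a 365-day year; the function and variable names are illustrative.

```python
# Figures quoted in the comment above.
POWER_WATTS = 30e9        # 30 GW: a rate (power), not an amount
HOUSES = 3_000_000

# Assumption: a non-leap year of 365 days.
HOURS_PER_YEAR = 365 * 24  # 8760 hours


def average_power_per_house(power_watts: float, houses: int) -> float:
    """Power in watts each house would draw if the total were split evenly."""
    return power_watts / houses


def energy_over_period(power_watts: float, hours: float) -> float:
    """Energy in watt-hours delivered by a constant power draw over a period."""
    return power_watts * hours


per_house_kw = average_power_per_house(POWER_WATTS, HOUSES) / 1e3
yearly_twh = energy_over_period(POWER_WATTS, HOURS_PER_YEAR) / 1e12

print(f"{per_house_kw:.1f} kW per house (power)")
print(f"{yearly_twh:.1f} TWh over one year (energy)")
```

Read this way, "30 GW powers 3 million houses" is a power-to-power comparison that implies an average draw of 10 kW per house. Adding "for a year" multiplies in a duration and gives an energy figure (262.8 TWh under the 365-day assumption), which is a different quantity from the 30 GW being described.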
JacobAldridge about 11 years ago
Funnily enough, it was the topic of spam and building a better spam filter that first introduced me to pg's essays (and thence, to HN).

It doesn't look like his Spam page has been updated in a long time (http://paulgraham.com/antispam.html), which reflects for me the quality of spam filters now compared to 2002-2005, when most of those essays were written. Incidentally, they're a great way to learn about Bayesian filtering as well!
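For readers who have not met Bayesian filtering, here is a minimal sketch of the general idea: a naive Bayes classifier over word counts with add-one smoothing. It is not the exact formula from the essays mentioned above, and the class name, training messages, and tokenizer are invented for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class NaiveBayesFilter:
    """Toy two-class filter; assumes at least one message per class is trained."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def _log_score(self, tokens: List[str], label: str) -> float:
        counts = self.word_counts[label]
        vocab_size = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        total_words = sum(counts.values())
        # Log prior: fraction of training messages carrying this label.
        score = math.log(self.message_counts[label] / sum(self.message_counts.values()))
        for token in tokens:
            # Add-one smoothing so unseen words never give a zero probability.
            score += math.log((counts[token] + 1) / (total_words + vocab_size))
        return score

    def spam_probability(self, text: str) -> float:
        tokens = tokenize(text)
        diff = self._log_score(tokens, "ham") - self._log_score(tokens, "spam")
        # Clamp to keep math.exp from overflowing on very long messages.
        diff = max(min(diff, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


spam_filter = NaiveBayesFilter()
spam_filter.train("win cash now", "spam")
spam_filter.train("cheap pills win big", "spam")
spam_filter.train("meeting notes for monday", "ham")
spam_filter.train("lunch on monday?", "ham")

print(spam_filter.spam_probability("win cheap cash"))   # above 0.5
print(spam_filter.spam_probability("monday meeting"))   # below 0.5
```

Real filters train on far more mail and usually pick tokens more carefully (headers, URLs, and so on), but the core step is the same: compare how likely the words are under each class and combine that with the prior.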
thaumaturgy about 11 years ago
There seem to be two sources missing from your list of "platforms", based on some recent experiences (I'm a mail server admin and I put a lot of effort into tracking down and blocking spam):

1. Hotel registration. I was asked for my email address when staying at a Hyatt for the BSides Conference in SF a while back. I didn't even think twice about providing my standard email address, and within a week, started receiving a lot of extra spam. I tracked some of it down to a company that has affiliations with hotel networks, so I'm pretty sure it came from the registration process.

2. Public wifi hotspots. On this one, I dunno when or where I absent-mindedly entered my email address, but again, I followed some of the spam back to a marketing company affiliated with public hotspots. Bastards.

It's fairly persistent spam, and it's walking right past greylisting, SpamAssassin, and my usual filters for bad actors.
nedwin about 11 years ago
This is an awesome piece of research - I feel your pain that you can't get it published due to the Gmail data issue!
karangoeluw about 11 years ago
Some devices are having issues with the responsive layout. In that case, use this link: https://github.com/karan/karan.github.io/blob/master/_posts/2014-03-26-email-spam.markdown
betterunix about 11 years ago
Does it matter? Email addresses are harvested by spammers all the time. The key piece of the puzzle now is that spam filtering is advanced enough that we do not need to care. I almost never see spam in my inbox, and I almost never see ham in my junk folder.
jamesbrownuhh about 11 years ago
I must admit to not being entirely surprised that email addresses, when posted in public, get picked up by spammers. I would have liked to see answers to the harder questions - e.g. here are email addresses that we've only given to banks or other large companies; now let's see where the leaks are and investigate them.
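One way to run the experiment suggested above is to give each company its own tagged address and then check which tags turn up on spam. The sketch below assumes plus-style subaddressing (`user+tag@domain`), which many but not all mail providers and sign-up forms accept; the mailbox, domain, secret, and company names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Iterable, Optional


def tagged_address(mailbox: str, domain: str, recipient: str, secret: bytes) -> str:
    """Build a per-recipient address such as alice+<tag>@example.org.

    The tag is a keyed hash of the recipient name, so it cannot be guessed
    from the company name alone.
    """
    digest = hmac.new(secret, recipient.lower().encode("utf-8"), hashlib.sha256)
    tag = digest.hexdigest()[:10]
    return f"{mailbox}+{tag}@{domain}".lower()


def build_lookup(mailbox: str, domain: str,
                 recipients: Iterable[str], secret: bytes) -> Dict[str, str]:
    """Map every tagged address back to the recipient it was handed to."""
    return {tagged_address(mailbox, domain, r, secret): r for r in recipients}


def leak_source(to_address: str, lookup: Dict[str, str]) -> Optional[str]:
    """Return who originally received this address, or None if it is untagged."""
    return lookup.get(to_address.strip().lower())


SECRET = b"change-me"  # hypothetical key; keep the real one private
COMPANIES = ["Example Bank", "Example Hotel"]  # hypothetical recipients

lookup = build_lookup("alice", "example.org", COMPANIES, SECRET)
bank_address = tagged_address("alice", "example.org", "Example Bank", SECRET)

# Spam arriving at the bank-only address points back to the bank.
print(leak_source(bank_address, lookup))
# Mail to the plain address carries no tag, so it says nothing about the source.
print(leak_source("alice@example.org", lookup))
```

A tagged hit only shows where an address was first handed over, not how it leaked from there, and a harvester that strips the `+tag` part defeats the scheme, so a separate alias or domain per recipient would be a sturdier variant of the same idea.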
privong about 11 years ago
It would be interesting to also look at harvesting from PGP keys which have been posted to keyservers. I'm sure that's a small portion of the population, but I wonder if it is being (ab)used.