TL;DR: I recently started getting into machine learning, and for fun I’m building a bot to filter out startup engagement emails. I need training data.<p>--<p>I don’t know about you, but I’m sick of startups/SaaS companies emailing me to “check in” after I sign up.<p>I did go through a period of replying bluntly with “Please, for the love of god, stop emailing me.” or sending them iCal links that actually take them to a rickroll, but I’ve decided I’m going to attempt to solve the problem once and for all with ML!<p>I would love some help though. If you have any emails like this and want to help, post them below or forward them to alexbarlowis [at] gmail.com<p>Thanks
It's probably worth noting that email you sign up to receive is not spam.<p>If you don't think that receiving four emails in exchange for a free month of a $60/month SaaS product is a fair deal, your options include clicking the "Unsubscribe" link on that mail or not signing up for those mails in the first place.<p>Penalizing the company by marking the emails you asked them to send you as spam seems like a mean thing to do.
There's a much easier solution: the vast majority of legit auto-generated email will have a "List-Unsubscribe" email header. I filter based on that header into a 'Bulk' folder.<p>There are some exceptions, e.g. emails that go via the Drip service, but given the limited number of automated email services, you can probably catch 99% of these emails with a few rules rather than machine learning.
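For anyone who wants to try the header rule before reaching for ML, here is a minimal sketch in Python using only the standard library. The function names, the 'Inbox' folder name, and the `extra_sender_domains` parameter are my own assumptions for illustration; the extra-domain set is just one possible way to write the "few rules" for services that omit the header, and the domain in the usage example is a placeholder.

```python
from email import message_from_string
from email.message import Message
from email.utils import parseaddr


def is_bulk(msg: Message, extra_sender_domains: frozenset = frozenset()) -> bool:
    """Return True if the message looks like automated bulk mail.

    Rule 1: a List-Unsubscribe header is present (lookup is case-insensitive).
    Rule 2 (hypothetical): the sender's domain is in a hand-maintained set,
    for services that do not add the header.
    """
    if msg.get("List-Unsubscribe") is not None:
        return True
    _, address = parseaddr(msg.get("From", ""))
    domain = address.rpartition("@")[2].lower()
    return domain in extra_sender_domains


def target_folder(raw: str, extra_sender_domains: frozenset = frozenset()) -> str:
    """Pick a folder name for a raw RFC 822 message string."""
    msg = message_from_string(raw)
    return "Bulk" if is_bulk(msg, extra_sender_domains) else "Inbox"
```

Usage would look something like `target_folder(raw_message, frozenset({"mailer.example"}))`. Hooking this up to a real mailbox (IMAP, a mail client filter, etc.) is left out on purpose, since that part depends on your setup.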
I understand you're primarily interested in training data here, but I should ask.<p>Is this a problem? You're talking here about startups/SaaS companies you have signed up for sending you spam? I would expect most people don't sign up for too many of these, and it's easy enough to click the unsubscribe link. It would be worse to miss an email you'd have appreciated getting.