TechEcho

7 comments

danpalmer almost 2 years ago

A good writeup, but quite shocking that this managed to happen in the first place. I'd have expected that an email service provider would have very good monitoring on deliverability and failure reasons on both sending and receiving, and that something like a cloud migration would be done very incrementally to ensure no loss of service.

For this particular issue I would have expected some or all internal email at HEY! to be moved before any customers so that the new system could be tested.

Email is notoriously finicky when it comes to networks, IPs, the cryptography involved, and all sorts of details that are in flux during a cloud migration, and it's also notorious for being difficult to recover from if you accidentally get your email listed in denylists.

dpcx almost 2 years ago

I'm glad that they posted a "miss" - but this reads over and over like a sales pitch:

- I created a card in <X> Basecamp
- Someone posted a message in Campfire
- We have our own encryption
- Another message posted in a different Campfire
- Oh, this one uses custom categories!
- Todo's in Basecamp project

I get it, 37signals dogfoods their system. What we don't normally see from other posts is that person/company X posted in slack and made a ticket in jira and then created a todo on their trello board.

Maybe I'm being too cynical...


llm_nerd almost 2 years ago

I'm a little surprised this was published. It is hard to sound charitable when writing something like this but it was such a trivial, obvious fault (moving an email system and then SPF starts failing) that normally things like this are embarrassingly swept under the rug. Generally that is probably the best path.

While I appreciate the transparency and it's a great write-up, at the same time somehow I leave the post with a worse opinion of 37signals.

LeonM almost 2 years ago

> Senior SRE Paul Shuvashish first noticed that these emails weren’t failing DKIM but SPF. [...] This pointed out a flaw in our application-level analysis system: we were assimilating DMARC errors – which can be either because of SPF or DKIM – to DKIM errors. So while the app was doing the right thing nevertheless – marking the email as spam – the insight it was collecting internally was misleading.

I don't agree with 'the app was doing the right thing' here: for DMARC alignment (a DMARC pass) you need SPF or DKIM alignment. One of the two is enough.

So an email from a domain with DMARC enabled that passes DKIM, but fails SPF should pass. The application should not have rejected the email based on SPF, when it was actually DKIM aligned.
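To make the rule above concrete, here is a minimal sketch of the DMARC pass logic: a message passes if either SPF or DKIM both verifies and is aligned with the From: domain. The names `AuthResult`, `dmarc_passes` and `failed_mechanisms` are hypothetical and purely illustrative; this is not how HEY's application is implemented, and it leaves out real-world details such as strict versus relaxed alignment and policy handling.

```python
from dataclasses import dataclass


@dataclass
class AuthResult:
    """Outcome of one authentication mechanism (hypothetical helper type)."""
    passed: bool   # the SPF or DKIM check itself succeeded
    aligned: bool  # the authenticated domain matches the From: domain


def dmarc_passes(spf: AuthResult, dkim: AuthResult) -> bool:
    """DMARC passes when at least one mechanism both passes and is aligned."""
    return (spf.passed and spf.aligned) or (dkim.passed and dkim.aligned)


def failed_mechanisms(spf: AuthResult, dkim: AuthResult) -> list[str]:
    """Name each mechanism that did not yield an aligned pass.

    Recording SPF and DKIM separately avoids labelling every DMARC
    failure as a DKIM failure.
    """
    failed = []
    if not (spf.passed and spf.aligned):
        failed.append("spf")
    if not (dkim.passed and dkim.aligned):
        failed.append("dkim")
    return failed


# The case from the comment: SPF fails, DKIM passes and is aligned.
spf = AuthResult(passed=False, aligned=False)
dkim = AuthResult(passed=True, aligned=True)
print(dmarc_passes(spf, dkim))       # True
print(failed_mechanisms(spf, dkim))  # ['spf']
```

Under this rule, a message is only a DMARC failure when both mechanisms appear in the list returned by `failed_mechanisms`.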

lijok almost 2 years ago

Fixed an SPF issue by mucking around with SNAT rules. I think this is not the last time we'll see HEY's emails going to spam.

wordyskeleton almost 2 years ago

As someone that works in a team with minimal collaboration software overhead—is there a ton of bloat in their process (Basecamp this, Campfire that, etc.) or is that just the reality of modern software development?


AJRF almost 2 years ago

> And this is not just a guideline; we built a new encryption technology

But...the old adage!

A Friday Email Incident
