Quick reminder from your friendly local SRE: never, ever issue certificates that expire on weekends. Make certs expire in the middle of the afternoon on a business day wherever your operators live and work. The cert in question expired at 10:48:38 GMT on May 30, 2020, which smells suspiciously like a fixed offset from when the cert was generated rather than a deliberately chosen point in time.
This one bit me today and abruptly ended my day at the beach.

The certificate reseller advised my customer that it was okay to include the cross-signing cert in the chain, because browsers will automatically ignore it once it expires and use the Comodo CA root instead.

And that was true for browsers, I guess. But my customer also has about 100 machines in the field that use cURL to access their HTTPS API endpoint. cURL will throw an error if one of the certs in the chain has expired (it may depend on the order; I don't know).

Anyway, 100 machines went down and I had a stressed-out customer on the phone.
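For anyone else trying to pin down which cert in a served chain is the one that expired, here's a rough sketch with the openssl CLI (api.example.com stands in for the real endpoint, and GNU csplit is assumed):

    # Fetch every certificate the server presents and keep only the PEM blocks.
    openssl s_client -connect api.example.com:443 -servername api.example.com -showcerts </dev/null 2>/dev/null \
      | awk '/BEGIN CERTIFICATE/,/END CERTIFICATE/' > chain.pem

    # Split into one file per certificate, then print each one's subject and expiry.
    csplit -s -z -f cert- chain.pem '/BEGIN CERTIFICATE/' '{*}'
    for f in cert-*; do
      openssl x509 -in "$f" -noout -subject -enddate
    done

Any entry with a notAfter in the past is the one tripping up cURL.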
Honestly, certificates should either never expire or expire daily. If certificate revocation works, then it's pointless to have expiring certs; it's just a mechanism for CAs to seek rent.

If certificate revocation doesn't work, then certs need to expire super frequently to limit the potential damage if they're compromised.

A certificate that expires in 20 years does absolutely nothing for security compared to a certificate that never expires. Odds are that in 20 years the crypto will need to be updated anyway, effectively revoking the certificate.
Andrew Ayer has a write-up about this at https://www.agwa.name/blog/post/fixing_the_addtrust_root_expiration

At the core, this is not a problem with the server or the CA, but with the clients. However, servers have to deal with broken clients, so it's easy to point at the server and say it was broken, or to point at the server and say it's fixed, but that's not quite the case.

I discussed this some in https://twitter.com/sleevi_/status/1266647545675210753 , as clients need to be prepared to discover and explore alternative certificate paths. Almost every major CA relies on cross-certificates, some even with circular loops (e.g. DigiCert), and clients need to be capable of exploring those certificates and finding what they like. There's not a single canonical "correct" certificate chain, because of course different clients trust different CAs.

Regardless of your CA, you can still do things to reduce the risk. Tools like mkbundle in CFSSL (with https://github.com/cloudflare/cfssl_trust ) or https://whatsmychaincert.com/ help configure a chain that will maximize interoperability, even with dumb and old clients.

Of course, using shorter-lived certificates, and automating them, also helps prepare your servers, by removing the toil from configuration changes and making sure you pick up updates (to the certificate path) in a timely fashion.

Tools like Censys can be used to explore the certificate graph and visualize the nodes and edges. You'll see plenty of sites rely on this, and that means clients need to not be lazy in how they verify certificates. Or, alternatively, root stores should impose more rules on how CAs sign such cross-certificates, to reduce the risk posed to the ecosystem by these events.
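To make the "alternative paths" point concrete, here's a sketch of how you might check that a Sectigo leaf still validates against the modern root once the expired cross-cert is dropped (the file names are placeholders for certs you'd have on hand):

    # leaf.pem          - the server certificate
    # sectigo_inter.pem - the intermediate that issued it
    # usertrust.pem     - the self-signed "USERTrust RSA Certification Authority" root
    #
    # A client that can build this shorter path is fine: the expired AddTrust
    # cross-certificate never enters the picture.
    openssl verify -CAfile usertrust.pem -untrusted sectigo_inter.pem leaf.pem

Clients that only build the longer path through the AddTrust cross-cert are the ones that broke.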
Great thread by Ryan Sleevi tracking the many (and growing) reports of issues caused by this root expiring: https://twitter.com/sleevi_/status/1266647545675210753

Top offender so far seems to be GnuTLS.
This issue is largely caused by people still stuffing old root certificates into their certificate chains and serving them to their users.

As a general rule of thumb:

1) You don't need to add root certificates to your certificate chain

2) You especially don't need to add expired root certificates to the chain

For additional context, and for how to check with `openssl` which certificates you should remove from your chain, I found this post useful: https://ohdear.app/blog/resolving-the-addtrust-external-ca-root-certificate-expiration
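As a quick sketch of that check against a locally configured bundle (the path /etc/ssl/fullchain.pem is just a placeholder for wherever your web server's chain file lives):

    # Print the subject and issuer of every certificate in the configured bundle.
    # A self-signed entry (subject == issuer) is a root and shouldn't be served;
    # the expired "AddTrust External CA Root" is exactly that.
    openssl crl2pkcs7 -nocrl -certfile /etc/ssl/fullchain.pem \
      | openssl pkcs7 -print_certs -noout

If a root shows up in that output, drop it from the bundle and reload your server.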
This appears to have caused our Heroku-managed apps to go offline for 70+ minutes.

https://status.heroku.com/incidents/2034

Anyone who was already connected was able to continue accessing the sites, but new connections failed. This mostly affected web users.

Our main app server continued to crank along thankfully (also on Heroku), and that kept the mobile traffic going, which is 90% of our users.

Edit: adding Heroku ticket link
I have never really wanted to go "serverless" until today.

TIL that I can buy a cert that expires in a year that is signed by a root certificate that expires sooner. Still not sure WHY this is the case, but it definitely is.
Yep. Got woken up early today for this. We renewed our cert about a month and two days ago. Namecheap, the vendor, sent us the bad AddTrust cert in the bundle. They didn't update their bundles until two days after we renewed the cert.
Datadog failed this morning because of a root CA issue [0]. Was a fun Saturday morning with 5000 alarms blowing up my phone.

[0] https://status.datadoghq.com/incidents/6bqpd511nj4h
Stripe webhooks are currently failing "for some users": https://twitter.com/stripestatus/status/1266756286734938116 -- some chance that's related.

Edit: for https://www.circuitlab.com/ we saw all Stripe webhooks failing from 4:08am through 12:04pm PDT today with "TLS error". Since 12:04pm (5 minutes ago), some webhooks are succeeding and others are still failing.

Edit 2: since 12:17pm all webhooks are succeeding again. Thanks Stripe!
I was wondering why Lynx started spouting some nonsense:

    $ lynx -dump https://wiki.factorio.com/Version_history
    Looking up wiki.factorio.com
    Making HTTPS connection to wiki.factorio.com
    SSL callback:certificate has expired, preverify_ok=0, ssl_okay=0
    Retrying connection without TLS.
    Looking up wiki.factorio.com
    Making HTTPS connection to wiki.factorio.com
    SSL callback:certificate has expired, preverify_ok=0, ssl_okay=0
    Alert!: Unable to make secure connection to remote host.
    lynx: Can't access startfile https://wiki.factorio.com/Version_history
Are we going to experience the same bug next year for all Let's Encrypt certificates when DST Root CA X3 expires? I guess modern devices could cope with Let's Encrypt issuing directly from their own modern ISRG Root X1, but would that leave legacy clients completely stranded (iOS <10, older versions of Windows and Android...)?
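One rough way to see which root a given site's chain currently runs up to (example.org is a placeholder; the exact output format varies a little between openssl versions):

    # Print the subject (s:) and issuer (i:) of each certificate the server sends.
    # If the last issuer is "DST Root CA X3" you're on the older cross-signed path;
    # if it's "ISRG Root X1" you're already on Let's Encrypt's own root.
    openssl s_client -connect example.org:443 -servername example.org -showcerts </dev/null 2>/dev/null \
      | grep -E ' [si]:'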
Some users on Safari (probably old versions) appear to be getting bad cert warnings for https://www.playsaurus.com. REALLY glad I found this post here, it was driving me nuts.
CloudAMQP (managed RabbitMQ) was affected: https://status.cloudamqp.com/

Caused us some connection issues that required a restart of both our clients and the RabbitMQ cluster.
This just hit me via Debian's 'apt-get update': I'm using Jitsi's package repository, which is hosted via HTTPS and seems to rely on the expired root CA. Certificate checks started failing for everybody a few hours ago [1].

That's quite bad, as I tried to do a clean re-install of jitsi-meet, and now I have no installation at all anymore.

[1] https://github.com/jitsi/jitsi-meet/issues/6918
A bit of an aside, but:

"While Android 2.3 Gingerbread does not have the modern roots installed and relies on AddTrust, it also does not support TLS 1.2 or 1.3, and is unsupported and labelled obsolete by the vendor."

"If the platform doesn't support modern algorithms (SHA-2, for example) then you will need to speak to that system vendor about updates."

I find things like that really, really irritating. Crypto is basically maths, and a very pure form at that, so it should be one of the most portable types of software in existence. Computers have been doing maths since before they were machines. Instead, the forced-obsolescence bandwagon has made companies take this very pure and portable technology and tie it to their platform's versions, using the "security" argument to bait and coerce users into taking other unwanted changes, and possibly into replacing hardware that is otherwise functional (and, as mentioned earlier, perfectly capable of executing the relevant code), along with all the ecological impact that has. Adding new root certificates, at least on PCs, is rather easy thanks to how portable they are, but I wish the same could be said of crypto algorithms/libraries.
Thankfully our uptime services spotted this earlier in the week. I'm terrible with certs, so I have no idea why a cert we bought this year is even using this root CA.
To be honest, things like Let's Encrypt, or cloud services that manage SSL for you, are a great help.
Fairly certain this affected Kroger. My sister called me this morning asking me to troubleshoot why her laptop was warning of an unsecured connection.

Perhaps a coincidence, but it's also likely their cert chain was hit by this.
We had our CI systems fail today because of this. They were running Ubuntu 16.04. Check the thread below; they say an OpenSSL bug is also a contributing factor. Removing the expired root CA fixed the issue for me (edit: removed from the clients), as sketched below.

https://www.reddit.com/r/linux/comments/gshh70/sectigo_root_ca_expiring_may_not_be_handled_well/
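For reference, the client-side fix on Debian/Ubuntu boxes was roughly the following; this is a sketch assuming the stock ca-certificates package, so double-check the exact entry name in /etc/ca-certificates.conf on your system:

    # Deselect the expired AddTrust root ("!" marks an entry as excluded) and
    # regenerate /etc/ssl/certs so OpenSSL 1.0.x stops building chains through it.
    sudo sed -i 's|^mozilla/AddTrust_External_Root.crt|!mozilla/AddTrust_External_Root.crt|' /etc/ca-certificates.conf
    sudo update-ca-certificates -f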
Everything is fine with PKI and SSL certificates themselves. It was a bug in OpenSSL 1.0.1 / 1.0.2 in dealing with a root CA that has been cross-signed twice. It is fixed in 1.1.1, but those older versions are still the default on RHEL 6/7, CentOS 6/7, and even Ubuntu 16.04.

I think a large portion of online communications was affected today.
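If you can't upgrade OpenSSL or clean up the trust store right away, one thing that may help at verification time is a sketch like the following (file names are placeholders, and it assumes your 1.0.2 build exposes the -trusted_first option):

    # Check which OpenSSL you're actually linking against.
    openssl version

    # On 1.0.2, -trusted_first tells the verifier to prefer certificates from the
    # local trust store over the expired cross-signed cert supplied by the server,
    # which is roughly the behavior 1.1.1 applies by default.
    openssl verify -trusted_first -untrusted served_chain.pem leaf.pem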
We had to get an entirely new certificate to resolve this. We had recently migrated our Docker images to be based on Amazon Linux 2, and lo and behold, we found no easy way to upgrade to the required version of OpenSSL on Amazon Linux 2. It was easier to just replace our certificates.
I've maintained some high-level notes on this event, and on the problems and fixes, here: https://gist.github.com/minaguib/c8db186af450bceaaa7c452b76a9901b
ip-api.com was also affected by this.
After our first alert at 10:49 (the cert expired at 10:48:38) and a minute of being puzzled as to why our certificate had expired, we realized the root we had bundled was the issue. We finished updating our primary API servers at 10:55.
I can't count the number of times I told the CA this would be an issue. And every single time, they replied that there would be no issue. Damn, I hate CAs like Comodo.
Surely the current CA paradigm shouldn't continue to be accepted by the people who keep infrastructure running?

We need to do something.