IBM Cloud was down, as well as their status page

268 points by whyleym almost 5 years ago

32 comments

colinbartlett almost 5 years ago

My side project StatusGator monitors status pages (including IBM's ill-fated page), and I'm seeing more than 10% of the nearly 800 services we monitor having an outage right now.

So it appears to affect anyone who depends on IBM Cloud.

ComputerGuru almost 5 years ago

So what are HNers using IBM Cloud for, and where do you see that it has an edge over AWS offerings (where an overlap exists, obviously)?

(I figure either you’re in devops and too busy putting out fires to read this thread, or you’re not and your work is halted because of the incident, so you might have time to read and reply ;)

blantonl almost 5 years ago

All of Broadcastify's audio servers (hosted with Softlayer in their Dallas datacenter) are completely unreachable and down.

I'm going to wait a bit to see if we get a status update; otherwise we'll be spinning up instances on AWS to fail over (which will be enormously costly for bandwidth).

No status, no nothing, we're in the dark.

Fordec almost 5 years ago

I remember I was at an IBM-sponsored hackathon around 2015 where it was a requirement to use Bluemix. Over the course of the weekend the service went down for hours 3 times.

Literally this morning I was wondering what ever happened to it, like did it die a quiet death? Oh, it rebranded to IBM Cloud in 2017. Now this news.

I think there's an eponymous law named for this sort of thing.

vmh1928 almost 5 years ago

In the Cloud Status History page, scroll down to the 6:32 entry that says "Unable to Access IBM Cloud":

https://cloud.ibm.com/status?selected=history

- 2020-06-10 02:19 UTC - RESOLVED - The network operations team adjusted routing policies to fix an issue introduced by a 3rd party provider and this resolved the incident

voz_ almost 5 years ago

I generally do everything on AWS or GCP, with a little Azure sometimes for personal projects. In what world does IBM beat one of those three in anything? Genuinely curious: how are they able to stay competitive?

blazefox69 almost 5 years ago

Fixed it for you: https://github.com/ibm-cloud-docs/overview/pull/74

caiobegotti almost 5 years ago

Honest, slightly cynical question: most probably someone on the responsible team said some day that it would be very stupid to host the status page inside the same infrastructure being monitored, but they were probably ignored... What should that person do now? Say "toldya!" out loud in the postmortem meeting, or simply shut up and move on because the reality is that we are hired to do some stupid task and not to think for ourselves?

Lyren almost 5 years ago

I received communication ~15min ago that they're actively looking into the issue. I submitted the ticket roughly 20min ago. So it seems they're aware.

It doesn't help that their status page is also hosted on IBM Cloud.

whyleym almost 5 years ago

Found this from a user on Twitter: "Our status page for IBM Aspera is on StatusPage, so you can track here as a bank shot: https://status.aspera.io"

gatvol almost 5 years ago

Well if they cannot foresee this eventuality, what else are they missing under the hood?

julianeon almost 5 years ago

Seems pretty dumb to host a status page in a way that it could go down, when it should be a static page that is trivially hosted on CDNs worldwide.
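To make the "static page" idea concrete, here is a minimal sketch in Python, not a description of how IBM or any provider actually builds its status page. It renders component states into a single self-contained HTML string that could then be uploaded to hosting independent of the monitored infrastructure. The function name `render_status_page`, the state strings, and the sample component names and timestamp are all hypothetical.

```python
from html import escape


def render_status_page(components, updated_at):
    """Return a self-contained HTML status page as a string.

    components: mapping of component name -> state string. The states
    ("operational", "degraded", "down") are assumptions for this sketch.
    updated_at: human-readable timestamp string supplied by the caller.
    """
    # The headline is "all clear" only if every component is operational.
    all_ok = all(state == "operational" for state in components.values())
    headline = "All systems operational" if all_ok else "Service disruption"

    # Escape names and states so arbitrary text cannot inject markup.
    items = "\n".join(
        f"    <li>{escape(name)}: {escape(state)}</li>"
        for name, state in sorted(components.items())
    )
    return (
        "<!doctype html>\n"
        "<html>\n"
        '<head><meta charset="utf-8"><title>Status</title></head>\n'
        "<body>\n"
        f"  <h1>{headline}</h1>\n"
        f"  <p>Last updated: {escape(updated_at)}</p>\n"
        f"  <ul>\n{items}\n  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Placeholder data; the component names and timestamp are made up.
    print(render_status_page(
        {"Compute": "operational", "Networking": "down"},
        "2020-01-01 00:00 UTC",
    ))
```

Because the output is a plain file with no server-side logic, it can be pushed to any number of unrelated hosts, which is the property the comment above is asking for.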

sky_rw almost 5 years ago

The most infuriating thing about this is the ZERO communication coming out of IBM Cloud. No emails. No updates to Twitter. Status page down. Support lines clogged.

At least give me something I can point my customers at to show them this is not due to my incompetence.

shaabanban almost 5 years ago

Also still no communication from IBM that anything is wrong.

akerro almost 5 years ago

Haha, Amazon had the same problem a few years ago when they had a fire in a datacenter: their status checker page was hosted in the same building and was showing everything was fine, while 1000s of websites hosted on AWS were down.

shaabanban almost 5 years ago

wonder if we'll ever get a post-mortem about this... Seems to be global

thephyber almost 5 years ago

How sure are we that this outage is limited to IBM Cloud?

Pingdom[1] had a spike of website outages from 11k => 27k.

[1] https://livemap.pingdom.com/

AaronFriel almost 5 years ago

Ah, is this the exception that proves the rule that "no one was ever fired for buying IBM"?

Sorry to be glib, I'm sure it's a tough time for people who were sold on their cloud platform and work on it!

Operyl almost 5 years ago

Yup .. hit us pretty badly. Our account manager doesn't know either.

homeglue almost 5 years ago

I've seen multiple services get affected this morning including Sendgrid, Nexmo and Up bank, all at the same time. Wondering if this is related.

leetrout almost 5 years ago

Hugops.

Hope they get a root cause and a quick fix. I’m not a fan of their cloud service but I know people working on the outage and fix are stressed.

kitteh almost 5 years ago

About a month ago their Northern Virginia region was down. All the BGP prefixes associated with it disappeared from the internet (routes withdrawn). This time (I went to check when someone mentioned it) they kept advertising, but all traffic went nowhere once it got into their network. Curious to see if there is an RFO released.

nonines almost 5 years ago

This looks related (smoking gun?): https://status.aspera.io/incidents/t9r03x71dxkl

> A 3rd party network provider was advertising routes which resulted in our WW traffic becoming severely impeded.

stevehawk almost 5 years ago

guess they didn't learn from AWS and hosting their status pages (in particular their icons) in S3

bantec almost 5 years ago

It’s the second significant issue in the last year with IBM; absolutely inconsistent for critical infrastructure (we are FinTech)

cerw almost 5 years ago

Been like that for the last 1h. Network from Sydney (GCP) to Sydney (IBM): 62% packet loss

ck2 almost 5 years ago

even weather.com was down but someone broke ebay too

    Fastly error: unknown domain: www.ebay.com. Please check that this domain has been added to a service.

pmarreck almost 5 years ago

Imagine hosting your status page on a different domain

nadavami almost 5 years ago

It seems like the status page just came back up.

woakas almost 5 years ago

Our site (ubidots.com) is not completely down, but the IBM network has high latency.

someguy12321 almost 5 years ago

heads be rolling tomorrow!

anon102010 almost 5 years ago

A quick check of Cloudflare's isbgpsafeyet page:

IBM Cloud - unsafe

At least AWS signs their routes, I think.

If you can't even sign your own routes - hard to have a ton of pity.
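For readers unfamiliar with what "signing routes" buys you, the following is a rough sketch of RPKI route origin validation in the style of RFC 6811, not a description of what IBM, AWS, or Cloudflare actually run. A signed Route Origin Authorization (ROA) says which AS may originate a prefix and how specific the announcement may be; a router then classifies each announcement as valid, invalid, or not-found. The `ROA` class and `validate_origin` function are hypothetical names, and the prefixes and AS numbers come from the ranges reserved for documentation.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class ROA:
    """A simplified Route Origin Authorization (illustrative only)."""
    prefix: str       # authorized prefix, e.g. "203.0.113.0/24"
    max_length: int   # longest prefix length the origin may announce
    origin_asn: int   # AS number allowed to originate the prefix


def validate_origin(route_prefix, origin_asn, roas):
    """Classify an announcement as "valid", "invalid", or "not-found"."""
    route = ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # Skip ROAs of the other IP version or that do not cover the route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A covering ROA matches when the origin AS agrees and the
        # announcement is no more specific than max_length allows.
        if roa.origin_asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none: invalid. No coverage: not-found.
    return "invalid" if covered else "not-found"


if __name__ == "__main__":
    # Documentation prefix and AS numbers; not real allocations.
    roas = [ROA("203.0.113.0/24", 24, 64500)]
    print(validate_origin("203.0.113.0/24", 64500, roas))   # valid
    print(validate_origin("203.0.113.0/24", 64501, roas))   # invalid: wrong origin
    print(validate_origin("203.0.113.0/25", 64500, roas))   # invalid: too specific
    print(validate_origin("198.51.100.0/24", 64500, roas))  # not-found
```

Note that signing only helps if other networks drop announcements classified as invalid; an unsigned prefix simply comes back as not-found and is typically accepted.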
