I have built APIs in the finance realm with 100% uptime. I have also used Stripe in the past, so I wonder: why can't you achieve 100% uptime for your users? Are there regulatory constraints that prevent you from designing such a system?<p>You could break up your transaction API into two parts: a front-facing API that simply accepts a transaction and enqueues it for processing, and a backend API that actually performs the transaction in the background. The front-facing API should have low complexity and rarely change. It can persist transactions in a KV store like Cassandra to maximize availability.<p>The backend API that performs the transaction can have higher complexity and can afford lower availability. From the client's perspective, you could respond either with success (HTTP 200) or with accepted (HTTP 202). In either case the client will be happier than if the transaction failed outright.<p>I am sure your engineers have put a lot of thought into designing this system, but 24 minutes of downtime is unacceptable in the finance domain unless you expect your users to retry failed transactions, which defeats the point of using Stripe.<p>Edit: Can someone explain why I am being downvoted? Rather than downvoting, can you provide arguments that make sense?
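<p>To make the two-part split above concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how Stripe works: the names `accept_transaction`, `process_pending`, `store`, and `pending` are made up, and an in-memory dict and queue stand in for the durable store and queue a real system would need.

```python
import queue
import uuid

# Hypothetical in-memory stand-ins. A real system would need a durable,
# replicated store (the comment suggests Cassandra) instead of a dict.
store = {}                # transaction id -> record
pending = queue.Queue()   # ids awaiting background processing


def accept_transaction(payload):
    """Front-facing API: persist, enqueue, acknowledge. No business logic."""
    tx_id = str(uuid.uuid4())
    store[tx_id] = {"payload": payload, "status": "accepted"}
    pending.put(tx_id)
    # 202 Accepted: the request was recorded but has not been performed yet.
    return 202, {"id": tx_id, "status": "accepted"}


def process_pending(charge):
    """Background side: drain the queue and perform each transaction.

    `charge` is a hypothetical callable that does the real work and may raise.
    """
    while True:
        try:
            tx_id = pending.get_nowait()
        except queue.Empty:
            return
        record = store[tx_id]
        try:
            charge(record["payload"])
            record["status"] = "completed"
        except Exception as exc:
            # The client already received "accepted", so the failure can
            # only be reported later, via the stored status.
            record["status"] = "failed"
            record["error"] = str(exc)
```

With this shape the front-facing function only writes and enqueues, so it can stay small and stable, while `process_pending` can be redeployed or fail without rejecting new requests. The client would look up the final outcome by the returned `id`.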