Disclosure (I used the right word!): I work at AWS. Opinions are my own.<p>This is a nice article with some fun comparisons and important details around the specifics of operating TiDB and systems like it on AWS. If I were running a replicated database such as TiDB, hypothetically of course, I'd probably opt for an instance type with locally attached storage. EBS volumes are already replicated themselves for durability, so if you're running a replicated DB on EBS volumes then you're going to be replicating while you're replicating. Yo dawg. That's going to be some extra latency, even with a majority quorum replication protocol like Raft.<p>Historically, if you wanted locally attached SSDs, the more expensive storage instance types like i2/i3/i3en were all you had, but now there's m6gd, c6gd, and r6gd[0]. Lots of options at all sorts of price points for your workload. The Graviton2-based m6gd instances with local NVMe SSDs are more cost-effective than the m5 instances that the blog post is using, which do not have a local SSD.<p>As the post calls out, if you are going to use EBS then use it for the materialized key-value store portion. Maybe even make a little Step Function to snapshot the EBS volumes so you can figure out where it's safe to trim the Raft log. It could be nice.<p>0 - <a href="https://aws.amazon.com/about-aws/whats-new/2020/07/announcing-new-amazon-ec2-instances-powered-aws-graviton2-processors/" rel="nofollow">https://aws.amazon.com/about-aws/whats-new/2020/07/announcin...</a>
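To make the "extra latency, even with a majority quorum" point concrete, here is a rough sketch in Python. It is a simplified model, not how TiDB or any particular Raft implementation measures things: it assumes an entry commits once a majority of replicas (leader included) have persisted it, and every number (microseconds, the 1000 µs storage penalty) is made up for illustration.

```python
def quorum_commit_latency_us(leader_write_us, follower_ack_us):
    """Time until a majority of replicas (leader included) have persisted an entry.

    Simplified model: the leader's time is its local write; each follower's
    time is network round trip plus its own write. All inputs are hypothetical.
    """
    times = sorted([leader_write_us] + list(follower_ack_us))
    majority = len(times) // 2 + 1
    # The entry commits when the majority-th fastest replica has persisted it.
    return times[majority - 1]


def with_storage_penalty(latencies_us, penalty_us):
    """Add a fixed per-write storage penalty to every replica (assumed value)."""
    return [t + penalty_us for t in latencies_us]


# Three replicas on local disks: one follower is slow, but the quorum hides it.
local = quorum_commit_latency_us(200, [700, 5000])

# Same cluster, but every write pays an extra 1000 us at the storage layer.
penalized = quorum_commit_latency_us(
    200 + 1000, with_storage_penalty([700, 5000], 1000)
)

print(local, penalized)  # 700 1700
```

The quorum masks the single slow follower, but a cost paid by every replica's write path passes straight through to commit latency.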
A less clickbait-y title would be, "How to Run Our Database on AWS at a Reasonable Cost".<p>"Reasonable Cost" comes from the conclusion, and I omitted "Better Performance" since they never compare to alternatives (although they do mention Aurora).
The part that's not mentioned is that there's no real latency guarantee on these IOPS. If you're running a database that has to do a few data-dependent reads (e.g. walking a B-tree), then these can add up quickly if latency spikes to dozens of milliseconds (see e.g. this Datadog blog post [0]).<p>[0] <a href="https://www.datadoghq.com/blog/aws-ebs-latency-and-iops-the-surprising-truth/" rel="nofollow">https://www.datadoghq.com/blog/aws-ebs-latency-and-iops-the-...</a>
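A back-of-the-envelope sketch of why dependent reads hurt: each level of a B-tree walk has to finish before the next read can be issued, so the per-read latencies add rather than overlap. The tree depth and the millisecond values below are assumptions picked for illustration, not measurements of EBS.

```python
def lookup_latency_ms(per_level_ms):
    """Total latency of a lookup whose reads must happen one after another.

    per_level_ms: latency of the read at each B-tree level, root to leaf.
    The reads are data dependent, so the total is a plain sum.
    """
    return sum(per_level_ms)


# Hypothetical 4-level tree where every read takes 1 ms.
steady = lookup_latency_ms([1, 1, 1, 1])

# Same walk, but one read hits a 30 ms latency spike.
one_spike = lookup_latency_ms([1, 30, 1, 1])

# Worst case in this toy model: every level spikes.
all_spike = lookup_latency_ms([30, 30, 30, 30])

print(steady, one_spike, all_spike)  # 4 33 120
```

A single spike anywhere in the chain dominates the whole lookup, and the IOPS figure alone says nothing about how often that happens.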
I’m not sure how this could be cheaper than RDS...? Nor do I see how the complexity of this solution could be justified except in the most performance sensitive situations.
Given the post date, I'm quite surprised they didn't use the new Graviton2 instances.
I'm afraid the post was already obsolete on this point when it was published.