I wish there were authoritative books or papers on how to build object stores like S3, or software-defined storage in general. It looks like object stores in public clouds have been so successful that few companies or research groups work on their own. Yes, I'm aware that we have systems like MinIO and Ceph, but so many questions are left unanswered: how to achieve practically unlimited throughput like S3 does; what kind of hardware would be optimal for S3's workload; how to optimize for the large scans incurred by analytics workloads, which S3 is really good at; how to support strong consistency like S3 does without visibly impacting system performance, even though S3 internally must have a metadata layer, a storage layer, and an index layer; how to shrink or expand clusters without impacting user experience; how to write an OSD that squeezes out every bit of hardware performance (vitastor claims to, but there aren't many details); and the list goes on.
“A free book”, aka a book by a database vendor that wants to skew the premise of this-is-just-a-discussion-on-performance-at-scale-vendor-agnostic toward that very vendor. Nothing is free. It’s all marketing.
Beautiful. Been aiming to learn how to scale MySQL databases so I can run my apps on VMs without having to use managed DBs like Aurora or Azure Managed Database.
Why don’t more companies / startups choose ScyllaDB rather than Postgres or MySQL?<p>ScyllaDB is a C++ version of Cassandra, so it looks like speed and scalability are a clear advantage over the Java-based Cassandra, and Discord is using ScyllaDB at scale too.
Haven't had time to look it over. For people who did: is this a generic 'Database performance' book or a longform pamphlet for ScyllaDB?
I'm not sure how this qualifies as open source when the repo for the book[0] is essentially empty?<p>[0] <a href="https://github.com/Apress/db-performance-at-scale">https://github.com/Apress/db-performance-at-scale</a>
Is it just me, or is the <i>basic</i> knowledge of database normalization and indexing kind of a lost art these days?<p>How many teams out there just "add cache and more hardware" when no one ran an EXPLAIN on that one core query in a giant table?<p>Perhaps we should do that before talking about anything "at scale".
> Noticing that a certain NoSQL database was recently trending on the front page of Hacker News, Patrick picked it for his backend stack<p>I feel attacked! :D