I took a survey course in grad school a few years ago. Riak, HBase, and CouchDB were the shiny new things. I kinda lost track after school, but want to check back in on the latest.<p>What is the current state of the art? Is there a book I can read up on this, or better yet, some academic course/offering that covers it? I'm mostly looking for the design decisions, algorithms, and data structures these databases use. Is the Kleppmann book (DDIA) slightly out of date now, or still very much relevant?<p>Thank you!!
The big change is Jepsen, <a href="https://jepsen.io" rel="nofollow">https://jepsen.io</a><p>CAP tradeoffs are better documented.<p>And there is more to go on than marketing claims.<p>Also, SQL is the new NoSQL.
For internals and technical aspects, check out CMU's Database Systems lectures on YouTube. They also invite developers from new databases to explain their main ideas.
Check out NDB Cluster (RonDB) and its creator's blog. One example is this post on designing a thread pipeline, which argues it can be more efficient than Redis/ScyllaDB-style per-core sharding: <a href="http://mikaelronstrom.blogspot.com/2021/03/designing-thread-pipeline-for-optimal.html" rel="nofollow">http://mikaelronstrom.blogspot.com/2021/03/designing-thread-...</a>
Snowflake is a pretty good MPP database, but it's a managed service.<p>A few advantages over traditional MPP databases:<p>1. You can clone the prod DB for testing at no additional cost.<p>2. Time travel. No need to take manual backups.<p>3. Good integration with AWS S3.<p>4. Can scale horizontally and vertically on demand.
YouTube runs an in-house database called Procella. Its feature set is pretty amazing. Some of the devs behind it came from the Hadoop world. Google published a paper on its architecture.<p>Kleppmann's book is still a good read. A lot of the concepts are evergreen.