Memory safety is something that needs to be mentioned. I was integrating DuckDB into a project and ended up ripping it out after running into memory corruption issues in practice. Upon investigation, I found they had a massive number of fuzzer-found bugs on their GitHub. While I am glad they are fuzzing and finding issues, I cannot ship that onto customer systems.<p>We have a few very good memory-safe programming languages at this point. Please do not start a project in C/C++ unless you are truly exceptional and understand memory management and exploitation inside and out. I switched to SQLite on the project since it is one of the more heavily fuzzed applications out there that fit the need. The next embeddable database I use (bonus if it works in the cloud) will need to be written in a memory-safe language.
I've been building a PostgreSQL extension over the last few months for some functionality that was needed, and I have learned a ton about the internal workings of this database. It's all very scary and complicated-sounding stuff, but I feel privileged to be able to do this because the things you learn are just pure gold. My attitude before this was that of the ideal customer of a cloud database: someone who was scared of SQL and preferred to hide behind the complexity of an ORM. Not anymore. Now I write thousands of lines of SQL and laugh to myself like a maniac.
Is it just me, or does Ed Huang skip over the most important part of database design: <i>actually making sure the database has stored the data</i>?<p>I read to the end of the article, and while having a database as a serverless collection of microservices deployed to a cloud provider <i>might</i> be useful, it will ultimately be useless if this swarm approach doesn't give me any guarantees about how or whether my data actually makes it onto persistent storage at some point. I was expecting a discussion of the challenges and pitfalls involved in ensuring that a cloud of microservices can concurrently access a common data store (whether that's a physical disk on a server or an S3 bucket) without stomping on each other, but that seemed to be entirely missing from the post.<p>Performance and scalability are fine, but when it comes to databases, they're of secondary importance to ensuring that the developer has a good understanding of when and whether their data has been safely stored.
I would also add that databases in the 2020s will be written in Rust rather than C/C++. The safety guarantees Rust provides make the development process faster and result in clean code that is easier to understand and extend.
I don't agree with the premise that running transactional and analytical workloads on the same database is architecturally “simpler”. In my experience this is only true at very low scale, and those contexts are already sufficiently well served by existing database tech.
I'm working on a new Git-backed, file-based database for knowledge bases. It's not designed for domains where you can confidently predict your schemas ahead of time, but for use cases where you have large, complex schemas that change frequently.<p>It simply wasn't possible a few years ago (SSDs weren't fast enough and Git was too slow for projects with a huge number of files).<p>I'm having fun with it.<p>The current version is written in JavaScript, but if demand picks up I would likely write a version in Rust or Go.<p>If anyone has any pointers to similar projects, I'm all ears.
Interesting read! I would like to add:<p>* databases still need to get better at schema management and workload isolation to enable multiple applications to properly integrate through the database (as traditionally envisioned)<p>* HTAP seems inevitable but needs built-in support for row-level history/versioning to get maximum benefit<p>* databases should abstract over raw compute infrastructure efficiently enough that you don't need k8s to run your application logic and APIs elsewhere. The database should be a decent all-in-one place to build & ship stuff
I'm surprised there isn't a "serverless" PostgreSQL. That seems like it would get more bang for the buck than writing a cloud-native DB from scratch.<p>(Or maybe there is one but I don't know about it.)<p>AWS made a serverless MySQL.