I'm hoping to build a simple relational database as a side project, focussing mainly on learning how the internals and the algorithms work; I don't plan to turn this into a published product of any sort.<p>So far I have the CMU Advanced DB course (https://15721.courses.cs.cmu.edu/spring2024/) and the Database Internals book.<p>While I'm learning a lot about how databases work, I have no clue how to start writing my own. Are there any resources for building a relational database? I've only found some for KV stores. Hopefully something less intimidating to get started with than having to read the SQLite code.
Comes up a fair bit in Ask HN - <a href="https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=ask%20database%20implemenation&sort=byPopularity&type=story" rel="nofollow">https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...</a><p>You can widen the search a bit by taking out 'implementation' and trying some other terms like 'book', 'internals', etc.
You could check out <a href="https://cstack.github.io/db_tutorial" rel="nofollow">https://cstack.github.io/db_tutorial</a>, which walks through writing a SQLite clone from scratch in C.<p>I know you asked about an RDBMS, but may I suggest a structured path for building a KV store, which can serve as a foundation for an RDBMS? My project is organized TDD-style: it ships with tests, you start with simple functions and make their tests pass, and the difficulty goes up from there. When all the tests pass, you will have written a persistent key-value store.<p><a href="https://github.com/avinassh/py-caskdb">https://github.com/avinassh/py-caskdb</a>
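For a feel of what "a persistent key-value store" can mean at its smallest, here is a rough sketch of one common approach: append every write to a log file and keep an in-memory index of where each value lives. The class name `LogKV`, the method names, and the record layout are all made up for illustration; this is not py-caskdb's actual API or file format, and it skips deletes, checksums, compaction, and crash recovery.

```python
import os
import struct
import tempfile

# Record layout (an assumption for this sketch):
# 4-byte key length, 4-byte value length, key bytes, value bytes.
HEADER = struct.Struct("<II")


class LogKV:
    """Hypothetical append-only key-value store with an in-memory index."""

    def __init__(self, path):
        self.index = {}  # key -> (offset of value bytes, value length)
        # "a+b" creates the file if needed; every write goes to the end.
        self.f = open(path, "a+b")
        self._rebuild_index()

    def _rebuild_index(self):
        # Replay the whole log; later records for a key win.
        self.f.seek(0)
        offset = 0
        while True:
            header = self.f.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            klen, vlen = HEADER.unpack(header)
            key = self.f.read(klen).decode("utf-8")
            value_offset = offset + HEADER.size + klen
            self.f.seek(vlen, os.SEEK_CUR)  # skip over the value bytes
            self.index[key] = (value_offset, vlen)
            offset = value_offset + vlen

    def set(self, key, value):
        k = key.encode("utf-8")
        v = value.encode("utf-8")
        self.f.seek(0, os.SEEK_END)
        offset = self.f.tell()
        self.f.write(HEADER.pack(len(k), len(v)) + k + v)
        self.f.flush()
        self.index[key] = (offset + HEADER.size + len(k), len(v))

    def get(self, key, default=None):
        entry = self.index.get(key)
        if entry is None:
            return default
        value_offset, vlen = entry
        self.f.seek(value_offset)
        return self.f.read(vlen).decode("utf-8")

    def close(self):
        self.f.close()


# Usage: write, overwrite, close, reopen, and read back.
path = os.path.join(tempfile.mkdtemp(), "data.log")
db = LogKV(path)
db.set("a", "1")
db.set("b", "2")
db.set("a", "3")  # appended as a new record; the index now points at it
db.close()

db = LogKV(path)  # the index is rebuilt by replaying the log
latest_a = db.get("a")
db.close()
```

A relational layer could then sit on top by encoding rows as values under keys such as a table name plus primary key, though that mapping is just one possible design.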
<a href="https://www.youtube.com/@CMUDatabaseGroup" rel="nofollow">https://www.youtube.com/@CMUDatabaseGroup</a><p>They publish their latest course videos every year, during the year. Andy Pavlo is highly knowledgeable about the field.
"This course is a comprehensive study of the internals of modern database management systems. It will cover the core concepts and fundamentals of the components that are used in large-scale analytical systems (OLAP). The class will stress both efficiency and correctness of the implementation of these ideas. The course is appropriate for graduate students in software systems and for advanced undergraduates with dirty systems programming skills. "<p>That class drops a few buzz words in its advert: OLAP and dirty!<p>What sort of "simple" RDBMS are you envisioning that is different from the current lot?
I'd have a look at DuckDB as well. They seem to be doing a great job, with useful, practical innovation and a lot of interesting, differentiating design decisions. I hear it's built on top of SQLite, is that right? They must have a fair amount of code of their own regardless.<p>There are also some projects that have tried to port or re-create SQLite in Rust.
You want our Intro DB Systems course, not the Advanced one:<p><a href="https://15445.courses.cs.cmu.edu" rel="nofollow">https://15445.courses.cs.cmu.edu</a><p>Lectures start next month. Or you can watch previous years. Learn to walk before you run.
I was going to suggest the SQLite source code.<p>One could probably go quite a long way in plain Python with lists of dataclasses and pickles, never mind the performance.<p>That's your backend.<p>Then you might find some prior art in the way of an SQL parser for a front end.
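To make the "lists of dataclasses and pickles" idea concrete, here is a minimal sketch of such a backend. The `User` row type, the `Table` class, and its method names are all hypothetical, and one liberty is taken: rows are pickled as plain dicts rather than as dataclass instances, so the file does not depend on where the dataclass is defined.

```python
import os
import pickle
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class User:
    # Hypothetical row type for this sketch: one dataclass per table.
    id: int
    name: str
    age: int


class Table:
    """A table is a list of dataclass rows, persisted as one pickle file."""

    def __init__(self, path, row_type):
        self.path = path
        self.row_type = row_type
        self.rows = []
        if os.path.exists(path):
            with open(path, "rb") as f:
                # Stored as a list of dicts; rebuild the dataclass rows.
                self.rows = [row_type(**d) for d in pickle.load(f)]

    def insert(self, row):
        self.rows.append(row)

    def select(self, where=lambda row: True):
        # Full table scan: no indexes, performance is ignored on purpose.
        return [row for row in self.rows if where(row)]

    def delete(self, where):
        kept = [row for row in self.rows if not where(row)]
        removed = len(self.rows) - len(kept)
        self.rows = kept
        return removed

    def commit(self):
        # Write to a temporary file first, then swap it into place.
        tmp = self.path + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump([asdict(row) for row in self.rows], f)
        os.replace(tmp, self.path)


# Usage: roughly INSERT, COMMIT, reopen, then SELECT ... WHERE age >= 18.
path = os.path.join(tempfile.mkdtemp(), "users.pickle")
users = Table(path, User)
users.insert(User(1, "ada", 36))
users.insert(User(2, "bob", 17))
users.commit()

reopened = Table(path, User)
adults = reopened.select(lambda row: row.age >= 18)
print(adults)
```

The `where` callables stand in for what a parsed SQL `WHERE` clause would eventually produce, which is where a parser front end would plug in.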