I edit a database newsletter – <a href="https://dbweekly.com/" rel="nofollow">https://dbweekly.com/</a> – so tend to always have my eyes out for new releases, what's coming along, and what not. And I thought I'd share a few more things that have jumped out at me recently in case anyone's in the mood for spelunking.<p>1. QuestDB – <a href="https://questdb.io/" rel="nofollow">https://questdb.io/</a> – is a performance-focused, open-source time-series database that uses SQL. It makes heavy use of SIMD and vectorization for the performance end of things.<p>2. GridDB - <a href="https://griddb.net/en/" rel="nofollow">https://griddb.net/en/</a> - is an in-memory NoSQL time-series database (there's a theme lately with these!) out of Toshiba that was boasting doing 5m writes per second and 60m reads per second on a 20 node cluster recently.<p>3. MeiliSearch - <a href="https://github.com/meilisearch/MeiliSearch" rel="nofollow">https://github.com/meilisearch/MeiliSearch</a> – not exactly a database but basically an Elastic-esque search server written in Rust. Seems to have really taken off.<p>4. Dolt – <a href="https://github.com/liquidata-inc/dolt" rel="nofollow">https://github.com/liquidata-inc/dolt</a> – bills itself as a 'Git for data'. It's relational, speaks SQL, but has version control on everything.<p>TerminusDB, KVRocks, and ImmuDB also get honorable mentions.<p>InfoWorld also had an article recently about 9 'offbeat' databases to check out if you want to go even further: <a href="https://www.infoworld.com/article/3533410/9-offbeat-databases-worth-a-look.html" rel="nofollow">https://www.infoworld.com/article/3533410/9-offbeat-database...</a><p>Exciting times in the database space!
I’ve always felt it strange that in almost every job I’ve had databases have been one of the most important pieces of the architecture but the least debated. I’ve spent hours debating languages and frameworks, but databases always come down to whatever we have a license for/what others at the company are using. Engineering teams will always say they make sure to use the right tool for the job, but no one ever talks about if it’s right to keep using the same database for a new product.
Materialize is neat, but there are other database systems that refresh at least some materialized views on the fly, while being smart about not rebuilding the entire view every time. See for example, Oracle, where FAST REFRESH ON COMMIT does most of what Materialize is advertised as doing, at least for views which that feature can support (restriction list here: <a href="https://stackoverflow.com/questions/49578932/materialized-view-in-oracle-with-fast-refresh-instead-of-complete-dosnt-work" rel="nofollow">https://stackoverflow.com/questions/49578932/materialized-vi...</a> ). Mind you, this comes with Oracle's extremely hefty price tag, so I'm not sure I'd recommend it to anyone who isn't already stuck with Oracle, but it is technical precedent.<p>It would be interesting to compare notes, and see what Materialize does better.
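To make "not rebuilding the entire view every time" concrete, here is a minimal Python sketch of incremental view maintenance for a grouped SUM/COUNT. It is a toy under stated assumptions, not how Oracle or Materialize implement it; the class name `IncrementalSumView` and its methods are hypothetical and exist only for illustration.

```python
from collections import defaultdict


class IncrementalSumView:
    """Toy stand-in for: SELECT key, SUM(amount) FROM t GROUP BY key.

    Hypothetical illustration only; real engines handle joins, updates,
    and transactions, which this sketch ignores.
    """

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)

    def apply(self, key, amount, sign=1):
        # sign=+1 for an inserted base row, sign=-1 for a deleted base row.
        # Only the affected group is touched; nothing is recomputed from scratch.
        self.totals[key] += sign * amount
        self.counts[key] += sign
        if self.counts[key] == 0:
            # The last row of the group is gone, so the group leaves the view.
            del self.totals[key]
            del self.counts[key]

    def rows(self):
        return dict(self.totals)


view = IncrementalSumView()
view.apply("eu", 10)
view.apply("eu", 5)
view.apply("us", 7)
view.apply("us", 7, sign=-1)  # delete the only "us" row
print(view.rows())
```

Each change costs work proportional to the change, not to the size of the base table, which is the property a fast-refresh view is after.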
I'm working on a Redis adapter for DynamoDB. Dynamo is really a distributed superset of Redis, and most of Redis's data structures scale effectively to the distributed hash table + B-Tree-like system that Dynamo offers. Having a well-known and well-understood API like Redis's is a boon for Dynamo, whose own API is much more low-level and esoteric.<p>The Go library is in beta, and I'm working on a server that's wire-compatible with Redis.<p><a href="https://dbproject.red" rel="nofollow">https://dbproject.red</a><p><a href="https://github.com/dbProjectRED/redimo.go" rel="nofollow">https://github.com/dbProjectRED/redimo.go</a>
I really wish one of the existing DB technologies, Firebird, got a shot in the arm. It has both embedded and server modes, which makes it unique as far as I know. The database is also a single file, which Firebird's "careful write" methodology keeps consistent at all times. You can make a backup at any time because it has MVCC, and even a plain file copy of the database taken with open transactions should not be corrupted. The installer comes in under 10 MB. It's being actively improved and is open source with a very liberal licence, but sadly it gets only a tiny fraction of the attention that SQLite, Postgres, etc. receive.
My understanding of TileDB is that it is 100% client-side. There is no server. In a sense it’s like handling ORC or Parquet or even SQLite files on S3 (except that TileDB's are fancy R-trees), with a delta-lake-like manifest file for transactions too.<p>I think in the future there’s going to be a sine wave: smart clients consuming S3 cleverly, then smartness growing in the S3 interface so that constraints and indices and things happen in the storage again, and back and forth...
I support FoundationDB's approach to databases, which is basically to provide a consistent, distributed, and ordered key-value store on which you can then build whatever type of database you need, whether that's RDBMS, document, graph, etc.<p>With that said, CouchDB 4.0 (on FDB) is going to be killer. Master-master replication between clients and server with PouchDB is phenomenal when you remove the complicated eventual consistency on the server side.<p>And as a plug, I'm building a multi-tenant/multi-application database on top of it.
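For anyone wondering what "build a database on top of an ordered key-value store" looks like in practice, here is a rough Python sketch. `OrderedKV` is an in-memory stand-in and not FoundationDB's API; `TableLayer` and its key layout `(table, pk, column)` are assumptions made up for this example.

```python
import bisect


class OrderedKV:
    """In-memory stand-in for an ordered key-value store (hypothetical, not FDB)."""

    def __init__(self):
        self._keys = []    # kept sorted at all times
        self._values = {}

    def set(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix):
        # Tuples sort lexicographically, so keys sharing a prefix are contiguous.
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i][:len(prefix)] == prefix:
            yield self._keys[i], self._values[self._keys[i]]
            i += 1


class TableLayer:
    """Hypothetical relational-style layer: one KV pair per (table, pk, column)."""

    def __init__(self, kv):
        self.kv = kv

    def insert(self, table, pk, row):
        for column, value in row.items():
            self.kv.set((table, pk, column), value)

    def get(self, table, pk):
        # All columns of one row are adjacent in key order: a single range scan.
        return {key[2]: value for key, value in self.kv.scan_prefix((table, pk))}


kv = OrderedKV()
layer = TableLayer(kv)
layer.insert("users", 1, {"name": "Ada", "lang": "en"})
layer.insert("users", 2, {"name": "Luc"})
print(layer.get("users", 1))
```

The point of the design is that ordering plus prefix scans is enough to recover rows, and a document or graph layer would just pick a different key encoding over the same store.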
I've always found databases fascinating and have tried various DBs as they come out.<p>I always find some issue or caveat or problem, decide in the end that Postgres gets most of the way there anyway, and return to Postgres.<p>Whenever I get tempted by a shiny new database I remind myself "don't bet against Postgres".
Software is advancing so fast. It's interesting to constantly reconsider the things I consider myself ahead of the curve on vs. behind the curve on. Prisma looks great, so I've updated my "I want functional dbs, not ORMs" post: <a href="https://github.com/ericbets/erics-designs/blob/master/funcdb.md" rel="nofollow">https://github.com/ericbets/erics-designs/blob/master/funcdb...</a>
> What I have yet to see but always secretly wanted, however, is a database that natively supports incremental updates to materialized views. Yep, that’s right: Materialize listens for changes in the data sources that you specify and updates your views as those sources change.<p>This is precisely one of the features that make ClickHouse shine
Does anybody know of a good educational resource on software/best practices that is kept up to date? Ideally something that does not include the latest bleeding edge but rather things that are battle-hardened or getting there. Something that covers both open source and commercial software would be ideal.
This is only tangentially related, but I rediscovered an old project of mine from years ago today and am rather excited about it:<p><a href="https://github.com/skorokithakis/goatfish" rel="nofollow">https://github.com/skorokithakis/goatfish</a><p>It's basically a 200 line document database in Python that's backed by SQLite. I need to store a bunch of scraping data from a script but don't want a huge database or the hassle of making SQLite tables.<p>Goatfish is perfect because it stores free-form objects (JSON, basically), only it lets you index by arbitrary keys in them and get fast queries that way.<p>It's pretty simple, but a very nice compromise between the simplicity of in-memory dicts and the reliability of SQLite.
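The general idea of a document store over SQLite with indexable keys can be sketched in a few lines of Python. To be clear, this is not goatfish's code or API; `TinyDocStore`, its table names, and its methods are hypothetical, and it only supports equality lookups on keys declared up front.

```python
import json
import sqlite3
import uuid


class TinyDocStore:
    """Hypothetical sketch: JSON blobs in one table, indexed keys in a side table."""

    def __init__(self, path=":memory:", indexed_keys=()):
        self.db = sqlite3.connect(path)
        self.indexed_keys = tuple(indexed_keys)
        self.db.execute("CREATE TABLE IF NOT EXISTS docs (id TEXT PRIMARY KEY, body TEXT)")
        self.db.execute("CREATE TABLE IF NOT EXISTS idx (key TEXT, value TEXT, doc_id TEXT)")
        self.db.execute("CREATE INDEX IF NOT EXISTS idx_kv ON idx (key, value)")

    def save(self, doc):
        doc_id = uuid.uuid4().hex
        with self.db:  # commit the document and its index rows together
            self.db.execute("INSERT INTO docs VALUES (?, ?)", (doc_id, json.dumps(doc)))
            for key in self.indexed_keys:
                if key in doc:
                    # Index values are stored JSON-encoded so types compare consistently.
                    self.db.execute(
                        "INSERT INTO idx VALUES (?, ?, ?)",
                        (key, json.dumps(doc[key]), doc_id),
                    )
        return doc_id

    def find(self, key, value):
        # Only keys listed in indexed_keys are searchable in this sketch.
        rows = self.db.execute(
            "SELECT d.body FROM idx i JOIN docs d ON d.id = i.doc_id "
            "WHERE i.key = ? AND i.value = ?",
            (key, json.dumps(value)),
        )
        return [json.loads(body) for (body,) in rows]


store = TinyDocStore(indexed_keys=["site"])
store.save({"site": "example.org", "status": 200})
store.save({"site": "example.com", "status": 404})
print(store.find("site", "example.org"))
```

The free-form body stays a JSON blob, while the side table gives SQLite something it can index, which is the compromise between in-memory dicts and hand-made tables described above.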
>What I’m really hoping for is the emergence of extremely “hackable,” resolutely non-monolithic DBs that provide a plugin interface for highly use-case-specific data types,<p>Isn't this basically what FoundationDB is?
Interesting notes but I feel like the db itself has been commoditised and the battle is elsewhere now.
So anyone building a database engine today will find out that to make it sustainable they also need an ecosystem on top of it: tooling, community, paid support, active devs, and consultants (for which they may have no runway).
Finally I find anything that calls itself a database and uses S3 as a backend a bit ridiculous. S3 has eventual consistency so you can’t do the operations that differentiate a database from a file system.
"What I have yet to see but always secretly wanted, however, is a database that natively supports incremental updates to materialized views"<p>SQL Server ala 10+ years ago enters the discussion.
I think that GraphQL adapters like Hasura and goke are also an important innovation. For small MVP projects you can create a GraphQL API to query your database directly from the frontend, which reduces development time by a factor of 2 at least.
I'll just throw a note for a new product, AWS's QLDB. It's an internal managed product that combines a replicated, immutable, versioned document database with ACID transactions and an immutable, provable history of every modification. There's some streaming and subset SQL on the back end.<p>Something this focused should have a few applications where bit level auditability matters, eg financial, chain of events, etc. Of course it comes with some tradeoffs vs a relational or kv db.<p>I wonder if there would be room for a self-hosted clone?
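As a thought experiment on the "provable history" part, a hash-chained append-only log is the simplest structure with that flavor. The Python sketch below is an assumption-laden toy, not QLDB's actual implementation or API; `HashChainedLog` and its fields are hypothetical names.

```python
import hashlib
import json


class HashChainedLog:
    """Toy append-only journal: each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    def append(self, document):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        payload = json.dumps(document, sort_keys=True)
        digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self.entries.append({"prev": prev_hash, "payload": payload, "hash": digest})
        return digest

    def verify(self):
        # Recompute every hash from the start; any edited entry breaks the chain.
        prev_hash = self.GENESIS
        for entry in self.entries:
            expected = hashlib.sha256((prev_hash + entry["payload"]).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


log = HashChainedLog()
log.append({"account": "a1", "balance": 100})
log.append({"account": "a1", "balance": 80})
print(log.verify())
```

A self-hosted clone would need far more than this (replication, querying, efficient proofs), but the tamper-evidence itself comes down to publishing the latest hash somewhere the operator cannot rewrite.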
ClickHouse also supports incremental streaming from Kafka into a materialized view.<p>You can even detach and reattach the view from its backing table.
Hi Luc,<p>What's your perspective on predictive databases like <a href="https://aito.ai" rel="nofollow">https://aito.ai</a>?<p>I'm one of the Aito.ai founders. If you would like to hear more, I'm happy to talk one-to-one.<p>Regards,
Antti
I love these kinds of posts. They're targeted towards what people are finding interesting and they're highly tech related. It's a great way to find new technology.
The website didn't load for me, so here it is: <a href="https://web.archive.org/web/20200615193041/https://lucperkins.dev/blog/new-db-tech-1/" rel="nofollow">https://web.archive.org/web/20200615193041/https://lucperkin...</a><p>Also, I'd like to add one database to the list (I've worked there for 3 weeks now): TriplyDB [0]. It makes linked data easier.<p>Linked data is useful when people from different organizations want a shared schema.<p>In many commercial applications one wouldn't want this, as data is the valuable part of a company. However, scientific communities, certain government agencies and other organizations -- that I don't yet know about -- do want this.<p>I think the coolest application of linked data is how the bio-informatics/biology community utilizes it [1, 2]. The reason I found out at all is that one person at Triply is working to see if a similar thing can be achieved with psychology. It might make conducting meta-studies a bit easier.<p>I read the HN discussions on linked data and agree with both the naysayers (it's awkward and too idealistic [4]) and the yea-sayers (it's awesome). The thing is:<p>1. Linked data is open, as in open source; the URI [3] is baked into its design.<p>2. While the 'API'/triple/RDF format can be awkward, <i>anyone</i> can quite easily understand it. The cool thing is: this includes non-programmers.<p>3. It's geared towards collaboration. In fact, reading between the lines, I'd argue it's really good for collaboration among a big heterogeneous group of people.<p>Disclaimer: this is my own opinion, Triply does not know I'm posting this and I don't care ;-) I simply think it's an interesting way of thinking about data.<p>[0] triply.cc<p>[1] A friend of mine once modeled some biochemistry part of C. elegans from linked data into Petri nets: <a href="https://www.researchgate.net/publication/263520722_Building_Executable_Biological_Pathway_Models_Automatically_from_BioPAX" rel="nofollow">https://www.researchgate.net/publication/263520722_Building_...</a><p>[2] <a href="https://www.google.com/search?client=safari&rls=en&q=linked+data+and+biology&ie=UTF-8&oe=UTF-8" rel="nofollow">https://www.google.com/search?client=safari&rls=en&q=linked+...</a> -- I quickly vetted this search<p>[3] I still don't know the difference between a URI and URL.<p>[4] I think back in the day, linked data idealists would say that all data should be linked to interconnect all the knowledge. I'm more pragmatic and simply wonder: in which socio-technological context is linked data simply more useful than other formats? My current very tentative answer is those 3 points.