DuckDB is really having a moment<p>The ecosystem is very active, and they have recently opened up "community extensions" so you can bring your own functions, data types and connections. A barrier at the moment is that extensions have to be written in C++, though this limitation should be removed soon.<p>I've been building a lot on top of DuckDB; two of the projects I'm working on are linked in the article:<p>- Evidence (<a href="https://evidence.dev">https://evidence.dev</a>): Build data apps with SQL + Markdown<p>- DuckDB GSheets (<a href="https://duckdb-gsheets.com" rel="nofollow">https://duckdb-gsheets.com</a>): Read/Write Google Sheets via DuckDB
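For anyone curious what the community-extension workflow looks like in practice, here is a minimal sketch of installing the gsheets extension from the community repository and reading a sheet. The `read_gsheet` call and the URL are assumptions, not something from this post, so check duckdb-gsheets.com for the exact interface.

```sql
-- Community extensions install straight from DuckDB's community repository.
INSTALL gsheets FROM community;
LOAD gsheets;

-- Assumed reader function and placeholder sheet URL; verify against the extension docs.
SELECT *
FROM read_gsheet('https://docs.google.com/spreadsheets/d/<sheet-id>');
```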
<a href="https://pragprog.com/titles/pwrdata/seven-databases-in-seven-weeks-second-edition/" rel="nofollow">https://pragprog.com/titles/pwrdata/seven-databases-in-seven...</a> - A book by the same name. Instead of giving you a brief blurb on each database, the authors attempt to give you more context and exercises with them. Last updated in 2018 it covers PostgreSQL, HBase, MongoDB, CouchDB, Neo4J, DynamoDB, and Redis. The first edition covered Riak instead of DynamDB.
ClickHouse is awesome, but there's a newer OLAP database in town: Apache Pinot, and it is significantly better: <a href="https://pinot.apache.org/" rel="nofollow">https://pinot.apache.org/</a><p>Here's why it is better:<p>1. User-facing analytics vs. business analytics. Pinot was designed for user-facing analytics, meaning the result is consumed by the end user (for example, "what is the expected delivery time for this restaurant?"). The demands are much higher: latency, freshness, concurrency and uptime.<p>2. Better architecture. To scale out, ClickHouse uses sharding, which means that if you want to add a node you have to bring down the database, re-partition it, reload the data, then bring it back up. Expect downtime of 1 or 2 days at least. Pinot, on the other hand, uses segments: smaller (but self-contained) pieces of data, with lots of segments on each node. When you add a node, Pinot just moves segments around; no downtime needed. Furthermore, for high availability ClickHouse uses replicas, with each shard needing 1 or 2 replicas for HA. Pinot does not distinguish between shard nodes and replica nodes; instead, each segment is replicated to 2 or 3 nodes, which is better for hardware utilization.<p>3. Pre-aggregation. OLAP cubes became popular in the 1990s. They pre-aggregate data to make queries significantly faster, but the downside is high storage cost. ClickHouse doesn't have an equivalent of OLAP cubes at all. Pinot has something better than OLAP cubes: star-tree indexes. Like cubes, star trees pre-aggregate data along multiple dimensions, but they don't need as much storage (see the sketch below).
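To make point 3 concrete, here is a sketch of the kind of aggregation query a star-tree index can answer from pre-aggregated nodes instead of raw rows. The table, columns and dimensions are made up for illustration, and assume a star tree was built over (city, cuisine) with SUM and COUNT function pairs.

```sql
-- Hypothetical Pinot table; a star tree on (city, cuisine) with SUM(delivery_minutes)
-- and COUNT(*) lets this group-by be served from pre-aggregated tree nodes.
SELECT city,
       cuisine,
       SUM(delivery_minutes) AS total_minutes,
       COUNT(*)              AS orders
FROM restaurant_orders
WHERE city = 'Amsterdam'
GROUP BY city, cuisine;
```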
Author here.<p>Thanks for sharing! My choices are pretty coloured by personal experience, and I didn't want to re-tread anything from the book (Redis/Valkey, Neo4j, etc.) other than Postgres - mostly due to Postgres changing _a lot_ over the years.<p>I had considered an OSS Dynamo-like (Cassandra, ScyllaDB, kinda) or a Calvin-like (FaunaDB), but went with FoundationDB instead because, to me, it was much more interesting.<p>After a decade of running DBaaS at massive scale, I'm also pretty biased towards easy-to-run.
> If I had to only pick two databases to deal with, I’d be quite happy with just Postgres and ClickHouse - the former for OLTP, the latter for OLAP.<p>I completely agree with the author here. In fact, many companies, Cloudflare among them, are built on exactly this approach, and it has scaled well for them without the need for any third database.<p>> Another reason I suggest checking out ClickHouse is that it is a joy to operate - deployment, scaling, backups and so on are well documented - even down to setting the right CPU governor is covered.<p>Another point from the author worth highlighting is the ease of deployment. Most distributed databases aren't easy to run at scale; ClickHouse is much, much easier, and it has become easier still with efficient storage-compute separation.
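As a rough sketch of what the ClickHouse half of that split looks like (table and column names are illustrative, not from the article): OLTP rows stay in Postgres, while an append-only MergeTree table in ClickHouse absorbs the aggregation queries.

```sql
-- Illustrative ClickHouse events table on the MergeTree engine.
CREATE TABLE events
(
    event_time DateTime,
    user_id    UInt64,
    event_type LowCardinality(String),
    payload    String
)
ENGINE = MergeTree
ORDER BY (event_type, event_time);

-- The kind of OLAP query you would rather not run against your OLTP Postgres.
SELECT event_type, count() AS events
FROM events
WHERE event_time >= now() - INTERVAL 7 DAY
GROUP BY event_type
ORDER BY events DESC;
```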
The article mentions the TigerBeetle Style Guide: <a href="https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TIGER_STYLE.md">https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TI...</a><p>I agree so much with the paragraphs about "Dependencies" and "Tooling".
I didn't realize this [1] was a thing. I've been informally referring to our Postgres/Elixir stack as "boring, but in the best way possible, it just works with no drama whatsoever" for years.<p>1: <a href="https://boringtechnology.club" rel="nofollow">https://boringtechnology.club</a>
DuckDB really does seem to be having its moment; projects like Evidence and DuckDB GSheets are super cool examples of its potential. And Postgres’s longevity is insane: it just keeps adapting.<p>On the AI front, vector search options like Pinecone and the pgvector Postgres extension are exciting, but I’d love to see something even more integrated with AI workflows. The possibilities are huge. Curious to hear what others think!
Ever since CockroachDB changed their license, I've been searching for alternatives. PostgreSQL is an obvious choice, but is there a good HA solution? What do people usually do for HA with PostgreSQL, or do they just not care about it? I tested Patroni, which is the most popular option as far as I know, but found some HA issues that make me hesitant to use it: <a href="https://www.binwang.me/2024-12-02-PostgreSQL-High-Availability-Solutions-Part-1.html" rel="nofollow">https://www.binwang.me/2024-12-02-PostgreSQL-High-Availabili...</a>
For those not familiar with DuckDB: it's an amazing database, but it is not a replacement for SQLite if you are looking for a lightweight server-side DB. I'm in love with the DuckDB client and use it to query SQLite databases, but because it only supports one concurrent write connection, it is not suitable as a server-side DB.
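For reference, here is roughly what querying an existing SQLite file from the DuckDB client looks like via the sqlite extension; the file and table names are placeholders, so treat this as a sketch rather than a recipe.

```sql
-- Query an existing SQLite file from DuckDB via the sqlite extension.
INSTALL sqlite;
LOAD sqlite;

ATTACH 'app.db' AS app (TYPE sqlite);  -- placeholder file name
SELECT count(*) FROM app.users;        -- placeholder table
```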
I'm just gonna say it: unless I had a specific reason to use it, I would cross CockroachDB off my list purely based on the name. I don't want to be thinking of cockroaches every time I use my database. Names do have meaning, and I have to wonder why they went with that one.