Databases in 2021: A Year in Review

323 points by jameslao, over 3 years ago

31 comments

why-el, over 3 years ago
Postgres's dominance is well deserved, of course. My only concerns with it, both of which are actively being worked on, are bloat management (significant for update-heavy workloads and for programmers used to the MySQL model of rollback segments) and the scaling of concurrency (going over 500 connections). The bloat work was taken over by Cybertec [1] after stalling for a bit and is funded (yay), while concurrency was also enhanced out of Microsoft [2]. All in all, an excellent future for our beloved Postgres.

[1] https://github.com/cybertec-postgresql/zheap
[2] https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/improving-postgres-connection-scalability-snapshots/ba-p/1806462#conclusion-one-bottleneck-down-in-pg-14-others-in-sight
zffr, over 3 years ago
The author is a professor at CMU who specializes in databases: https://www.cs.cmu.edu/~pavlo/

Not completely related, but his lectures on databases on YouTube are really good. Much better than the DB class I had at college.
thejosh, over 3 years ago
I'm really excited by all the database love in the last few years. I moved to PG from MySQL in 2014 and haven't regretted it since.

TimescaleDB looks very exciting, as it's "just" a PG extension, but their compression work looks great. [0]

I'm also really loving ClickHouse, but haven't deployed it to production yet (haven't had the need to yet; I almost did for an Apache Arrow reading thing, but didn't end up using Arrow). They do some amazing things there, and the work they do is crazy impressive and fast. Reading their changelog, they power through things.

[0] https://docs.timescale.com/timescaledb/latest/how-to-guides/compression/
threeseed, over 3 years ago
So a company that sells PostgreSQL services thinks PostgreSQL is dominating. Brilliant.

The reality is that nothing is dominating. In 2021 there were more databases than ever, each addressing a different use case. Companies don't have just one EDW; they will have dozens, even hundreds, of siloed data stores. Startups will start with one for everything, then split out auth, user analytics, telemetry, etc.

There is no evidence of any consolidation in the market. And definitely not some mass trend towards PostgreSQL.
dreyfan, over 3 years ago
All you need is Postgres (OLTP) and if you have large datasets where Postgres falls behind for analytical work, then you reach for Clickhouse (OLAP) for those features (while Postgres remains your primary operational database and source of truth).
czhu12, over 3 years ago
It's weird to put Postgres into the same bucket as Elasticsearch, as they are often used for different things.

No matter how much you tune / denormalize Postgres, you'll never get the free-text search performance Elasticsearch offers. Our best efforts on a 5 million row table yielded 600ms query times vs 30-60ms.

Similarly with Snowflake, you'd never expect Postgres to perform analytical queries at that scale.

I know graph databases and time series DBs have similar performance tradeoffs.

I think the most interesting and challenging area is how to architect a system that uses many of these databases and keeps them eventually consistent without some bound.
jorangreef, over 3 years ago
What are the distributed options for Postgres? What mechanisms are available to make it highly available, i.e. with a distributed consensus protocol for strict serializability when failing over the primary? How do people typically deploy Postgres as a cluster?

1. Async replication, tolerating data loss from a slightly stale backup after a failover?

2. Sync replication, tolerating downtime during manual failover?

3. A distributed consensus protocol for automated failover, high availability and no data loss, e.g. Viewstamped Replication, Paxos or Raft?

It seems like most managed-service versions of databases, such as Aurora, Timescale etc., are all doing option 3, but the open-source alternatives are otherwise still options 1 and 2?
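The difference between options 1 and 2 can be illustrated with a toy model. This is only a sketch in Python and not Postgres code: the `Primary` and `Replica` classes and the `lost_on_failover` helper are hypothetical names made up for illustration, and consensus-based failover (option 3) is not modeled at all.

```python
class Replica:
    """A standby that holds whatever records the primary has shipped to it."""

    def __init__(self):
        self.log = []


class Primary:
    """Toy primary that acknowledges writes either before or after replication."""

    def __init__(self, replica, synchronous):
        self.log = []
        self.replica = replica
        self.synchronous = synchronous
        self.pending = []  # acknowledged writes not yet shipped (async mode only)

    def write(self, record):
        """Append a record; when this returns, the client considers it committed."""
        self.log.append(record)
        if self.synchronous:
            # Sync mode: the replica has the record before the client is acknowledged.
            self.replica.log.append(record)
        else:
            # Async mode: acknowledge now, ship to the replica later.
            self.pending.append(record)

    def ship_pending(self):
        """Background replication step for async mode."""
        self.replica.log.extend(self.pending)
        self.pending.clear()


def lost_on_failover(primary):
    """Acknowledged records the replica would be missing if the primary died now."""
    return [r for r in primary.log if r not in primary.replica.log]


# Async: the last write was acknowledged but never reached the replica.
async_primary = Primary(Replica(), synchronous=False)
async_primary.write("a")
async_primary.ship_pending()
async_primary.write("b")

# Sync: every acknowledged write is already on the replica.
sync_primary = Primary(Replica(), synchronous=True)
sync_primary.write("a")
sync_primary.write("b")
```

In this model `lost_on_failover(async_primary)` returns the one unshipped record, while the synchronous primary loses nothing; the cost of sync mode (writes blocking when the replica is unavailable) is deliberately left out.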
eternalban, over 3 years ago
Databases are the best all-around scratch-every-CS-geek-itch domain there is, with the possible exception of operating systems.

The critical importance of *extensibility* as a primary concern of successful DB products needs to be highlighted. Realities of the domain dictate that product X matures a few years after inception, at which point the application patterns may have shifted. (Remember map-reduce?) If you pay attention, for example, you'll note that the du jour darlings are scrambling to claim fitness for ML (a subset of big data), and the newcomers are claiming to be "designed for ML".

Smart VC money should be on extensible players ..
SPBS, over 3 years ago
I genuinely couldn't tell if the author was being sarcastic when he said Larry Ellison was down on his luck because he dropped from 5th richest to 10th richest (and the whole thing about pulling himself out of the gutters by clawing up to 5th richest again).
sriku, over 3 years ago
I've been intrigued by Dgraph (https://dgraph.io) and used it to good effect in a (toy) project where it felt easy to create and evolve its data model given changing requirements.

Dgraph uses GraphQL as its native query language.

Does anyone here have some experience to share on it? ... Since it isn't mentioned in the article.
divan, over 3 years ago
My DB discovery and the game changer of 2021 was EdgeDB.
uvdn7, over 3 years ago
I agree with Andy that it’s just super fun to work on databases. You get to work on consensus, networking, compute, storage, etc. The workloads are always changing, you can try to optimize across the entire stack. Applications and workloads come and go, but databases will always be around.
tayo42, over 3 years ago
Wow, I kind of feel like I'm reading about JavaScript frameworks. I don't recognize any of the DBs or companies/projects. Didn't realize the DB world was so busy.
ttiurani, over 3 years ago
> Databases Are the Most Important Thing in My Life After My Family
> I even broke up with a girlfriend once because of sloppy benchmark results.

I can't say I can relate, but I do appreciate being this passionate about things!
ransom1538, over 3 years ago
I am so confused. https://vitess.io/ I would check this page out and view its "Who uses Vitess" section. Postgres is awesome if you are running a standalone server with 300 users or creating the next "uber for cats". But at scale MySQL has all the solutions. DBs are not JS frameworks.
bsdnoob, over 3 years ago
I think PostgreSQL is an excellent general-purpose solution, especially for OLTP use cases, but where it lags behind is that it's hard to scale horizontally (sharding). There are solutions for this, of course, such as Citus, but I haven't experimented with it. However, I have tried MySQL with Vitess, which almost seems like dark wizardry. I hope one day Vitess works with PostgreSQL.
FridgeSeal, over 3 years ago
From the article:

> Rockset joined in, saying its performance is was better for real-time analytics than the other two.

So I went and read the linked Rockset comparison blog post, and while I get that it’s a marketing piece, it’s also so transparently desperate for *any* advantage over Druid and ClickHouse that their criteria are bizarre at best, and bordering on wildly incorrect at worst.

I’ve been burnt by commercial databases before, and I have a hard time justifying ever using one, especially considering the advent of open source databases that have feature and performance parity (if not outright superiority) and can be self-hosted on K8s, or for which managed hosting can be easily purchased.
hu3, over 3 years ago
I expected more mentions of Vitess, which honestly looks like some kind of alien black magic from what I saw while consulting for a client this year.

But I guess not much else happened to it other than PlanetScale.
kaliszad, over 3 years ago
Does somebody have experience with XTDB (https://xtdb.com/index.html)? We would like to use it in our Clojure application, perhaps with PostgreSQL as the backend (JDBC), to make it easier to implement a history feature.

Looking forward, instead of backward, it would be great for databases to have some kind of live-patch / live-update feature so that one does not need any downtime at all if some rules are obeyed (with an automatic check of whether that is the case). The same goes for operating systems, where we have parts of the technology and even some limited deployment, but none of it is the default as far as I know. This situation makes it quite a bit harder to develop and maintain systems without introducing extreme complexity. It does not look like we will have fewer bugs / fewer patches any time soon, so we should make updating as easy as possible to drastically reduce the need for a maintenance window without resorting to building clusters for everything.
hbarka, over 3 years ago
I’m genuinely happy with Redshift for data warehousing purposes. By this I mean a non-transactional data store. I don’t want to use the terms OLTP or OLAP, as that puts it in a purist’s camp. Sometimes I store 3NF normalized data, many times a flattened, denormalized, very large fact table, and oftentimes a model similar to a star schema. I don’t have to worry about building indexes anymore, which was a real chore with row-store databases like Oracle, MySQL, SQL Server, or PostgreSQL. MPP column-store databases have really been a game-changer for the enterprise. We’re talking billions of rows of data easily handled in the query plan.
leetrout, over 3 years ago
Excited to see Dgraph on the top 10 mentions and climbing above neo4j
criticaltinker, over 3 years ago
Databases in 2030: *SQL* DB finally succumbs to *Graph* DB as #1

Does anyone else feel like a caveman when modeling a many-to-many relationship in a normalized schema, and then querying via SQL?

I’m surprised graph DBs aren’t more popular for this reason alone. Maybe it’s a far-fetched dream, but perhaps a graph frontend can be slapped onto the Postgres backend.
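For readers who have not run into the pattern this comment describes, here is a minimal sketch of a many-to-many relationship in a normalized schema, using Python's built-in `sqlite3` module. The `student`, `course` and `enrollment` tables and the `courses_for` helper are hypothetical names chosen for illustration; nothing here comes from the thread or the article.

```python
import sqlite3


def build_db():
    """Create an in-memory database with two entity tables and a junction table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        -- The junction table is what makes the relationship many-to-many.
        CREATE TABLE enrollment (
            student_id INTEGER NOT NULL REFERENCES student(id),
            course_id  INTEGER NOT NULL REFERENCES course(id),
            PRIMARY KEY (student_id, course_id)
        );
        INSERT INTO student VALUES (1, 'Ada'), (2, 'Linus');
        INSERT INTO course  VALUES (10, 'Databases'), (20, 'Compilers');
        INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
        """
    )
    return conn


def courses_for(conn, student_name):
    """Return the course titles for one student, which takes two joins."""
    rows = conn.execute(
        """
        SELECT c.title
        FROM student s
        JOIN enrollment e ON e.student_id = s.id
        JOIN course c     ON c.id = e.course_id
        WHERE s.name = ?
        ORDER BY c.title
        """,
        (student_name,),
    ).fetchall()
    return [title for (title,) in rows]
```

Every hop across the relationship costs a join through the junction table, which is presumably the bookkeeping the comment has in mind; a graph query language would express the same traversal as a single edge pattern.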
rapnie, over 3 years ago
Nice collection of open source databases: https://codeberg.org/yarmo/delightful-databases
jimmyed, over 3 years ago
Andy forgot about the ugliest spat around benchmarks: Yugabyte v Cockroach.
endisneigh, over 3 years ago
I wish there was some API that abstracted the DB and all technical details and you could connect nodes to it that are specific databases with specific capabilities and it would delegate as necessary.
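One way to picture the idea is a small router that dispatches each request to the first connected node advertising the needed capability. This is purely a sketch under assumptions: `CapabilityRouter`, the node names and the capability strings are all hypothetical, and the handlers are stand-ins for real database clients.

```python
class CapabilityRouter:
    """Toy facade that delegates requests to nodes based on declared capabilities."""

    def __init__(self):
        self._nodes = []  # list of (name, capabilities, handler) in connection order

    def connect(self, name, capabilities, handler):
        """Register a node along with the capabilities it claims to support."""
        self._nodes.append((name, frozenset(capabilities), handler))

    def execute(self, capability, request):
        """Send the request to the first node that supports the capability."""
        for name, capabilities, handler in self._nodes:
            if capability in capabilities:
                return name, handler(request)
        raise LookupError(f"no connected node supports {capability!r}")


router = CapabilityRouter()
# The lambdas below are placeholders for real database clients.
router.connect("oltp-node", {"transactions", "point-lookup"}, lambda req: f"row for {req}")
router.connect("search-node", {"full-text"}, lambda req: f"documents matching {req}")
```

Calling `router.execute("full-text", "postgres")` is routed to `search-node`. The hard parts of such an API, such as keeping data consistent across nodes and planning queries that span several of them, are not addressed by this sketch.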
beamatronic, over 3 years ago
Not a word about Couchbase, which went IPO and is currently worth $1B
peakaboo, over 3 years ago
And no mention of Exasol which is faster than most, if not all, of these databases for analytics.
dblooman, over 3 years ago
ELI5, why do people still choose to use mongo?
cloudengineer94, over 3 years ago
Postgres is amazing; however, I work with SAP HANA every day and I gotta say this thing is completely insane.
closeparen, over 3 years ago
The world moved away from Hadoop and MapReduce… onto what?
throwDec21, over 3 years ago
I'm just surprised that in 2021 BigQuery isn't more popular. I thought it would be top 10 by now. I moved to GCP because of it, but feel like I'm the only one.