Migrating to Postgres

240 points | by shenli3514 | 4 days ago

24 comments

luhn | 4 days ago

> By Jan 2024, our largest table had roughly 100 million rows.

I did a double take at this. At the onset of the article, the fact they're using a distributed database and the mention of a "mid 6 figure" DB bill made me assume they have some obscenely large database that's far beyond what a single node could do. They don't detail the Postgres setup that replaced it, so I assume it's a pretty standard single primary, and a 100-million-row table is well within the abilities of that: I have a 150-million-row table happily plugging along on a 2 vCPU + 16 GB instance. Apples and oranges, perhaps, but people shouldn't underestimate what a single modern server can do.
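luhn's point about single-node headroom is easy to check empirically. A sketch of the kind of query that shows whether a table is anywhere near outgrowing one machine (the table name is illustrative):

```sql
-- Rough size and row-count check for a single table (name is illustrative).
SELECT
    pg_size_pretty(pg_total_relation_size('events')) AS total_size, -- heap + indexes + TOAST
    pg_size_pretty(pg_relation_size('events'))       AS heap_size,
    (SELECT reltuples::bigint
       FROM pg_class
      WHERE relname = 'events')                      AS approx_rows; -- planner's estimate
```

A 100-million-row table will often come back in the tens of gigabytes here, comfortably within reach of a single modest instance.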
esafak | 4 days ago

I read it as: Why You Shouldn't Use Prisma and How Cockroach Hung Us Out To Dry.

I already knew about Prisma from the infamous https://github.com/prisma/prisma/discussions/19748
sreekanth850 | 4 days ago

It's wild, and a little hilarious, how often startups and companies go for distributed databases like CockroachDB/TiDB/Yugabyte before they actually need distribution; this trend sucks. 100 million rows is nothing that a well-tuned Postgres or MySQL instance (or even a read-replicated setup) can't handle comfortably. Scale when you hit the wall.
eftpotrm | 3 days ago

I can't help thinking more startups need greybeards around. (Of which, realistically, I'm now one.)

Largest table 100 million rows, and they were paying six figures annually for database services? I have one now that sits happily enough on an 8-year-old laptop. I've worked on systems with similar-scale tables chugging along on very average MSSQL 2000 boxes twenty years ago. There just isn't a need for cloud-scale systems and cloud-scale bills at that data volume.

The problems they're describing should never have got that far without an experienced hand pointing out they didn't make sense, and if they'd hired that greybeard they'd have spotted it long before.
gilbetron | 3 days ago

The answer to the question "what database should I use?" is "Postgres". If you are in a situation where Postgres actually won't work, then you already know exactly why it won't.

In other words: [Postgres -> exotic solution] is the path everyone should take (and 99% will just stay on Postgres), not [exotic solution -> Postgres].
etler | 4 days ago

I've lost count of how many "Migrating from X to Postgres" articles I've seen.

I don't think I've once seen a "migrating away from Postgres" article.
robertlagrant | 3 days ago

For all the Prisma-haters: I salute you. But I want to reply to numerous comments with the following.

ORMs come in two main types that I'm aware of: Active Record (named after the original Ruby one, I think) and Data Mapper (think Hibernate or SQLAlchemy).

Active Record ORMs are slightly more ergonomic, at the cost of doing loads of work in application memory. Data Mapper looks slightly more like SQL in your code, but is a much more direct wrapper over things you can do in SQL.

Data Mapper also lets you keep various niceties, such as generating migration code, that stem from having your table definitions as objects.

Use a Data Mapper ORM if you want to use an ORM.
optician_owl | 3 days ago

It's an awkward article. To answer why a query is slow, you need a bit more detail than just the query. I also reread the part about timeouts and didn't get it: what was the database, was it a database issue, and how was it related to the migration?

The only information I could extract was that the company made bad architectural decisions, believes in ORMs (looking at the queries, there are real doubts that the data layouts in the DB are adequate), and cannot clearly explain situations. But this is only interesting to their candidates or investors.

It may sound rude, so I apologise.
hobs | 4 days ago

It still makes me sad when half the queries I see are json_*. I know it's far too late, but a big sad trombone in query performance is constantly LEFT JOINing to json_* functions that the planner is going to give a 100-row estimate forever.
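For context, a minimal sketch of the estimate problem being described, with an invented schema: several of Postgres's json_* set-returning functions carry a fixed declared row estimate of 100, so joins against them can be mis-planned regardless of the real array sizes.

```sql
-- Invented schema: orders with line items stored in a JSONB array.
EXPLAIN
SELECT o.id, item->>'sku' AS sku
FROM orders o
LEFT JOIN LATERAL jsonb_array_elements(o.items) AS item ON true;
-- The plan will typically show "rows=100" for the function scan, the
-- declared default for the SRF, no matter how many elements rows hold.
```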
compton93 | 4 days ago

I'm curious about Motion's experience with "Unused Indices". They suggest Cockroach's dashboard listed used indexes in the "Unused Indices" list.

I think the indexes they suspect were used really were unused, but Motion didn't realize CockroachDB was doing zigzag joins on other indexes to accomplish the same thing, leaving the indexes that would obviously be used as genuinely not used.

It's a great feature, but CRDB's optimizer would prefer a zigzag join over a covering index; getting around this required writing indexes in a way that persuades the optimizer not to plan a zigzag join.
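For readers unfamiliar with the workaround compton93 alludes to, CockroachDB lets you force a specific index with its table@index hint syntax. A sketch with invented names:

```sql
-- Persuade the optimizer to use a specific covering index instead of
-- planning a zigzag join (table and index names are invented):
SELECT user_id, status
FROM events@events_user_status_idx
WHERE user_id = 42 AND status = 'active';
```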
from-nibly | 4 days ago

Feels like Postgres is always the answer. I mean, there's gotta be some edge case somewhere where Postgres just can't begin to compete with a more specialized database, but I'd think that going from Postgres to something else is much easier than the other way around.
moonikakiss | 4 days ago

Great blog. It seems like you might benefit from columnar storage in Postgres for that slow query that took ~20 seconds.

It's interesting that people typically think of columnstores for strict BI / analytics, but there are so many app / user-facing workloads that actually need them.

ps: we're working on pg_mooncake v0.2: create a columnstore in Postgres that's always consistent with your OLTP tables. It might help for this workload.
sgarland | 4 days ago

It is forever enraging to me that ORMs turn SELECT * into each individual column, mostly because people then post the whole thing and it's obnoxiously large.

Similarly maddening is the appalling lack of normalization that is simply taken for granted. "It's faster, bro." No, it is not. Especially not at the hundreds-of-millions or billions of rows scale. If you store something low-cardinality like a status column, with an average length of perhaps 7 characters, that's 8 bytes (1 byte of overhead assumed, but it could be 2). Multiply that by 2 billion rows and you're wasting 16 GB. Disk is cheap, but (a) memory isn't, and (b) don't be lazy. There's a right way to use an RDBMS and a wrong way. If you want a KV store, use a damn KV store.

Finally, I'd be remiss if I failed to point out that Prisma is an unbelievably immature organization that launched without the ability to do JOINs [0]. They are forever dead to me for that. This isn't "move fast and break things", it's "move fast despite having zero clue what we're doing, but convince JS devs that we do."

[0]: https://github.com/prisma/prisma/discussions/19748
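sgarland's arithmetic can be sketched as DDL. Assuming the numbers from the comment, the normalized layout below swaps an ~8-byte text status for a 2-byte smallint key (all names are invented):

```sql
-- Denormalized: a 7-character text status costs ~8 bytes per row
-- (1-byte varlena header + 7 bytes), i.e. ~16 GB across 2 billion rows.
-- Normalized: a 2-byte smallint referencing a tiny lookup table.
CREATE TABLE statuses (
    id   smallint PRIMARY KEY,
    name text NOT NULL UNIQUE
);

CREATE TABLE events (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status_id smallint NOT NULL REFERENCES statuses (id)
);
```

Column alignment and padding can eat into the savings in practice, but the lookup table is still the right shape for low-cardinality values.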
frollogaston | 4 days ago

Did I miss something, or does the article not mention anything about sharding in Postgres? Was that just not needed?

Also, query-planner maturity is a big deal. It's hard to get Spanner to use the indexes you want.
crowcroft | 3 days ago

Why does Postgres get so much love while MySQL/MariaDB get nothing?

I'm assuming it's largely because Postgres has more momentum and is much more extensible, but if you're just trying to do 'boring DB stuff' I find MySQL is faster for most use cases. Development has slowed, but it would be hard to argue that it's not battle-tested and robust.
Inviz | 4 days ago

WHERE CONDITION AND 1=1 results in scanning the whole table? I don't think so...
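Inviz's skepticism is warranted: Postgres constant-folds trivially true predicates during planning, so a tacked-on AND 1=1 should leave the plan unchanged. This is easy to verify with EXPLAIN (the table is illustrative):

```sql
-- Both statements should produce identical plans; the 1=1 is folded
-- away before the plan is costed.
EXPLAIN SELECT * FROM users WHERE id = 42;
EXPLAIN SELECT * FROM users WHERE id = 42 AND 1 = 1;
```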
CoolCold | 3 days ago

> First, the support portal is a totally different website that doesn't share auth with the main portal. Second, you have to re-input a lot of data they already know about you (cluster ID, etc). And by the time they respond it's typically been a week.

I was about to ask what the main constraint for CockroachDB was, say iostat and atop info for CPU and disk drives, but realized that is probably something offloaded to some SaaS. So, still curious.
mmiao | 4 days ago

A 100-million-row table is fairly small, and you just don't need a distributed database. But you will need one if you hit 10 billion rows.
coolcase | 4 days ago

Why not optimise the bad queries first?

Aside: the job section says "not 9-5". What does that mean? Long hours? Or not a 9-to-5 attitude?
vivzkestrel | 4 days ago

Did you try using the native pg library, or the postgres or pg-promise libraries, and scrapping the ORM completely to see what effect that has? If you're looking explicitly for migrations, you can simply use node-pg-migrate (https://www.npmjs.com/package/node-pg-migrate) and scrap the rest of all the FLUFF that ORMs come with. ORMs in general are horribly bloated, and their performance for anything more than SELECT FROM table WHERE name = $1 is very questionable.
niwtsol | 4 days ago

That was an interesting read; it seemed like an overwhelming amount of data for why they should move off Cockroach. All of my DB work has been read-heavy, and I've never had a need for super-fast multi-region writes. Is a multi-region write architecture possible in Postgres? I'm trying to understand whether GDPR was the requirement that led to Cockroach, or whether lackluster multi-region writes were the bigger driver.
JackSlateur | 3 days ago

> there were concerns about the eventuality of a multi-region setup (mandated by GDPR)

Can anyone share more details about this? The GDPR is mandating a multi-region setup? This sounds very wild.
badmonster | 3 days ago

Indeed interesting.
liampulles | 3 days ago

I wonder increasingly, with tools like ChatGPT, whether ORMs make sense anymore. The argument I've always heard for ORMs is that they make it quick and easy to write the initial repository method, and that they make migrating to a new DB easier (hypothetically).

It seems to me LLMs now help fill both those roles, but with the benefit that you can tune the SQL as needed. I also always prefer actually knowing exactly what queries are going to my DB, but YMMV.