Let's make PostgreSQL multi-threaded

56 points by Icathian almost 2 years ago

8 comments

erlkonig almost 2 years ago

[I wrote this mostly imagining the idea was about converting the entire PostgreSQL service to a single monolithic process. I'm not a fan so far. If it is actually about coalescing like processes down to a single multithreaded process, that's more reasonable, but it still comes at a future cost, and whether it has a point is still a question.]

Converting code into multithreaded code tends to make it harder to test and debug FOREVER, and by default leaves it more limited in certain system resources than a multi-process solution. Viewing and managing threads from the outside is harder, and killing a rogue thread is much more likely to crash an MT solution than killing a process is in a typical resilient MP solution.

Above all else, I need a database to be utterly reliable (or as close to it as possible), including being able to back off in a mature fashion in cases of memory exhaustion (I have overcommit disabled to restore classical memory handling, i.e. malloc() can fail) and file system exhaustion. MT throws a wrench through most of the workings of a complex program, and unless some specific gain can be identified that compensates for adding complexity and fragility to virtually any change going forward, then... um... why?

I read a bit of "well, the other guys are doing it" handwaving:

"Other large projects have gone through this transition. It's not easy, but it's a lot easier now than it was 10 years ago. The platform and compiler support is there now, all libraries have thread-safe interfaces, etc."

But that isn't a functional gain. And:

"I don't expect you or others to buy into any particular code change at this point, or to contribute time into it. Just to accept that it's a worthwhile goal. If the implementation turns out to be a disaster, then it won't be accepted, of course. But I'm optimistic."

But this is NOT a worthwhile goal. Fun, perhaps. Diverting or challenging, perhaps. A disaster, quite possibly.

But without identifying a goal that can only be achieved by walking into the multithreading pit, the project is a waste of time for end users. Possibly a growth experience for the experimenters, regardless of whether it succeeds.
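
The isolation argument above can be sketched in a few lines. This is only an illustration of why a crashed worker process need not take down its supervisor; it is not how PostgreSQL's postmaster is implemented, and `run_worker` is a hypothetical helper name invented for this sketch.

```python
import subprocess
import sys


def run_worker(code: str) -> int:
    """Run a snippet in a separate interpreter process and return its exit status.

    Hypothetical helper: stands in for a supervisor launching one worker process.
    """
    return subprocess.run([sys.executable, "-c", code]).returncode


# The first worker aborts abruptly; only that process dies.
crashed = run_worker("import os; os.abort()")

# The supervisor is still alive and can launch another worker afterwards.
healthy = run_worker("print('still serving')")
```

A thread that aborted the same way would end the whole process, which is the fragility the comment is pointing at.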

rektide almost 2 years ago

OT, and apologies for being unspecific, but can anyone else jog my memory about the open source fork from 2-3 years ago that was going to pretty heavily edit the internals and try some new architectural designs? Some searches have not led me back to its name...

Edit: OrioleDB! https://github.com/orioledb/orioledb https://news.ycombinator.com/item?id=30462695

The 2021 slide deck on problems and possibilities in Postgres was super fun to read! Solving PostgreSQL's wicked problems. https://www.slideshare.net/AlexanderKorotkov/solving-postgresql-wicked-problems

kapilvt almost 2 years ago

FWIW I agree with Tom Lane: there’s a bunch of extension code out there that would need to be upgraded and verified for multithreaded safety. It’s a Python 2 to 3 transition, except one that assumes correct multithreading in C (the most popular language for extensions at the moment), i.e. potential disasters for end users.

What was missing from the new thread (it’s referenced in the footnotes) is the why. At the moment the hypothesis is the following, quoting from the linked multithreading experiment from a while back.

Direct link, which has more discussion: https://www.postgresql.org/message-id/flat/9defcb14-a918-13fe-4b80-a0b02ff85527@postgrespro.ru

'''
What are the advantages of using threads instead of processes?

1. No need to use shared memory. So there is no static limit for amount of memory which can be used by Postgres. No need in distributed shared memory and other stuff designed to share memory between backends and bgworkers.

2. Threads significantly simplify implementation of parallel algorithms: interaction and transferring data between threads can be done easily and more efficiently.

3. It is possible to use more efficient/lightweight synchronization primitives. Postgres now mostly relies on its own low level sync. primitives which user-level implementation is using spinlocks and atomics and then fallback to OS semaphores/poll. I am not sure how much gain can we get by replacing this primitives with one optimized for threads. My colleague from Firebird community told me that just replacing processes with threads can obtain 20% increase of performance, but it is just first step and replacing sync. primitive can give much greater advantage. But may be for Postgres with its low level primitives it is not true.

4. Threads are more lightweight entities than processes. Context switch between threads takes less time than between process. And them consume less memory. It is usually possible to spawn more threads than processes.

5. More efficient access to virtual memory. As far as all threads are sharing the same memory space, TLB is used much efficiently in this case.

6. Faster backend startup. Certainly starting backend at each user's request is bad thing in any case. Some kind of connection pooling should be used in any case to provide acceptable performance. But in any case, start of new backend process in postgres causes a lot of page faults which have dramatical impact on performance. And there is no such problem with threads.
'''
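
To make points 1 and 2 of the quoted list concrete, here is a minimal sketch in Python rather than C, and not taken from the PostgreSQL source. The function names and `SEGMENT_SIZE` are invented for this illustration. It contrasts threads, which simply see the same heap objects, with an explicitly created shared memory segment whose size is fixed when it is created. The second handle is opened in the same process for simplicity; a separate process would attach by name in the same way.

```python
import threading
from multiprocessing import shared_memory

SEGMENT_SIZE = 4  # assumption for this sketch: the segment size is chosen up front


def thread_style():
    """Threads append to an ordinary list; no sharing setup is needed."""
    results = []

    def worker(n):
        results.append(n * n)  # the list lives in the one shared address space

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


def shared_memory_style():
    """A named, fixed-size segment must be created, attached to, and cleaned up."""
    seg = shared_memory.SharedMemory(create=True, size=SEGMENT_SIZE)
    try:
        # Attach by name, the way a separate process would reach the segment.
        other = shared_memory.SharedMemory(name=seg.name)
        try:
            other.buf[:4] = bytes([0, 1, 4, 9])  # raw bytes only, no Python objects
        finally:
            other.close()
        return list(seg.buf[:4])
    finally:
        seg.close()
        seg.unlink()  # the creator is responsible for removing the segment
```

Both functions return the same four values, but the second needs a size decided in advance, a name to attach by, and explicit cleanup, which is the overhead the quoted list argues threads would avoid.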

alberth almost 2 years ago

> "For the record, I think this will be a disaster. There is far too much code that will get broken, largely silently, and much of it is not under our control.
>
> regards, tom lane"

Given that Tom Lane objects to this, I’ve yet to see a case compelling enough to go against his advice.

https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi

MuffinFlavored almost 2 years ago

How massive of an undertaking is this going to be? There is a mention that they think they could get it done in 1-2 releases. From what I can deduce from https://pgpedia.info/postgresql-versions/index.html they seem to release once a year. 1-2 years, not bad. I wonder how many people would work on this (full time/part time) and how many resource hours would go into it.

I wonder which layer would benefit the most.

You connect as pg_client to pg_server and make a request (an SQL query). It has to get tokenized and query planned, then there is probably some detection layer that already exists that says "do this all in a single thread or spread it across multiple". How does going from a single-threaded process that spawns multiple threads to a multi-threaded process help in these situations, given the overhead of having to sync and pass messages back and forth between threads?

Icathian almost 2 years ago

Pretty neat pitch by a very prolific PostgreSQL hacker about refactoring from the current multiprocessing architecture to a multi-threaded one. I don't know how it'll play out but personally I think it looks promising.

eyelidlessness almost 2 years ago

At a very far distance from any real familiarity with Postgres internals, it’s more than a little bit alarming to me that an RDBMS, which already offers strong concurrency guarantees and which already employs multi-process concurrency to do meaningful work, would even have “disaster” uttered in a discussion of a change to its higher-level concurrency mechanics. Maybe that’s very naive and I’m very far out of my depth? Is Postgres concurrency really so close to the metal that it should even matter whether it’s operating between processes, threads, or some other shared workload abstraction?

tonymillion almost 2 years ago

Summary:

Heikki: let’s make Postgres multithreaded, unless any of you have objections

Most of the rest of the core team: it would be a disaster, or at the very least incredibly difficult

Heikki: okay, so I’m not hearing any objections, so let’s go

Most of the rest of the core team: …

Heikki (covering ears): lah lah lah