So I thought it would be kind of cool to take PgBouncer to the next level. I've been using it with large database clusters for a while and it's been spectacular, but it does have some limitations that always seemed arbitrary.

This is my take on what a modern Postgres pooler can be. Besides supporting the same session and transaction pooling modes as PgBouncer, it also adds load balancing between replicas (round-robin at the moment), failover when a replica fails a health check, and the coolest thing yet I think: sharding at the pooler level.

It's written in Rust, which I think makes it much easier to iterate on and improve.

Note in case it wasn't obvious: this is super experimental, please don't use it in your production environment unless you're adventurous. In which case, please do and let me know how it goes!
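Roughly, replica selection works like this (a simplified sketch to show the idea, not the actual PgCat code; the names are made up):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// One replica; `healthy` would be flipped by a background
/// health-check task in a real pooler.
struct Replica {
    address: String,
    healthy: bool,
}

struct ReplicaSet {
    replicas: Vec<Replica>,
    next: AtomicUsize,
}

impl ReplicaSet {
    /// Round-robin over the replicas, skipping any that failed
    /// their last health check.
    fn next_replica(&self) -> Option<&Replica> {
        let n = self.replicas.len();
        for _ in 0..n {
            let i = self.next.fetch_add(1, Ordering::Relaxed) % n;
            if self.replicas[i].healthy {
                return Some(&self.replicas[i]);
            }
        }
        None // every replica is down: fail the query fast
    }
}

fn main() {
    let set = ReplicaSet {
        replicas: vec![
            Replica { address: "10.0.0.1:5432".into(), healthy: true },
            Replica { address: "10.0.0.2:5432".into(), healthy: false }, // failed its check
            Replica { address: "10.0.0.3:5432".into(), healthy: true },
        ],
        next: AtomicUsize::new(0),
    };
    for _ in 0..4 {
        println!("routing to {}", set.next_replica().unwrap().address);
    }
}
```

- Lev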
As a fan of Rust, this seems like the perfect application domain for "fearless concurrency". Having gone through the bowels of a few different connection pool implementations in various languages (because they had bugs in production), I can tell you that concurrency management becomes extremely difficult to reason about as feature complexity increases. I think you're making a good bet that offloading a good chunk of that to the compiler will improve your ability to iterate in the future.

Go seems like the language du jour, but I think this post [1] really illustrates how powerful it can be when you have a compiler that handles so much more for you. It's not just about performance: data-race prevention and a robust set of compile-time error checks are huge wins over looser languages.

1: https://fasterthanli.me/articles/some-mistakes-rust-doesnt-catch
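To make that concrete: in Rust, shared mutable state has to live behind something like Arc<Mutex<...>> before multiple threads can touch it, or the program simply won't compile. A toy sketch of the pattern (nothing to do with PgCat's actual internals; the PoolState type is made up):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical shared pooler state. Try to mutate this from two
// threads without the Mutex and the compiler rejects the program:
// the data race is a compile-time error, not a production incident.
struct PoolState {
    checked_out: usize,
}

fn main() {
    let state = Arc::new(Mutex::new(PoolState { checked_out: 0 }));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let state = Arc::clone(&state);
            thread::spawn(move || {
                // The lock guard is the only way in; exclusive access
                // is enforced by the type system, not by convention.
                state.lock().unwrap().checked_out += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("checked out: {}", state.lock().unwrap().checked_out);
}
```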
This is really interesting and exciting to see, though it probably has a ways to go before being production ready. To date we've seen a few other efforts at connection poolers and load balancers for Postgres, and every time we've come back to pgbouncer as the one that works best. While it has its issues, pgbouncer does indeed work; others we've explored tend to break in unclear ways.

That said, some of the things this one looks to solve are indeed very interesting.

I've seen issues at times with pgbouncer being single-threaded. Since it only runs on a single core, it's possible to peg that core and hit a hard cap on throughput. While not common, we have seen this happen for customers, both at Crunchy Data[1] and back at Citus.

The other interesting piece is sharding. We considered this approach for years at Citus and never managed to pull it off. If you can specify an identifier and have the pooler route accordingly, it gives you a lightweight option for sharding really quite easily (a rough sketch of the routing idea follows below).

Again, it probably has a ways to go; everything we've tried over the last 10 years has failed to live up to pgbouncer as a replacement. But what pgcat is attempting, I'm fully supportive of, if it can accomplish it.

[1] On Crunchy Bridge (www.crunchybridge.com) we have pgbouncer built in, with support for multiple pgbouncer processes to alleviate the single-threaded issue.
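The core of that key-based routing is conceptually tiny; the hard parts are everything around it (resharding, cross-shard queries, and so on). A hedged sketch in Rust, where DefaultHasher stands in for whatever Postgres-compatible hash a real router would need to match the shards' partitioning:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Map a client-supplied sharding key to a shard number.
/// DefaultHasher is illustrative only; a real router must hash
/// the same way the data was originally split across shards.
fn shard_for_key(key: &str, num_shards: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish() % num_shards
}

fn main() {
    // The pooler would pull the key out of the client's session
    // or query, then proxy the connection to the matching shard.
    for key in ["customer_42", "customer_43", "customer_44"] {
        println!("{} -> shard {}", key, shard_for_key(key, 4));
    }
}
```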
1. Rename the project; there's already a pgcat.
2. Bring it to production level, I would love it :D
3. Are there things like GIS or jsonb that would break with it, or does it just work?
I'm really impressed by how lean this code base is. Is PgBouncer also this lean? Looks like auth is the only thing left here then, aside from maybe some more testing? Very cool to see!
This is so awesome.

For those wondering why a pooler is needed: TL;DR, containers.
These days it's normal to have hundreds of containers running an application, each configured with tens of connections. Now you have thousands of connections.
Each connection costs Postgres something on the order of ~2.5MB of RAM, so once you get into the thousands you're talking real numbers: 1,000 connections is already ~2.5GB.

Problems I've had with pgbouncer in the past:

- Stats. Any modern tool should natively emit its own statistics in something like statsd format (a sketch of what that could look like is at the end of this comment).
- pgbouncer is a single process, so I can't throw more CPU cores at it to make it faster.

Problems that I'm still struggling to solve...

We use RDS/Aurora, which uses DNS to fail over. Yes, applications should handle this. Yes, they should retry. Yes, devs should have tested this. But in any large company this is an uphill battle involving many teams and apps. It's much easier to introduce a proxy layer to handle fast failover. Even better if it can retry failed queries transparently to the application.
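On the stats point above: the statsd line protocol is just plain text over UDP, so a pooler emitting its own metrics natively is cheap. A minimal sketch; the pgcat.* metric names here are invented for illustration:

```rust
use std::net::UdpSocket;

/// statsd line protocol: "<metric>:<value>|<type>", where "g" is a
/// gauge and "c" a counter. One fire-and-forget UDP datagram each.
fn emit_gauge(socket: &UdpSocket, metric: &str, value: u64) -> std::io::Result<()> {
    let line = format!("{metric}:{value}|g");
    socket.send(line.as_bytes())?;
    Ok(())
}

fn main() -> std::io::Result<()> {
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.connect("127.0.0.1:8125")?; // assumes a local statsd agent

    // Hypothetical metric names, just to show the shape.
    emit_gauge(&socket, "pgcat.pool.primary.server_connections", 20)?;
    emit_gauge(&socket, "pgcat.pool.primary.clients_waiting", 3)?;
    Ok(())
}
```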