We're looking at moving a large (~3TB and growing at an increasing rate) Postgres database over to Citus. I think I'm sold, but for a decision this big it makes sense to get some references.<p>Have you had experience using Citus? What pros and cons have you observed?
I am not a user of Citus Cloud, but I am an Enterprise customer that runs Citus in self-hosted installations and can offer my opinion on the technology and company itself.<p>If you are looking to horizontally expand your Postgres database, I would say you've come to the right place. At my place of work, we ingest TBs of data every day using Citus. Our primary use case for Citus is analytics and MPP-style queries; we do not run multi-tenant setups, but it is also a great solution for such a use case.<p>Pros:
* Support is great.
* Easy-to-understand architecture. The system's metadata is managed in the open in Postgres tables; it's not a black box.
* Performant.
* Forces you to think through your data model and optimize it for a distributed environment.
* Works great with the existing Postgres extension ecosystem, and especially with the Citus supported extensions cstore_fdw, hll, and topn.
* If you need a flexible architecture, the primitives are available to do what you need to do. This may not apply to Citus Cloud, which appears to be your interest and is far more turn-key in nature than self-hosting.<p>Cons:
* (Enterprise only) Built-in shard replication is reliable for data redundancy but not for high-availability scenarios. Citus Cloud has great high availability, though, and does it using Postgres streaming replication. They may be working on making shard replication better, I don't know, but I would just stay away from the built-in shard replication.
* The hash distribution model only supports a single distribution column. If you truly need to distribute based on multiple keys, you have to combine the columns' values into a single distribution column yourself, either in custom functions or at the application layer, e.g. by concatenating two unique values.<p>I could probably think of more pros and cons. Is there anything in particular you're curious about for your use case? What type of model are you pursuing (multi-tenant, MPP, etc.)? Are you heavy read, heavy write, or both?
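<p>To make the last con above a bit more concrete, here is a rough sketch of building a combined distribution key at the application layer. This is only an illustration, not how we do it in production: the function name `composite_key`, the separator choice, and the example column values are all made up. The one thing worth noting is that plain concatenation can collide ("ab" + "c" equals "a" + "bc"), so the sketch joins the parts with a separator that is assumed never to appear in the values.

```python
# Hypothetical helper: combine several column values into one string that can
# be stored in a single distribution column (e.g. a text column named
# dist_key, which you would then pass to Citus' create_distributed_table).

SEP = "\x1f"  # ASCII unit separator; assumed never to occur in real values


def composite_key(*parts: str) -> str:
    """Join the given values into one key, rejecting ambiguous input."""
    if not parts:
        raise ValueError("at least one key part is required")
    for part in parts:
        if SEP in part:
            # Allowing the separator inside a value would reintroduce
            # the collision problem that the separator is there to avoid.
            raise ValueError("key part contains the separator character")
    return SEP.join(parts)


if __name__ == "__main__":
    # Made-up values standing in for two columns you want to distribute on.
    key = composite_key("customer-42", "device-7")
    print(repr(key))
```

Every row with the same pair of values gets the same key, so those rows would hash to the same shard. The tradeoff is that queries have to filter on the combined column, not on the individual ones, to be routed to a single shard.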