Our product has reporting features that require aggregation/analytics functionality.
Some of the analytics workloads are time series, others are not, and we generally expect these analytics queries to resolve in ~2.5s.<p>We've recently decided to move these workloads to Snowflake because we want to protect our transactional workloads.<p>The Snowflake devex has been pretty bad because we'd need a Snowflake "instance" for each dev's localhost Postgres, and we like that localhost Postgres to be ephemeral. Additionally, it'd be nice to have this all work locally.<p>One interesting piece of software I came across is DuckDB. It's lightweight, and there's no additional storage needed. It's an interesting direction for me to test, but I don't know if it'll satisfy our latency requirements.<p>How have you separated and scaled out your analytics workloads from Postgres?
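One way to sanity-check the ~2.5s budget before committing to any engine is a tiny timing harness. The sketch below uses stdlib sqlite3 purely as a stand-in connection so it runs anywhere; for a real trial you'd point the same harness at your candidate engine (e.g. a `duckdb.connect()` connection, which exposes a similar `execute(...).fetchall()` API). The table and query here are made up for illustration.

```python
import sqlite3
import time

def time_query(con, sql, runs=5):
    """Run a query several times and return (best, worst) wall-clock latency in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        con.execute(sql).fetchall()
        timings.append(time.perf_counter() - start)
    return min(timings), max(timings)

# Synthetic stand-in dataset; in a real test this would be your analytics tables.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (bucket INTEGER, value REAL)")
con.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i % 10, i * 0.5) for i in range(100_000)],
)

best, worst = time_query(
    con, "SELECT bucket, COUNT(*), AVG(value) FROM events GROUP BY bucket"
)
print(f"best={best:.4f}s worst={worst:.4f}s budget=2.5s")
```

Running several iterations and looking at the worst case (not just the best) matters here, since a one-off cold run can be misleading for an interactive reporting feature.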
It depends on the scale. If you expect to grow to at least a few hundred GBs (up to multiple TBs), ClickHouse would be a safe and proven option. The Postgres + ClickHouse duo is a common pattern adopted by production-grade customers as their default data stack to solve transactional (OLTP) and analytical (OLAP) use cases.<p>ClickHouse also recently released a Postgres CDC connector in ClickPipes for seamless integration with Postgres. You can now stream Postgres data into ClickHouse within a few minutes. <a href="https://clickhouse.com/cloud/clickpipes/postgres-cdc-connector" rel="nofollow">https://clickhouse.com/cloud/clickpipes/postgres-cdc-connect...</a> This was a result of the acquisition of PeerDB, an open-source Postgres CDC product - <a href="https://clickhouse.com/blog/clickhouse-welcomes-peerdb-adding-the-fastest-postgres-cdc-to-the-fastest-olap-database" rel="nofollow">https://clickhouse.com/blog/clickhouse-welcomes-peerdb-addin...</a><p>Disclaimer: This is Sai from ClickHouse/PeerDB here. However, the answer above is based on multiple years of customer experiences. :)
DuckDB should give you everything you seem to need with regards to DevEx.<p>You're not specifying what data volume we're talking about here, but I'd be surprised if you can't go a long way with DuckDB. Ultimately the bottleneck for latency in a non-big-data scenario will likely be memory.<p>You can even run it in a browser: <a href="https://github.com/duckdb/duckdb-wasm">https://github.com/duckdb/duckdb-wasm</a><p>DuckDB's docs and blog are full of great content on optimization:<p>How to tune workloads: <a href="https://duckdb.org/docs/guides/performance/how_to_tune_workloads.html" rel="nofollow">https://duckdb.org/docs/guides/performance/how_to_tune_worklo...</a><p>Performance guide: <a href="https://duckdb.org/docs/guides/performance/overview.html" rel="nofollow">https://duckdb.org/docs/guides/performance/overview.html</a><p>Optimizers in DuckDB: <a href="https://duckdb.org/2024/11/14/optimizers.html" rel="nofollow">https://duckdb.org/2024/11/14/optimizers.html</a><p>I'm sure there is more.<p>If I were in your shoes, even if you don't settle on DuckDB, it would be my go-to benchmark due to its ease of use and out-of-the-box speed.<p>Alternatives like chDB (ClickHouse) or Apache DataFusion should also be worth the effort.<p>For a detailed DuckDB vs. ClickHouse benchmark as an orientation, see: <a href="https://bicortex.com/duckdb-vs-clickhouse-performance-comparison-for-structured-data-serialization-and-in-memory-tpc-ds-queries-execution/" rel="nofollow">https://bicortex.com/duckdb-vs-clickhouse-performance-compar...</a>