Another similar option is cstore_fdw [0] -- it's now part of Citus but can still be used standalone as a foreign data wrapper. We use it at my startup to do OLAP on Postgres. It has some advantages over parquet_fdw:

* Supports writes (actually generating Parquet files was also difficult in my testing: I used odbc2parquet [1], but thanks for the ogr2ogr tip!), so you can write directly to the foreign table by running INSERT INTO ... SELECT FROM ... -- rough sketch below.

* Supports all PG datatypes (including types from extensions, like PostGIS).

* Performance was basically comparable in my limited testing (faster for a single-row SELECT with cstore_fdw in our case since we do partition pruning, the same for a full-table scan-and-aggregation).

Re: performance overhead, with FDWs we have to re-munge the data into PostgreSQL's internal row-oriented TupleTableSlot format again. Postgres also doesn't run aggregations in a way that takes advantage of the columnar format (e.g. CPU vectorization). Citus had some experimental code to get that working [2], but that was before FDWs supported aggregation pushdown. Nowadays it might be possible to have an FDW that hooks into the GROUP BY execution and runs a faster version of the aggregation that's optimized for columnar storage. We have a blog post series [3] about how we added aggregation pushdown support to Multicorn -- similar idea.

There's also DuckDB, which obliterates both of these options when it comes to performance. In my (again limited, not very scientific) benchmarking on a customer's 3M-row table [4] (278MB in cstore_fdw, 140MB in Parquet), I see a 10-20x speedup (1-2s -> 0.1-0.2s) on some basic aggregation queries (example below) when querying a Parquet file with DuckDB as opposed to using cstore_fdw/parquet_fdw.

I think the dream is being able to use DuckDB from within an FDW as an OLAP query engine for PostgreSQL. duckdb_fdw [5] exists, but it basically took sqlite_fdw and connected it to DuckDB's SQLite interface, which means that a lot of operations get lost in translation and aren't pushed down to DuckDB, so it's not much better than plain parquet_fdw. I had a complex query in the PG dialect, generated with dbt, that used joins, CTEs and window functions. I don't remember the exact timings, but it was even slower on duckdb_fdw than with cstore_fdw, whereas I could take the same query and run it on DuckDB verbatim, only replacing the foreign table name with the Parquet filename.

This comment is already getting too long, but FDWs can indeed participate in partitions! There's a blog post [6] that I keep meaning to implement, where a "coordinator" PG instance has a partitioned table in which each partition is a postgres_fdw foreign table that proxies to a "data" PG instance. The "coordinator" node doesn't store any data and only gathers execution results from the "data" nodes. In the article, the "data" nodes store plain old PG tables, but I don't think there's anything preventing them from being parquet_fdw/cstore_fdw tables instead (sketch below).
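For the writes part, a minimal sketch of what that looks like (table and column names here are made up; the server/options syntax is from the cstore_fdw README):

    -- one-time setup
    CREATE EXTENSION cstore_fdw;
    CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

    -- hypothetical columnar table
    CREATE FOREIGN TABLE events_columnar (
        event_time timestamptz,
        user_id    bigint,
        amount     numeric
    ) SERVER cstore_server OPTIONS (compression 'pglz');

    -- writes go straight into the columnar store
    -- (cstore_fdw supports COPY and INSERT ... SELECT, but not single-row INSERTs)
    INSERT INTO events_columnar
    SELECT event_time, user_id, amount FROM events;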
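For reference, the kind of basic aggregation I mean looks something like this in DuckDB (again, hypothetical column names; DuckDB can scan a Parquet file directly in the FROM clause):

    -- run in the DuckDB CLI
    SELECT user_id, count(*), sum(amount)
    FROM 'events.parquet'
    GROUP BY user_id;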
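And a minimal sketch of that coordinator setup, assuming PG 11+ (hostnames, credentials and table names are all hypothetical):

    -- on the "coordinator" node
    CREATE EXTENSION postgres_fdw;

    CREATE SERVER data_node_1 FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'data1.internal', dbname 'app');
    CREATE USER MAPPING FOR CURRENT_USER SERVER data_node_1
        OPTIONS (user 'app', password 'secret');

    -- partitioned table that stores no data locally
    CREATE TABLE measurements (
        ts    timestamptz NOT NULL,
        value double precision
    ) PARTITION BY RANGE (ts);

    -- each partition is a foreign table proxying to a "data" node
    CREATE FOREIGN TABLE measurements_2021
        PARTITION OF measurements
        FOR VALUES FROM ('2021-01-01') TO ('2022-01-01')
        SERVER data_node_1 OPTIONS (table_name 'measurements_2021');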
[0] https://github.com/citusdata/cstore_fdw

[1] https://github.com/pacman82/odbc2parquet

[2] https://github.com/citusdata/postgres_vectorization_test

[3] https://www.splitgraph.com/blog/postgresql-fdw-aggregation-pushdown-multicorn-part-1

[4] https://www.splitgraph.com/trase/supply-chains

[5] https://github.com/alitrack/duckdb_fdw

[6] https://swarm64.com/post/scaling-elastic-postgres-cluster/