A long way to go:<p><a href="https://www.monetdb.org/content/citusdb-postgresql-column-store-vs-monetdb-tpc-h-shootout" rel="nofollow">https://www.monetdb.org/content/citusdb-postgresql-column-st...</a>
If you are not familiar with Foreign Data Wrappers, they allow you to connect to other datastores and represent that data as tables in your database. <a href="http://wiki.postgresql.org/wiki/Foreign_data_wrappers" rel="nofollow">http://wiki.postgresql.org/wiki/Foreign_data_wrappers</a>
Does this support JOINs? Or do you use a giant WHERE IN () clause?<p>My use case is essentially a cross-database JOIN that I've been using MySQL and temp tables to accomplish. For example, give me the sum of column x if column y is any one of these 50,000 values from a separate system. So I load the 50,000 values into a temp table and then do a JOIN. Performance isn't that great and it uses a ton of disk space, so I wanted to try using a columnar store.
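To make the temp-table pattern above concrete, here is a minimal sketch. It uses Python's built-in sqlite3 purely as a stand-in engine, not MySQL or cstore_fdw, and the names `facts`, `wanted`, and `sum_x_for_keys` are hypothetical. It only shows the shape of the query, not how any particular column store would plan it.

```python
import sqlite3


def sum_x_for_keys(conn, keys):
    """Sum column x over rows whose y is in `keys`, using a temp table JOIN."""
    cur = conn.cursor()
    # Load the externally supplied keys into a temp table (hypothetical name).
    cur.execute("CREATE TEMP TABLE wanted (y INTEGER PRIMARY KEY)")
    cur.executemany(
        "INSERT OR IGNORE INTO wanted (y) VALUES (?)",
        ((k,) for k in keys),
    )
    # JOIN against the fact table instead of building a giant WHERE IN (...).
    cur.execute(
        "SELECT COALESCE(SUM(f.x), 0) "
        "FROM facts AS f JOIN wanted AS w ON w.y = f.y"
    )
    total = cur.fetchone()[0]
    cur.execute("DROP TABLE wanted")
    return total


# Toy data standing in for the real table: 10,000 rows, y cycles over 0..999.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [(i, i % 1000) for i in range(10000)],
)

# Keys that would come from the separate system (here: the even values).
total = sum_x_for_keys(conn, range(0, 1000, 2))
```

Only columns x and y are touched by the query, which is the access pattern a columnar layout is meant to help with. Whether the JOIN itself is supported is the open question above.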
I'm very excited about this! Add a mechanism to distribute data and queries across a cluster, and this could be the makings of an open-source Amazon Redshift.
It would be interesting to compare these benchmarks against the performance of Amazon's Redshift.<p>My first question would be whether the benchmark can be run on Redshift without changes. Redshift has some interesting differences beyond being a columnar database that speaks the PostgreSQL protocol. But if it's possible, I'd be very interested to see the results.
Do the benchmarks for Postgres use the in-memory columnar store (IMCS)? What is the difference between Postgres IMCS and Citus cstore_fdw? <a href="http://www.postgresql.org/message-id/52C59858.9090500@garret.ru" rel="nofollow">http://www.postgresql.org/message-id/52C59858.9090500@garret...</a>
Isn't the assumed tradeoff SSD storage for CPU usage? How much more CPU time is spent compressing and decompressing? And what's the unit cost of that extra CPU compared with the disk space savings on 'expensive' SSDs?
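One rough way to put numbers on that tradeoff is to time compression and decompression on a sample of your own column data. The sketch below uses Python's zlib only as a stand-in codec (it is an assumption, not a statement about what cstore_fdw uses), and the `measure` helper and the sample column are hypothetical.

```python
import time
import zlib


def measure(data: bytes, level: int = 6) -> dict:
    """Return compression ratio and wall-clock seconds for one round trip."""
    t0 = time.perf_counter()
    packed = zlib.compress(data, level)
    t1 = time.perf_counter()
    unpacked = zlib.decompress(packed)
    t2 = time.perf_counter()
    # Sanity check: the round trip must be lossless.
    assert unpacked == data
    return {
        "ratio": len(data) / len(packed),
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


# Hypothetical low-cardinality column: such data tends to compress well.
column = b"".join(b"status=%d\n" % (i % 5) for i in range(200000))
stats = measure(column)
print(stats)
```

Multiplying the measured seconds by your CPU price and the saved bytes by your SSD price gives a first-order comparison. Real results depend heavily on the data and the codec.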
I couldn't find documentation about what subset of SQL you can use. I saw mention of "all supported Postgres data types", but not anything about what features work. Any links?