I've never really dug into column-oriented storage, so I had a quick skim... Would the excerpts/examples below be a fair summary of the pros/cons of the general idea?

> Column-oriented organizations are more efficient when an aggregate needs to be computed over many rows but only for a notably smaller subset of all columns of data, because reading that smaller subset of data can be faster than reading all data.

Example: SELECT sum(a) FROM things;

> Column-oriented organizations are more efficient when new values of a column are supplied for all rows at once, because that column data can be written efficiently and replace old column data without touching any other columns for the rows.

Example: UPDATE things SET a = a+1;

> Row-oriented organizations are more efficient when many columns of a single row are required at the same time, and when row-size is relatively small, as the entire row can be retrieved with a single disk seek.

Example: SELECT * FROM things;

> Row-oriented organizations are more efficient when writing a new row if all of the row data is supplied at the same time, as the entire row can be written with a single disk seek.

Example: INSERT INTO things (a,b,c,d,e,f,g) VALUES (1,2,3,4,5,6,7);
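For concreteness, here are those four examples as one runnable script; the table definition is an assumption, since the excerpts only name columns a through g:

    -- Assumed table definition; the excerpts only name columns a..g.
    CREATE TABLE things (
        a int, b int, c int, d int, e int, f int, g int
    );

    -- Favors column stores: only column a needs to be read,
    -- roughly 1/7th of the table's data.
    SELECT sum(a) FROM things;

    -- Favors column stores: rewrites column a for every row
    -- without touching columns b..g.
    UPDATE things SET a = a + 1;

    -- Favors row stores: every column of each row is needed,
    -- and each row is stored contiguously.
    SELECT * FROM things;

    -- Favors row stores: the whole row arrives at once and can
    -- be written contiguously.
    INSERT INTO things (a,b,c,d,e,f,g) VALUES (1,2,3,4,5,6,7);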
Good work, and I'm impressed that the PostgreSQL foreign data wrapper API is powerful enough to allow for such ease of implementation.

The .proto defined for cstore_fdw differs from the ORC format as defined for Hive. At a quick glance I can't find references to dictionary encoding or statistics, and the data types used are apparently the native PostgreSQL data types. From what I can tell, this implementation leverages reduced IO (fetch only the columns of interest from disk), segment elimination (use min/max info to skip over entire row groups), and pglz compression for the data stream. I couldn't find references to run-length encoding (RLE) or dictionary encoding. I'm sure these shortcomings will be addressed in future iterations, especially better encoding schemes, which would result in better compression.

But I'm a bit disappointed that the ORC format used is not the same as the one originally used in Hive, Pig, and the rest of the Java/Hadoop ecosystem. Had it shared the actual file format, it would have enabled a number of very interesting scenarios. Think about Hive/Pig serving as ETL pipelines that produce ORC files which are attached directly to PostgreSQL (add today's partition as FOREIGN DATA WRAPPER ... OPTIONS (filename '.../4_4_2014.cstore'); see the sketch after the links below) and then do interactive analytics driven by PostgreSQL. It would reduce the cost of ingress significantly (fast attach, no need for expensive INSERT and re-compression). I realize data type translation would have been thorny, to say the least (datetime and decimal as primitives, probably many of the structured types too).

- github.com/citusdata/cstore_fdw/blob/master/cstore.proto
- github.com/apache/hive/blob/trunk/ql/src/protobuf/org/apache/hadoop/hive/ql/io/orc/orc_proto.proto
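For reference, the attach step imagined above might look like the sketch below. The table name, columns, and file path are hypothetical; the filename and compression options are the ones cstore_fdw documents. And of course, given the format mismatch, today the file would have to be one written by cstore_fdw itself rather than by Hive:

    -- Hypothetical sketch: attach a pre-built cstore partition
    -- as a foreign table.
    CREATE EXTENSION cstore_fdw;
    CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

    CREATE FOREIGN TABLE events_2014_04_04 (
        event_time timestamp,
        user_id    bigint,
        payload    text
    )
    SERVER cstore_server
    OPTIONS (filename '/data/partitions/4_4_2014.cstore',
             compression 'pglz');

    -- Interactive analytics directly over the attached partition.
    SELECT user_id, count(*)
    FROM events_2014_04_04
    GROUP BY user_id;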
I think the emphasis should be on the fact that Postgres now has a free and open source columnar store. For many years (probably decades) there have been companies that have developed "big data"/analytics systems with Postgres as the base, but have not contributed back to the Postgres ecosystem.

While this new columnar store is not as speedy as ones that have been around since the early 2000s, it does give Citus Data and other companies a platform to build on and share solutions.

Having a DB that can hold both the transactional data and the data for fast analytical purposes is very advantageous, as you have fewer moving parts and much less ETL work.

What I am looking forward to now is a few startups similar to Citus Data that solve different "Big Data" problems, and for them to work together to disrupt the multi-billion-dollar data warehouse/analytics vendors.
This is potentially very interesting; it's the only open source SQL columnar store I'm aware of. Bonus points for being part of Postgres.

Basically ParAccel (as used by Amazon Redshift) at reasonable cost.
Any particular reason why ORC was chosen as the columnar store format over Parquet (https://github.com/Parquet/parquet-format)? I ask because Parquet seems to have its own development cycle and roadmap, and is pretty continuously updated with enhancements.
I'd be interested to see more benchmarks. The improvements in this post are nowhere close to what we've seen going from PG to Infobright for our reporting queries: we get speedups of 10x to 1000x, while the one speed benchmark they have here is only 2x.
How is this different from the many other columnar SQL databases and extensions?

Columnar querying is typical for OLAP, while the PostgreSQL engine is aimed at OLTP. This doesn't look like a good idea; it's like adding side floats to a car and paddles to use it like a boat.

This goes against using the right tool for the right job.
This is really exciting! Columnar storage is something that the big boys like Microsoft and Oracle charge an arm and a leg for. You can already get column-based open-source SQL databases, but this new FDW allows you to mix workloads on the same machine.
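Roughly what mixing workloads could look like; all names here are hypothetical, and the cstore server setup is assumed as in the sketch further up the thread:

    -- OLTP rows live in a regular heap table.
    CREATE TABLE customers (id bigint PRIMARY KEY, name text);

    -- Analytics data lives in a cstore_fdw columnar foreign table.
    CREATE FOREIGN TABLE sales_cstore (
        customer_id bigint,
        amount      numeric
    )
    SERVER cstore_server
    OPTIONS (filename '/data/sales.cstore');

    -- One query can join the row store with the column store.
    SELECT c.name, sum(s.amount) AS total
    FROM sales_cstore s
    JOIN customers c ON c.id = s.customer_id
    GROUP BY c.name;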
I am curious about how this thing compares to something like Amazon Redshift.

Briefly skimming, it looks pretty similar, except for the data compression part, which uses RCFile. It also supports more data types. If this gets adopted by Redshift or something else, I will be thrilled.
So with Cassandra you have a pretty nice, scalable, columnar DB with a SQL-like interface[0]. Not to mention, it's free and Apache-licensed, so you can distribute it as part of your own software. I guess I've only looked at Cassandra from the perspective of a developer. Would a DBA prefer using a columnar version of PostgreSQL over using Cassandra for free?

0: http://cassandra.apache.org/doc/cql/CQL.html

Edit: I didn't realize Citus Data was making the columnar Postgres offering open source... that's great!
I have written a short blog post on how CitusDB's column store compares with purpose-built systems such as MonetDB. Repeatability ftw. Disclosure: I am part of the MonetDB development team.

CitusDB vs. MonetDB TPC-H Shootout: http://www.monetdb.org/content/citusdb-postgresql-column-store-vs-monetdb-tpc-h-shootout