科技回声

Oops... The first sentence in the "Fast" section says "SIMD (Single input multiple data)".

Asking the stupid question here, but why create a new Apache project for this?
Apache Arrow seems to be targeting the use of SIMD, which is a very JVM/runtime-dependent feature. If the runtime can't detect this out of the box, then create a recognized method or some sort of intrinsic to coax the runtime to SIMD-ize the operation.
I understand the performance gains of this, but why not add this functionality to existing projects like Parquet or HTable?
This just comes to mind: https://xkcd.com/927/

Is this similar to how QlikView's in-memory engine works?

It really is a confusing title for the project. It's more of a high-speed interchange format, e.g. for sending data to Cassandra from Spark or Storm.
Nothing that end users will ever really have to know anything about.

I'm confused, is this just Structure of Arrays as a service for columnar data? It's not clear to me what this actually does.

Apache Arrow – Powering Columnar In-Memory Analytics

5 comments