Arrow is <i>the most important</i> thing happening in the data ecosystem right now. It's going to allow you to run your choice of execution engine on top of your choice of data store, as though they were designed to work together. It will mostly be invisible to users; the key thing that needs to happen is that all the producers and consumers of batch data adopt Arrow as the common interchange format.<p>BigQuery recently implemented the Storage API, which allows you to read BQ tables in parallel, in Arrow format: <a href="https://cloud.google.com/bigquery/docs/reference/storage" rel="nofollow">https://cloud.google.com/bigquery/docs/reference/storage</a><p>Snowflake has adopted Arrow as the in-memory format for their JDBC driver, though to my knowledge there is still no way to access data in <i>parallel</i> from Snowflake other than to export to S3.<p>As Arrow spreads across the ecosystem, users are going to start discovering that they can store data in one system and query it in another, at full speed, and it's going to be amazing.