TechEcho

16 comments

marginalia_nu over 2 years ago

* [...] unified execution engine
* accelerating data management systems
* [...] streamlining their development
* [...] consolidate and unify data management systems

Can someone translate this to English? I can see and recognize the individual meanings of the words, but I don't understand what they're trying to say.

tinco over 2 years ago

"In common usage scenarios, Velox takes a fully optimized query plan as input and performs the described computation. Considering Velox does not provide a SQL parser, a dataframe layer, or a query optimizer, it is usually not meant to be used directly by end-users; rather, it is mostly used by developers integrating and optimizing their compute engines."

So the way you use it is that you describe some computation over your data as a query plan, and you implement a dataframe layer so Velox knows how to retrieve data from your database, and then Velox will efficiently execute the query plan? But it doesn't even optimize, so the problem it solves is that these systems like Spark and Presto don't efficiently execute optimized queries?

This world is very far removed from me. Does anyone have a concrete example of how Velox might help them? Why is Velox better than both the Presto worker and the Spark engine? Aren't those core components of the system?
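For anyone else trying to picture the split in the quoted passage, here is a toy sketch of "take an already-optimized plan and just run it". This is not Velox's API: the node classes (`Scan`, `Filter`, `Project`) and the `execute` function are hypothetical names made up for illustration, and a real engine works on columnar batches rather than one row at a time.

```python
from dataclasses import dataclass
from typing import Callable, Iterator

Row = dict


@dataclass
class Scan:
    rows: list  # stand-in for a connector that reads from storage


@dataclass
class Filter:
    source: object
    predicate: Callable[[Row], bool]


@dataclass
class Project:
    source: object
    columns: list


def execute(node) -> Iterator[Row]:
    """Walk a plan tree that something else already optimized and yield rows."""
    if isinstance(node, Scan):
        yield from node.rows
    elif isinstance(node, Filter):
        for row in execute(node.source):
            if node.predicate(row):
                yield row
    elif isinstance(node, Project):
        for row in execute(node.source):
            yield {c: row[c] for c in node.columns}
    else:
        raise TypeError(f"unknown plan node: {node!r}")


# A front end (SQL parser, dataframe API, ...) plus an optimizer would
# normally build this tree; here it is written out by hand.
plan = Project(
    Filter(Scan([{"id": 1, "x": 5}, {"id": 2, "x": 50}]), lambda r: r["x"] > 10),
    ["id"],
)
result = list(execute(plan))
```

In this picture the parser and optimizer live in the host system, and only the `execute` half is what an execution library would provide.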

rajko_rad over 2 years ago

Anyone know how this compares to Photon by Databricks? That’s probably the benchmark + arch comp I’d like to see…

https://www.databricks.com/product/photon

_gabe_ over 2 years ago

> Ultimately, this fragmentation results in systems with different feature sets and inconsistent semantics — reducing the productivity of data users that need to interact with multiple engines to finish tasks.

> In order to address these challenges and to create a stronger, more efficient data infrastructure for our own products and the world, Meta has created and open sourced Velox.

Maybe I'm missing something here, but it sounds like a lot of separate services got created that solve the same or similar problem in slightly different ways. These services became hard to use because they were fragmented. So the solution is to keep all the services and build a complex service as a middle man?

Why not unify the good parts of all the services into one central service? Then deprecate and transition off all the old fragmented ones? I understand that it's really hard to coordinate all of this and properly transition, but isn't the alternative of having to maintain many slightly different services (and now a complex middle man) more detrimental long term?

politician over 2 years ago

> Velox leverages numerous runtime optimizations, such as filter and conjunct reordering, key normalization for array and hash-based aggregations and joins, dynamic filter pushdown, and adaptive column prefetching.

That's a strong set of capabilities. I'm excited to see where this goes -- this could catalyze a Cambrian explosion of data systems that offload execution to Velox.
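To make one of the terms in that quote concrete, here is a minimal sketch of conjunct reordering at runtime: the AND-ed filters are periodically re-sorted so the one that rejects the most rows runs first. This is a generic illustration and not how Velox implements it. The class `AdaptiveConjunction` and its methods are hypothetical, and it assumes side-effect-free predicates and ranks by pass rate only, whereas a real engine would likely weigh evaluation cost as well.

```python
class AdaptiveConjunction:
    """Evaluate AND-ed predicates, periodically moving the most selective first."""

    def __init__(self, predicates):
        # One entry per predicate: [function, rows_seen, rows_passed]
        self.stats = [[p, 0, 0] for p in predicates]

    def matches(self, row):
        for entry in self.stats:
            entry[1] += 1
            if not entry[0](row):
                return False  # short-circuit: later predicates are skipped
            entry[2] += 1
        return True

    def reorder(self):
        # Lowest observed pass rate first, so most rows are rejected early.
        self.stats.sort(key=lambda e: e[2] / e[1] if e[1] else 1.0)

    def filter(self, rows, batch_size=1024):
        kept = []
        for i, row in enumerate(rows, 1):
            if self.matches(row):
                kept.append(row)
            if i % batch_size == 0:
                self.reorder()
        return kept


# Written with the cheap-to-pass filter first; after the first batch the
# much more selective "multiple of 100" filter moves to the front.
conj = AdaptiveConjunction([lambda r: r >= 0, lambda r: r % 100 == 0])
kept = conj.filter(range(1000), batch_size=100)
```

Because AND is commutative for pure predicates, the reordering changes how much work is done per row but not which rows are kept.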

polskibus over 2 years ago

I see this as part of a continued effort to rewrite middleware in C++, Rust and Go to replace Java. It seems the common wisdom that "Java can be as fast as C" has finally been abandoned as this progresses (Kubernetes and other newer cloud middleware are written in Go instead of Java, etc.)

debarshri over 2 years ago

It sounds very similar to Apache Beam. You can actually create runners for various data management systems [1]

[1] https://beam.apache.org/documentation/runners/

lioeters over 2 years ago

https://github.com/facebookincubator/velox

mborch over 2 years ago

Interesting to see how Databricks reacts to this given that they have their own Project Lightspeed (replacing the Spark execution engine).

whoevercares over 2 years ago

Is this similar to Arrow DataFusion but in C++? Tbh I think every hot new dataframe or analytics db has such components. The basic idea is not too different from the textbook at first glance.

whoevercares over 2 years ago

This seems analogous to LLVM. Looks like we could (finally) build various front ends for analytics on top of this?

liminal over 2 years ago

So this is an Apache Arrow database engine integrated into other databases? My main takeaway is that it's great to see more projects standardizing on Arrow and pushing it further down the stack.

gigatexal over 2 years ago

Cool that it’s being integrated into Presto.

YetAnotherNick over 2 years ago

Is it the same as YARN?

coding123 over 2 years ago

So it's Airflow with more specifics around transportable data structures, instead of junky XComs?

nspattak over 2 years ago

Could this be a name conflict with this: https://www.thermofisher.com/order/catalog/product/VELOX ?


Velox: An open-source unified execution engine
