Velox: An open-source unified execution engine

151 points by polyrand over 2 years ago

16 comments

marginalia_nu over 2 years ago
* [...] unified execution engine
* accelerating data management systems
* [...] streamlining their development
* [...] consolidate and unify data management systems

Can someone translate this to English? I can see and recognize the individual meanings of the words, but I don't understand what they're trying to say.
tinco over 2 years ago
"In common usage scenarios, Velox takes a fully optimized query plan as input and performs the described computation. Considering Velox does not provide a SQL parser, a dataframe layer, or a query optimizer, it is usually not meant to be used directly by end-users; rather, it is mostly used by developers integrating and optimizing their compute engines."

So the way you use it is that you describe some computation over your data as a query plan, and you implement a dataframe layer so Velox knows how to retrieve data from your database, and then Velox will efficiently execute the query plan? But it doesn't even optimize, so the problem it solves is that systems like Spark and Presto don't efficiently execute optimized queries?

This world is very far removed from me. Does anyone have a concrete example of how Velox might help them? Why is Velox better than both the Presto worker and the Spark engine? Aren't those core components of the system?
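To make the quoted description concrete, here is a minimal sketch of what "takes a fully optimized query plan as input and performs the described computation" can mean. It is a toy in Python, not Velox's C++ API: the node types `Scan`, `Filter`, and `Project` and the `execute` function are hypothetical names chosen for illustration, and a real engine would process columnar batches rather than one row at a time.

```python
from dataclasses import dataclass
from typing import Callable, Iterator

# Hypothetical plan nodes for illustration only; these are not Velox classes.
@dataclass
class Scan:
    rows: list                      # in-memory stand-in for a table or file reader

@dataclass
class Filter:
    child: object
    predicate: Callable[[dict], bool]

@dataclass
class Project:
    child: object
    columns: list

def execute(node) -> Iterator[dict]:
    """Run the plan exactly as given: no parsing, no optimization."""
    if isinstance(node, Scan):
        yield from node.rows
    elif isinstance(node, Filter):
        for row in execute(node.child):
            if node.predicate(row):
                yield row
    elif isinstance(node, Project):
        for row in execute(node.child):
            yield {c: row[c] for c in node.columns}
    else:
        raise TypeError(f"unknown plan node: {node!r}")

# The caller (a SQL front end, a dataframe library, ...) builds the plan.
stories = [
    {"title": "Velox", "points": 151},
    {"title": "Other", "points": 12},
]
plan = Project(Filter(Scan(stories), lambda r: r["points"] > 100), ["title"])
result = list(execute(plan))
```

In this picture the parser and optimizer live in whatever system produced `plan`; the engine's only job is to run it quickly, which is the part the quoted passage says Velox covers.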
rajko_rad over 2 years ago
Anyone know how this compares to Photon by Databricks? That’s probably the benchmark + arch comp I’d like to see…

https://www.databricks.com/product/photon
_gabe_ over 2 years ago
> Ultimately, this fragmentation results in systems with different feature sets and inconsistent semantics — reducing the productivity of data users that need to interact with multiple engines to finish tasks.

> In order to address these challenges and to create a stronger, more efficient data infrastructure for our own products and the world, Meta has created and open sourced Velox.

Maybe I'm missing something here, but it sounds like a lot of separate services got created that solve the same or similar problem in slightly different ways. These services became hard to use because they were fragmented. So the solution is to keep all the services and build a complex service as a middle man?

Why not unify the good parts of all the services into one central service? Then deprecate and transition off all the old fragmented ones? I understand that it's really hard to coordinate all of this and properly transition, but isn't the alternative of having to maintain many slightly different services (and now a complex middle man) more detrimental long term?
politician over 2 years ago
> Velox leverages numerous runtime optimizations, such as filter and conjunct reordering, key normalization for array and hash-based aggregations and joins, dynamic filter pushdown, and adaptive column prefetching.

That's a strong set of capabilities. I'm excited to see where this goes -- this could catalyze a Cambrian explosion of data systems that offload execution to Velox.
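As an illustration of one item in that list, the sketch below shows the general idea behind runtime conjunct reordering: when a filter is an AND of several predicates, evaluate the one that rejects the most rows first. This is a simplified Python toy, not how Velox implements it. The class `AdaptiveConjunction` and the `reorder_every` parameter are hypothetical, and the pass rates it records are only measured on rows that reached each predicate.

```python
class AdaptiveConjunction:
    """AND of predicates that periodically re-sorts itself by observed pass rate."""

    def __init__(self, predicates, reorder_every=1000):
        self.stats = [{"pred": p, "seen": 0, "passed": 0} for p in predicates]
        self.reorder_every = reorder_every   # hypothetical tuning knob
        self.rows = 0

    def matches(self, row):
        self.rows += 1
        result = True
        for s in self.stats:
            s["seen"] += 1
            if s["pred"](row):
                s["passed"] += 1
            else:
                result = False
                break                        # short-circuit: later predicates are skipped
        if self.rows % self.reorder_every == 0:
            # Lowest pass rate (most selective) first; unseen predicates go last.
            self.stats.sort(
                key=lambda s: s["passed"] / s["seen"] if s["seen"] else 1.0
            )
        return result


def is_even(x):
    return x % 2 == 0          # passes about half of the rows

def is_multiple_of_100(x):
    return x % 100 == 0        # passes about 1% of the rows

conj = AdaptiveConjunction([is_even, is_multiple_of_100], reorder_every=1000)
matched = [x for x in range(2000) if conj.matches(x)]
first_predicate = conj.stats[0]["pred"]
```

Because the predicates are pure, reordering never changes which rows match; it only changes how much work is spent rejecting the rest. Here the rarely passing check ends up being evaluated first after the first 1000 rows.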
polskibus over 2 years ago
I see this as a continued effort of middleware being rewritten in C++, Rust and Go to replace Java - seems like common wisdom "Java can be as fast as C" has finally been abolished as this situation progresses (Kubernetes and other newer cloud middleware written in Go instead of Java, etc.)
debarshri over 2 years ago
It sounds very similar to Apache Beam. You can actually create runners for various data management systems [1]

[1] https://beam.apache.org/documentation/runners/
lioeters over 2 years ago
https://github.com/facebookincubator/velox
mborch over 2 years ago
Interesting to see how Databricks reacts to this given that they have their own Project Lightspeed (replacing the Spark execution engine).
whoevercares over 2 years ago
Is this similar to Arrow DataFusion but in C++? Tbh I think every hot new dataframe or analytics db has such components. The basic idea is not too different from the textbook at first glance.
whoevercares over 2 years ago
This seems analogous to LLVM; it looks like we could (finally) build various front ends for analytics on top of this?
liminal over 2 years ago
So this is an Apache Arrow database engine integrated into other databases? My main takeaway is that it's great to see more projects standardizing on Arrow and pushing it further down the stack.
gigatexal over 2 years ago
Cool that it’s being integrated into Presto.
YetAnotherNick over 2 years ago
Is it the same as YARN?
coding123 over 2 years ago
It's Airflow with more specifics around transportable data structures, instead of junky XComs?
nspattak over 2 years ago
Could this be a name conflict with this: https://www.thermofisher.com/order/catalog/product/VELOX ?