Velox: Meta's Unified Execution Engine [pdf]

99 points by luu about 1 year ago

7 comments

jauntywundrkind about 1 year ago
Substrait seems like the biggest/most-used competitor-ish project out there. I'd love some compare & contrast; my sense is that Substrait has a smaller ambition and wants to be a language for talking about execution rather than a full-on optimization/execution engine. https://github.com/substrait-io/substrait

(Edit: ah, there's a recent talk discussing PyVelox trying to get Substrait integration: https://www.youtube.com/watch?v=l_kHxkGkNRg#t=18m22s However, there's also discussion about the unmaintained state of some of the current Substrait work here; status unclear: https://github.com/facebookincubator/velox/issues/8895)

We can also see from the Apache Arrow DataFusion discussion that they too see themselves as a bit of a Velox competitor. https://github.com/apache/arrow-datafusion/discussions/6441

It's cool to see this space mature. I like that even Velox sees that Apache Arrow (which also underlies Apache Arrow DataFusion) is industry-standard tech that they ought to work with. https://engineering.fb.com/2024/02/20/developer-tools/velox-apache-arrow-15-composable-data-management/

There's a solid Influx post that talks about how they are composing the assorted technologies to build their next-gen 3.0, which I find helpful for getting a sense of how all the pieces of a modern high-performance data engine slot together. https://www.influxdata.com/blog/flight-datafusion-arrow-parquet-fdap-architecture-influxdb/
sakras about 1 year ago
My general take is that while the idea of composability is good, the implementations of these things are just frankly not of high quality. Velox/Acero in particular are both plagued by what I've come to call "Java syndrome", where everything is written as idiomatic Java but with C++ syntax. Virtual methods, std::shared_ptr galore (in lieu of garbage collection), random heap allocations, etc. As a result these systems tend to be bloated and significantly slower than they need to be.

DuckDB is good though, and I predict its quality of implementation will keep "monolithic databases" relevant for a while longer.
redskyluan about 1 year ago
Velox could be a competitor to DataFusion. It is more focused on being an execution engine and could be great to integrate into other high-performance databases.

Databases will be split into pieces and rebuilt!
sgt101 about 1 year ago
I wonder how many of these FAANG projects really get used where they are built. I went for an interview at a FAANG years ago to work on a very big consumer product (when it was in relative infancy) and expected to find a hyper-tech data backend to use... they told me that they were using MySQL.

I didn't get the job, so maybe they were just joking around with me, but the general despair that they evinced about their data situation makes me wonder!
pvg about 1 year ago
A thread from late 2022: https://news.ycombinator.com/item?id=32673873
HermitX about 1 year ago
To the best of my knowledge, Meta has significantly reduced its investment in the Velox project. Apart from Meta, I'm not aware of any other major company that really uses Velox in a production environment. Frankly speaking, Velox may have already missed the window of opportunity for rapid development. If you're looking for a vectorized execution engine, you could consider ClickHouse (www.clickhouse.com) or StarRocks (www.starrocks.io). If your data analysis scenarios require more multi-table join operations, StarRocks is clearly a better choice.
zX41ZdbW about 1 year ago
Many ideas look like they were influenced by ClickHouse, and some are direct copies. I'm surprised they didn't provide references to ClickHouse, where the implementations are proven in production in the first place.