Comparison of Data Lake Table Formats (Iceberg, Hudi and Delta Lake)

96 points by anhldbk, almost 3 years ago

11 comments

henrydark, almost 3 years ago
A major problem with these table formats that will surface soon enough is that they use serial numerical ordering for versions.

It's like inventing SVN for data. Soon enough git will have to be invented as well.
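
To make the SVN-versus-git analogy concrete, here is a minimal toy sketch in Python. It is not how Iceberg, Hudi, or Delta Lake implement versioning; the `LinearLog` and `HashedHistory` classes and the file names are hypothetical, chosen only to contrast consecutive integer versions with commits named by a hash of their parent and content.

```python
import hashlib
import json


class LinearLog:
    """Toy SVN-style history: versions are consecutive integers (hypothetical)."""

    def __init__(self):
        self.versions = []  # list index == version number

    def commit(self, expected_version, files):
        # A writer must claim the next number; a writer working from a
        # stale view of the table loses the race and has to retry.
        if expected_version != len(self.versions):
            raise RuntimeError(f"version {expected_version} is taken or out of order")
        self.versions.append(sorted(files))
        return expected_version


class HashedHistory:
    """Toy git-style history: a commit id is a hash of parent + content (hypothetical)."""

    def __init__(self):
        self.commits = {}  # commit id -> (parent id, file list)

    def commit(self, parent, files):
        payload = json.dumps({"parent": parent, "files": sorted(files)})
        commit_id = hashlib.sha256(payload.encode()).hexdigest()[:12]
        self.commits[commit_id] = (parent, sorted(files))
        return commit_id


# Two writers both start from the empty table and try to write version 0.
linear = LinearLog()
linear.commit(0, ["a.parquet"])
try:
    linear.commit(0, ["b.parquet"])
    linear_conflict = False
except RuntimeError:
    linear_conflict = True

# With hashed ids, two commits can share a parent and coexist as branches.
hashed = HashedHistory()
root = hashed.commit(None, ["a.parquet"])
branch_1 = hashed.commit(root, ["a.parquet", "b.parquet"])
branch_2 = hashed.commit(root, ["a.parquet", "c.parquet"])
```

In the linear log the second writer is rejected, while the hashed history keeps both children of `root`. Merging those branches back together is the hard part, and this sketch does not attempt it.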
evilturnip, almost 3 years ago
We're currently looking into data lake implementations. Right now, we only have one or two data sources. Our current thinking is to read them on the fly, combine them in a pandas DataFrame, and query that. Does anyone have experience doing something similar?
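
A minimal sketch of that approach, assuming two small sources that fit in memory. The in-memory CSV and JSON payloads, the column names, and the `customer_id` join key are hypothetical stand-ins for the real sources, which the comment does not describe.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two sources; in practice these would be
# file paths, URLs, or database reads.
orders_csv = io.StringIO(
    "order_id,customer_id,amount\n"
    "1,10,25.0\n"
    "2,11,40.0\n"
    "3,10,15.5\n"
)
customers_json = io.StringIO(
    '[{"customer_id": 10, "name": "Ada"}, {"customer_id": 11, "name": "Lin"}]'
)

# Read each source on the fly.
orders = pd.read_csv(orders_csv)
customers = pd.read_json(customers_json)

# Combine them into one DataFrame on the shared key.
combined = orders.merge(customers, on="customer_id", how="left")

# Query the combined frame.
big_orders = combined.query("amount > 20")
totals = combined.groupby("name", as_index=False)["amount"].sum()
```

This works while everything fits in memory and the sources are cheap to re-read. It keeps no history and offers no protection against a source changing mid-read, which are the gaps the table formats in the article aim to close.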
anonymousDan, almost 3 years ago
How does the concept of a table here differ from that of a standard relational table (if at all)? Is it that the table is a logical abstraction over a distributed set of files?
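
One way to picture the "logical abstraction over a set of files" reading is the toy sketch below. It is an illustration under stated assumptions, not any format's actual layout: `ToyTable`, its CSV data files, and the `part-N.csv` names are all hypothetical. The point is only that the "table" is metadata (a schema plus a list of files) and readers scan through that metadata rather than through a storage engine.

```python
import csv
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ToyTable:
    """A 'table' that is only metadata: column names plus its data files."""

    columns: list
    data_files: list = field(default_factory=list)

    def append_file(self, path, rows):
        # Write a new data file, then register it in the table metadata.
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(rows)
        self.data_files.append(Path(path))

    def scan(self):
        # Readers see one logical table regardless of how many files back it.
        for path in self.data_files:
            with open(path, newline="") as f:
                for row in csv.reader(f):
                    yield dict(zip(self.columns, row))


with tempfile.TemporaryDirectory() as tmp:
    table = ToyTable(columns=["id", "city"])
    table.append_file(Path(tmp) / "part-0.csv", [[1, "Oslo"], [2, "Lima"]])
    table.append_file(Path(tmp) / "part-1.csv", [[3, "Pune"]])
    rows = list(table.scan())
```

Unlike a relational table, nothing here enforces types, keys, or transactions; values come back as strings and any guarantees would have to live in the metadata layer.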
ttunguz, almost 3 years ago
Does anyone have experience running any of these three in production?
divbzero, almost 3 years ago
Does anyone have good real-life stories of how data from a data lake made a real difference in a product or a business?
venki80, almost 3 years ago
Wondering if this is basically what all data lakes will look like in the future. All data stored in these table formats…
pid-1, almost 3 years ago
The repo comparison was really cool. I guess that could be made into a product.
diptnt, almost 3 years ago
Thanks for bringing this comparison out!
hrosen, almost 3 years ago
Helpful to see a concise comparison!
ajantha, almost 3 years ago
Nicely summarised and visualised :+1:
broberts2261, almost 3 years ago
Great comparison!