Using Parquet's Bloom Filters

53 points by pauldix 12 months ago

2 comments

appplication 12 months ago
One thing I have wondered: would it make sense to reduce file size? The general advice I've seen is to keep files between roughly 250 MB and 1 GB, but if you're leaning heavily on bloom filters, it feels like it could make sense to reduce the number of files to reduce the amount that would trigger the per-file filter.
darkflame91 12 months ago
With large datasets, wouldn't partitioning the data on low cardinality columns give the same benefit without the space overhead?