Using Parquet's Bloom Filters

53 points by pauldix 12 months ago

2 comments

appplication 12 months ago
One thing I have wondered: would it make sense to reduce file size? The general advice I’ve seen is to keep files to around 250 MB to 1 GB, but if you’re leaning heavily on bloom filters, it feels like it could make sense to reduce the number of files to reduce the amount that would trigger the per-file filter.
darkflame91 12 months ago
With large datasets, wouldn't partitioning the data on low-cardinality columns give the same benefit without the space overhead?
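To make the two pruning strategies in this question concrete, here is a minimal pure-Python sketch. It is not Parquet's actual implementation or any library's API: the `BloomFilter` class, the `files` layout, and the `region` and `user_id` columns are all hypothetical and exist only for illustration. Partition pruning picks files by a low-cardinality column value encoded in the file name, while a per-file Bloom filter answers "might this file contain this value?" for a high-cardinality column.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: num_hashes probes into a num_bits-wide bit set."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit set

    def _positions(self, value):
        # Derive one bit position per seed from a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


# Hypothetical layout: one "file" per value of the low-cardinality column.
files = {
    "region=eu": [{"region": "eu", "user_id": f"u{i}"} for i in range(0, 100)],
    "region=us": [{"region": "us", "user_id": f"u{i}"} for i in range(100, 200)],
    "region=ap": [{"region": "ap", "user_id": f"u{i}"} for i in range(200, 300)],
}

# Build one Bloom filter per file over the high-cardinality column.
blooms = {}
for name, rows in files.items():
    bf = BloomFilter()
    for row in rows:
        bf.add(row["user_id"])
    blooms[name] = bf


def prune_by_partition(region):
    """Files to scan for a filter on the partition column."""
    return [name for name in files if name == f"region={region}"]


def prune_by_bloom(user_id):
    """Files to scan for a point lookup on user_id (may include false positives)."""
    return [name for name, bf in blooms.items() if bf.might_contain(user_id)]


print(prune_by_partition("eu"))
print(prune_by_bloom("u150"))
```

In this toy setup, partitioning costs no extra space but only helps queries that filter on `region`. The Bloom filters cost extra bits per file, but they can skip files for lookups on `user_id`, a column with too many distinct values to partition on.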