TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

A visual interactive guide to Bloom filters

286 pointsby flyingskyover 1 year ago

17 comments

nalgeonover 1 year ago
Sam&#x27;s visual tutorials are as good as ever (doggos are very important, by the way).<p>And if you want to try some code, I&#x27;ve prepared three toy Bloom filter implementations to accompany the article:<p>JavaScript: <a href="https:&#x2F;&#x2F;codapi.org&#x2F;embed&#x2F;?engine=browser&amp;sandbox=javascript&amp;src=gist:e7bde93f98c5e47ca38d359a5104fd88:bloom.js" rel="nofollow">https:&#x2F;&#x2F;codapi.org&#x2F;embed&#x2F;?engine=browser&amp;sandbox=javascript&amp;...</a><p>Python: <a href="https:&#x2F;&#x2F;codapi.org&#x2F;embed&#x2F;?sandbox=python&amp;src=gist:e7bde93f98c5e47ca38d359a5104fd88:bloom.py" rel="nofollow">https:&#x2F;&#x2F;codapi.org&#x2F;embed&#x2F;?sandbox=python&amp;src=gist:e7bde93f98...</a><p>Go: <a href="https:&#x2F;&#x2F;codapi.org&#x2F;embed&#x2F;?sandbox=go&amp;src=gist:e7bde93f98c5e47ca38d359a5104fd88:bloom.go" rel="nofollow">https:&#x2F;&#x2F;codapi.org&#x2F;embed&#x2F;?sandbox=go&amp;src=gist:e7bde93f98c5e4...</a>
samwhoover 1 year ago
I&#x27;m the author of this post, and am happy to answer any questions. :)
评论 #39443118 未加载
评论 #39456271 未加载
svraghavanover 1 year ago
Thanks to the author for all the hard work. Its easy to consume and understand for beginners too.
评论 #39443170 未加载
naugturover 1 year ago
I remember discussing bloom and cuckoo filters a while back and someone linked.to.great papers from late 2010s with much more advanced probabilistic structures that were handling modification much better or had other interesting properties. Anyone here knows where to look for these? I didn&#x27;t take notes...
评论 #39447049 未加载
评论 #39448672 未加载
the_arunover 1 year ago
What are the business use cases we use bloom filters? Some examples would be great. Thanks!
评论 #39448842 未加载
评论 #39445346 未加载
评论 #39447295 未加载
评论 #39448538 未加载
评论 #39444027 未加载
franky47over 1 year ago
A wonderful article, thank you.<p>Rather than using multiple hash functions, would it make more sense to use a single algorithm over (prefix | input), with k different prefixes? This may allow computing those hashes in parallel, using SIMD for example, and caching the prehash state of the prefixes.<p>Edit: looks like there has been some research on this: <a href="https:&#x2F;&#x2F;ieeexplore.ieee.org&#x2F;document&#x2F;8462781" rel="nofollow">https:&#x2F;&#x2F;ieeexplore.ieee.org&#x2F;document&#x2F;8462781</a>
评论 #39446827 未加载
评论 #39448493 未加载
Crier1002about 1 year ago
this is a great article! I love how you explained it. here are my takeaways from it: - Bloom filter is a space-efficient data structure - You&#x27;re trading off some accuracy (possibility of false positives) for the significantly smaller memory footprint and faster membership checks
Bloomfuover 1 year ago
Very good explanation of Bloom filters, but are they optimum from a complexity point of view given a false positive rate and a bounded number of bits? Are there other kind of filters if we relax de condición of zero false negative rate?, are there other better but more complex data structures able to provide the same effect?
评论 #39448197 未加载
lormaynaover 1 year ago
For whom that are interested I suggest also to have a look on Cukoo Filter: <a href="https:&#x2F;&#x2F;www.cs.cmu.edu&#x2F;~dga&#x2F;papers&#x2F;cuckoo-conext2014.pdf" rel="nofollow">https:&#x2F;&#x2F;www.cs.cmu.edu&#x2F;~dga&#x2F;papers&#x2F;cuckoo-conext2014.pdf</a>
评论 #39466084 未加载
machekbover 1 year ago
I appreciate the author&#x27;s dedication to making animations accessible!
评论 #39439867 未加载
dndn1over 1 year ago
These are beautiful interactives along with the rest by Sam!
评论 #39442352 未加载
martinky24over 1 year ago
Awesome resource! Bloom filters are one of my favorite data structure (that I wish I got to employ more often in my line of work… but alas).
评论 #39443196 未加载
Epicismover 1 year ago
Very cool! I was going to read up on bloom filters and this popped up the next day!
评论 #39452593 未加载
who-shot-jrover 1 year ago
Amazing! Very well explained.
评论 #39444162 未加载
ctxcover 1 year ago
Flows beautifully. Looking forward to the next one Sam!
评论 #39444784 未加载
grupthinkover 1 year ago
Would it be better to store the hash of each output into 3 different arrays?<p>e.g. currently, it&#x27;s doing:<p><pre><code> [input] -&gt; [3 hashes] -&gt; [single storage] </code></pre> but, I&#x27;m wondering if it&#x27;s better to do:<p><pre><code> [input] -&gt; [hash a] -&gt; [storage a] -&gt; [hash b] -&gt; [storage b] -&gt; [hash c] -&gt; [storage c] </code></pre> ...this way, the output of the 3 hashes don&#x27;t affect each other during the set and membership-check. I wonder how that would affect how much more data you can store. I&#x27;m sure someone has considered this.
评论 #39450924 未加载
nhggfuover 1 year ago
very well written and informative.
评论 #39444168 未加载