科技回声

17 条评论

nalgeon超过 1 年前

Sam's visual tutorials are as good as ever (doggos are very important, by the way).And if you want to try some code, I've prepared three toy Bloom filter implementations to accompany the article:JavaScript: <a href="https://codapi.org/embed/?engine=browser&sandbox=javascript&src=gist:e7bde93f98c5e47ca38d359a5104fd88:bloom.js" rel="nofollow">https://codapi.org/embed/?engine=browser&sandbox=javascript&...</a>Python: <a href="https://codapi.org/embed/?sandbox=python&src=gist:e7bde93f98c5e47ca38d359a5104fd88:bloom.py" rel="nofollow">https://codapi.org/embed/?sandbox=python&src=gist:e7bde93f98...</a>Go: <a href="https://codapi.org/embed/?sandbox=go&src=gist:e7bde93f98c5e47ca38d359a5104fd88:bloom.go" rel="nofollow">https://codapi.org/embed/?sandbox=go&src=gist:e7bde93f98c5e4...</a>

samwho超过 1 年前

I'm the author of this post, and am happy to answer any questions. :)

评论 #39443118 未加载

评论 #39456271 未加载

svraghavan超过 1 年前

Thanks to the author for all the hard work. Its easy to consume and understand for beginners too.

评论 #39443170 未加载

naugtur超过 1 年前

I remember discussing bloom and cuckoo filters a while back and someone linked.to.great papers from late 2010s with much more advanced probabilistic structures that were handling modification much better or had other interesting properties. Anyone here knows where to look for these? I didn't take notes...

评论 #39447049 未加载

评论 #39448672 未加载

the_arun超过 1 年前

What are the business use cases we use bloom filters? Some examples would be great. Thanks!

评论 #39448842 未加载

评论 #39445346 未加载

评论 #39447295 未加载

评论 #39448538 未加载

评论 #39444027 未加载

franky47超过 1 年前

A wonderful article, thank you.Rather than using multiple hash functions, would it make more sense to use a single algorithm over (prefix | input), with k different prefixes? This may allow computing those hashes in parallel, using SIMD for example, and caching the prehash state of the prefixes.Edit: looks like there has been some research on this: <a href="https://ieeexplore.ieee.org/document/8462781" rel="nofollow">https://ieeexplore.ieee.org/document/8462781</a>

评论 #39446827 未加载

评论 #39448493 未加载

Crier1002大约 1 年前

this is a great article! I love how you explained it. here are my takeaways from it: - Bloom filter is a space-efficient data structure - You're trading off some accuracy (possibility of false positives) for the significantly smaller memory footprint and faster membership checks

Bloomfu超过 1 年前

Very good explanation of Bloom filters, but are they optimum from a complexity point of view given a false positive rate and a bounded number of bits? Are there other kind of filters if we relax de condición of zero false negative rate?, are there other better but more complex data structures able to provide the same effect?

评论 #39448197 未加载

lormayna超过 1 年前

For whom that are interested I suggest also to have a look on Cukoo Filter: <a href="https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf" rel="nofollow">https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf</a>

评论 #39466084 未加载

machekb超过 1 年前

I appreciate the author's dedication to making animations accessible!

评论 #39439867 未加载

dndn1超过 1 年前

These are beautiful interactives along with the rest by Sam!

评论 #39442352 未加载

martinky24超过 1 年前

Awesome resource! Bloom filters are one of my favorite data structure (that I wish I got to employ more often in my line of work… but alas).

评论 #39443196 未加载

Epicism超过 1 年前

Very cool! I was going to read up on bloom filters and this popped up the next day!

评论 #39452593 未加载

who-shot-jr超过 1 年前

Amazing! Very well explained.

评论 #39444162 未加载

ctxc超过 1 年前

Flows beautifully. Looking forward to the next one Sam!

评论 #39444784 未加载

grupthink超过 1 年前

Would it be better to store the hash of each output into 3 different arrays?e.g. currently, it's doing:<pre><code> [input] -> [3 hashes] -> [single storage] </code></pre> but, I'm wondering if it's better to do:<pre><code> [input] -> [hash a] -> [storage a] -> [hash b] -> [storage b] -> [hash c] -> [storage c] </code></pre> ...this way, the output of the 3 hashes don't affect each other during the set and membership-check. I wonder how that would affect how much more data you can store. I'm sure someone has considered this.

评论 #39450924 未加载

nhggfu超过 1 年前

very well written and informative.

评论 #39444168 未加载

17 条评论

nalgeon超过 1 年前

samwho超过 1 年前

I'm the author of this post, and am happy to answer any questions. :)

评论 #39443118 未加载

评论 #39456271 未加载

svraghavan超过 1 年前

Thanks to the author for all the hard work. Its easy to consume and understand for beginners too.

评论 #39443170 未加载

naugtur超过 1 年前

评论 #39447049 未加载

评论 #39448672 未加载

the_arun超过 1 年前

What are the business use cases we use bloom filters? Some examples would be great. Thanks!

评论 #39448842 未加载

评论 #39445346 未加载

评论 #39447295 未加载

评论 #39448538 未加载

评论 #39444027 未加载

franky47超过 1 年前

评论 #39446827 未加载

评论 #39448493 未加载

Crier1002大约 1 年前

Bloomfu超过 1 年前

评论 #39448197 未加载

lormayna超过 1 年前

评论 #39466084 未加载

machekb超过 1 年前

I appreciate the author's dedication to making animations accessible!

评论 #39439867 未加载

dndn1超过 1 年前

These are beautiful interactives along with the rest by Sam!

评论 #39442352 未加载

martinky24超过 1 年前

Awesome resource! Bloom filters are one of my favorite data structure (that I wish I got to employ more often in my line of work… but alas).

评论 #39443196 未加载

Epicism超过 1 年前

Very cool! I was going to read up on bloom filters and this popped up the next day!

A visual interactive guide to Bloom filters

17 条评论

A visual interactive guide to Bloom filters

17 条评论