HyperLogSandwich

75 points by boyd, about 10 years ago

3 comments

t1m, about 10 years ago
Note that intersections using HLLs don't work in general. The error rates become gigantic for sets that have small intersections, or whose cardinalities differ greatly.

A general solution to cardinality estimation for intersections is to use HLLs only for union operations, and to keep a MinHash sketch for each key as well to perform intersections.

There is a very good analysis of this technique over at http://tech.adroll.com/blog/data/2013/07/10/hll-minhash.html
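A minimal sketch of the approach described above, not taken from the linked article: estimate the Jaccard similarity from two MinHash sketches, then multiply by the union cardinality to get the intersection size. Everything here is an assumption made for illustration: the function names, the bottom-k variant of MinHash, SHA-1 as the hash, and the sketch size `k`. The union size is computed exactly as a stand-in for the estimate an HLL union would supply.

```python
import hashlib


def h64(item: str) -> int:
    """Deterministic 64-bit hash of a string (first 8 bytes of SHA-1)."""
    return int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")


def bottom_k(items, k):
    """Bottom-k MinHash sketch: the k smallest distinct hash values, sorted."""
    return sorted({h64(x) for x in items})[:k]


def jaccard_estimate(sketch_a, sketch_b, k):
    """Estimate |A ∩ B| / |A ∪ B| from two bottom-k sketches."""
    # The k smallest hashes across both sketches are the k smallest hashes
    # of A ∪ B, which act as a uniform random sample of the union.
    union_sample = sorted(set(sketch_a) | set(sketch_b))[:k]
    if not union_sample:
        return 0.0
    # A sampled value belongs to A ∩ B exactly when both sketches contain it.
    in_both = set(sketch_a) & set(sketch_b)
    hits = sum(1 for v in union_sample if v in in_both)
    return hits / len(union_sample)


def intersection_estimate(sketch_a, sketch_b, union_cardinality, k):
    """Intersection size = Jaccard estimate * union cardinality."""
    return jaccard_estimate(sketch_a, sketch_b, k) * union_cardinality


K = 2048  # hypothetical sketch size; larger k lowers the variance

a = {f"user{i}" for i in range(0, 50_000)}
b = {f"user{i}" for i in range(40_000, 60_000)}  # true overlap: 10,000

sketch_a = bottom_k(a, K)
sketch_b = bottom_k(b, K)

# Stand-in for an HLL union estimate (an assumption of this example).
union_cardinality = len(a | b)

estimate = intersection_estimate(sketch_a, sketch_b, union_cardinality, K)
print(f"estimated intersection: {estimate:.0f} (true: {len(a & b)})")
```

The sketch only needs a union cardinality, which is the operation HLLs handle well, so it avoids subtracting large, noisy estimates from one another. The MinHash estimate still gets noisy when the intersection is a tiny fraction of the union, because few of the `k` sampled values land in it.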
bpodgursky, about 10 years ago
It would be really great if this were implemented in stream-lib (https://github.com/addthis/stream-lib), which is a pretty standard library for HLL counters.

Also, the community supporting that lib is very familiar with the mathematics behind these structures, and would be better able to critique the error rates and such.
Hengjie, about 10 years ago
Can someone explain to me like I'm 5 how this is useful?