HyperLogSandwich

75 points by boyd, about 10 years ago

3 comments

t1m, about 10 years ago
Note that intersections using HLLs don't work in general. The error rates become gigantic for sets that have small intersections, or whose cardinality differs greatly.

A general solution to cardinality estimations for intersections is to just use HLLs for union operations, and keep a MinHash sketch for each key as well to perform intersections.

There is a very good analysis of this technique over at http://tech.adroll.com/blog/data/2013/07/10/hll-minhash.html
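The combination described in this comment can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HyperLogSandwich or stream-lib implementation: `Sketch`, `hash64`, `hll_union_count`, `jaccard`, `intersection_count`, and the parameters `p` and `k` are hypothetical names chosen for the example. It relies on the identity |A ∩ B| = J(A, B) · |A ∪ B|, taking the Jaccard similarity J from a bottom-k MinHash and the union size from merged HLL registers.

```python
import hashlib
import heapq
import math


def hash64(item):
    """Deterministic 64-bit hash of an item's string form (illustrative choice)."""
    digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class Sketch:
    """Hypothetical per-key sketch: HLL registers plus a bottom-k MinHash."""

    def __init__(self, items=(), p=12, k=1024):
        self.p, self.k = p, k
        self.registers = [0] * (1 << p)
        hashes = {hash64(x) for x in items}
        # Bottom-k MinHash: keep the k smallest hash values seen.
        self.minhash = set(heapq.nsmallest(k, hashes))
        for h in hashes:
            idx = h >> (64 - p)                      # first p bits pick a register
            rest = h & ((1 << (64 - p)) - 1)         # remaining 64 - p bits
            rank = (64 - p) - rest.bit_length() + 1  # position of the leftmost 1-bit
            self.registers[idx] = max(self.registers[idx], rank)


def hll_union_count(a, b):
    """Estimate |A ∪ B| by taking the register-wise maximum of two HLLs."""
    regs = [max(x, y) for x, y in zip(a.registers, b.registers)]
    m = len(regs)
    alpha = 0.7213 / (1 + 1.079 / m)  # bias constant, valid for m >= 128
    estimate = alpha * m * m / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if estimate <= 2.5 * m and zeros:
        # Small-range correction (linear counting).
        estimate = m * math.log(m / zeros)
    return estimate


def jaccard(a, b):
    """Estimate J(A, B) from two bottom-k MinHash sketches."""
    k = min(a.k, b.k)
    # The k smallest hashes of the merged sketches are the bottom-k of A ∪ B.
    union_bottom = heapq.nsmallest(k, a.minhash | b.minhash)
    if not union_bottom:
        return 0.0
    shared = sum(1 for h in union_bottom if h in a.minhash and h in b.minhash)
    return shared / len(union_bottom)


def intersection_count(a, b):
    """Estimate |A ∩ B| as J(A, B) * |A ∪ B|."""
    return jaccard(a, b) * hll_union_count(a, b)


a = Sketch(range(0, 20000))
b = Sketch(range(10000, 30000))
# The true intersection size here is 10000; the estimate should land nearby.
print(round(intersection_count(a, b)))
```

With `k = 1024` the Jaccard estimate carries a standard error of a few percent for sets like these, which is usually the dominant error term; a larger `k` trades memory for accuracy.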
bpodgursky, about 10 years ago
It would be really great if this were implemented in stream-lib (https://github.com/addthis/stream-lib), which is a pretty standard library for HLL counters.

Also, the community supporting that lib is very familiar with the mathematics behind these structures, and would be better able to critique the error rates and such.
Hengjie, about 10 years ago
Can someone explain to me like I'm 5 how this is useful?