TechEcho

If you're interested in the probabilistic approach, this is how it works: https://en.wikipedia.org/wiki/HyperLogLog

"The basis of the HyperLogLog algorithm is the observation that the cardinality of a multiset of uniformly-distributed random numbers can be estimated by calculating the maximum number of leading zeros in the binary representation of each number in the set. If the maximum number of leading zeros observed is n, an estimate for the number of distinct elements in the set is 2^n."
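To make the quoted observation concrete, here is a minimal Python sketch of just that core idea: hash each item to a 32-bit number, track the maximum count of leading zeros, and return 2^n. This is not Logswan's code and not full HyperLogLog, which splits the hashes across many registers and combines them with a bias-corrected harmonic mean to reduce the variance. The names `hash32`, `leading_zeros`, and `crude_estimate` are made up for illustration, and using SHA-256 as the hash is my own assumption.

```python
import hashlib


def leading_zeros(value: int, bits: int = 32) -> int:
    """Count leading zero bits of `value` seen as a fixed-width `bits`-bit integer."""
    return bits - value.bit_length()


def hash32(item: str) -> int:
    """Map an item to a roughly uniform 32-bit integer (SHA-256 is an assumption)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def crude_estimate(items) -> int:
    """Single-register estimate: 2^n, where n is the max leading-zero count seen."""
    zeros = [leading_zeros(hash32(item)) for item in items]
    return 2 ** max(zeros) if zeros else 0


if __name__ == "__main__":
    # Hypothetical input: 1000 distinct client addresses, each seen three times.
    visitors = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)] * 3
    print(crude_estimate(visitors))
```

Duplicates hash to the same value, so they never change the maximum; that is why this counts distinct items rather than total items. With a single register the result is always a power of two and can be far off, which is exactly the problem the multi-register averaging in HyperLogLog addresses.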

If anyone involved in the project is reading this: the DNS entry for "www.logswan.org", which is linked from the GitHub page, does not exist.

Logswan – Fast Web log analyzer using probabilistic data structures

2 comments
