What is the Kullback-Leibler divergence?

105 points by rgbimbochamp · over 6 years ago

9 comments

ssivark · over 6 years ago
To summarize succinctly, KL(q||p) quantifies how badly you screw up if the true distribution is “q” and you instead think it is “p”.

Note that KL divergence is not symmetric! Eg: If the true distribution of coin tosses is 100% heads and your model has 50/50, you won’t mess up big — compared with when the true coin is 50/50 and your model is 100 percent heads (and you would have been willing to bet a LOT of money that there will be no tails in the outcome).

In this technical sense, it is preferable to be conservative than overly confident.
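A quick numeric check of the coin example above (a minimal sketch; the `kl_divergence` helper is my own, not something from the article):

```python
import numpy as np

def kl_divergence(q, p):
    """KL(q||p) in bits: how badly model p does when the data really follow q.
    Terms where q == 0 contribute nothing; q > 0 with p == 0 gives inf."""
    q, p = np.asarray(q, dtype=float), np.asarray(p, dtype=float)
    mask = q > 0
    with np.errstate(divide="ignore"):          # allow log2(x / 0) -> inf
        return float(np.sum(q[mask] * np.log2(q[mask] / p[mask])))

# True coin always lands heads, model says 50/50: a finite 1-bit penalty.
print(kl_divergence([1.0, 0.0], [0.5, 0.5]))   # 1.0

# True coin is 50/50, model insists on 100% heads: infinite penalty,
# because the model gives zero probability to an outcome that occurs.
print(kl_divergence([0.5, 0.5], [1.0, 0.0]))   # inf
```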
Patient0 · over 6 years ago
I've recently discovered this excellent lecture series by David MacKay available on YouTube: https://youtu.be/y5VdtQSqiAI

He also wrote the accompanying textbook, which is available for free download: http://www.inference.phy.cam.ac.uk/itprnn/book.pdf

I was really impressed by these lectures, and was dismayed to learn that he died from cancer a couple of years ago.
beagle3 · over 6 years ago
I wish information theory were part of the math/cs/engineering curriculum in more places.

The basics are fundamental to many areas of science (especially if they touch probability in any way), intuitive, and mostly accessible with just a couple of handwaves.
atrudeau · over 6 years ago
Shannon's dissertation is a great introduction (:p) to entropy. https://dspace.mit.edu/handle/1721.1/11173
cryptonector · over 6 years ago
This divergence feels a lot like building a Huffman encoding table from a predicted probability distribution, then measuring how efficient it turns out to be compared with a Huffman table built from the probability distribution you actually observe in the real data after the fact.
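That intuition is exactly right for ideal (non-integer) code lengths: the expected extra bits from coding with the wrong table is the KL divergence. A rough sketch with made-up numbers (real Huffman codes round lengths to whole bits, so the actual gap is close to but not exactly this):

```python
import numpy as np

# Idealized code lengths of -log2(p). "true" is the distribution the data
# actually follow; "model" is the one the code table was designed for.
# Both distributions are arbitrary example numbers.
true  = np.array([0.70, 0.20, 0.10])
model = np.array([0.40, 0.40, 0.20])

bits_with_model_code = np.sum(true * -np.log2(model))  # cross-entropy H(true, model)
bits_with_true_code  = np.sum(true * -np.log2(true))   # entropy H(true)

# The per-symbol overhead of using the wrong table is exactly KL(true || model).
print(bits_with_model_code - bits_with_true_code)
print(np.sum(true * np.log2(true / model)))            # same number
```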
jules · over 6 years ago
The KL divergence is also called relative entropy. Unlike the ordinary entropy, relative entropy is invariant under parameter transformations. The maximum relative entropy principle generalises Bayesian inference. The distribution relative to which you're computing the entropy plays the role of the prior.

By the way, I find the following way to rewrite the entropy easier to understand because all quantities are positive:

sum(-p_i log(p_i)) = sum(p_i log(1/p_i)) = E[log(1/p_i)]

log(1/p_i) tells you how many bits you need to encode an event with probability p_i. The more unlikely the event, the more bits you need. The entropy is the expected number of bits you need.
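A quick numeric version of that identity (a minimal sketch; the distribution is an arbitrary example, not taken from the comment):

```python
import numpy as np

# entropy = E[log2(1/p_i)], the expected number of bits per event.
p = np.array([0.5, 0.25, 0.125, 0.125])

bits_per_event = np.log2(1.0 / p)        # 1, 2, 3, 3 bits for each outcome
entropy = np.sum(p * bits_per_event)     # expectation under p

print(bits_per_event)                    # [1. 2. 3. 3.]
print(entropy)                           # 1.75
```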
derEitel · over 6 years ago
Great, intuitive explanations with a nice mix of code and formulas. My only complaint is that I found the GIFs very annoying while reading, especially as they do not add to the content.
caiocaiocaio · over 6 years ago
Lovely article, but grey-on-white and a small, thin display font meant I had to go into developer tools to be able to read it without getting a headache.
doombolt · over 6 years ago
I have a hunch that space engineers have suddenly invented Huffman coding.

(Which leads to a general observation of "just throw in transparent compression instead of optimizing your data format")

EDIT: s/encryption/compression/