TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Kolmogorov Complexity - A Primer

101 点作者 vgnet大约 13 年前

6 条评论

chimeracoder大约 13 年前
&#62; So in a sense, the huge body of mathematics underlying probability has already failed us at this basic juncture because we cannot speak of how random one particular outcome of an experiment is.<p>This reminds me of the 'Is 14 a random number?' debate.<p>Part of the problem, though, is that we have terminology that is horribly misleading.<p>First, a random variable is neither random nor a variable. They're not variables, because they're functions that yield non-deterministic values - a <i>huge</i> distinction. You can't have a random number - the idea itself doesn't make any sense.<p>Second, the definition of 'random' is itself problematic - or at least contested. Randomness implies probability, but probability can be defined in two different (and incompatible) ways, one of which is essentially the inverse of the other. Ironically, the one that Keynes proved back in 1921 to be logically inconsistent is the one that is more commonly used today (probably because it is more intuitive and mathematically convenient... even if it is almost always incorrect!).<p>The question 'Is 14 a random number' doesn't really make any sense. 14 cannot be a random number; it can only be a number drawn from a random distribution. That may seem like it simply begs the question, but in fact, this subtle shift is incredibly important - determining whether a number was drawn from a random distribution is much easier to frame in terms of probability, and probability, not randomness, is the language of statistics.<p>Unfortunately, this is one of those cases where the two definitions of probability yield widely different answers. You could tell me that the answer is undefined, in almost exactly the same way that division-by-zero is undefined in mathematics. Or you could construct a model over all possible distributions of numbers, the probabilities associated with each of those distributions, and integrate accordingly to yield some (probably computationally unfeasible) functional answer.<p>In this school of thought, we <i>can</i> speak of how 'random' a particular outcome is - we're essentially partitioning the (potentially infinite) universe of functions <i>f</i> such that our value <i>v</i> is in the range of <i>f</i> into two categories: one designated as 'random' and the other designated as 'not random'. Then, we are determining the probability that our value was generated from one of the former, as opposed to the latter.<p>Either one would be correct ways of answering this second question, but neither one addresses the first question, which is essentially nonsensical. (Well, I guess the answer is 'no, 14 is not a random number, because a number cannot be random', but that's a bit of a cop-out!).
评论 #3877136 未加载
评论 #3876860 未加载
评论 #3877838 未加载
评论 #3876888 未加载
评论 #3876857 未加载
andrewcooke大约 13 年前
if people find that interesting, they might enjoy this book - <a href="http://mitpress.mit.edu/books/FLAOH/cbnhtml/" rel="nofollow">http://mitpress.mit.edu/books/FLAOH/cbnhtml/</a><p><a href="http://www.amazon.com/The-Computational-Beauty-Nature-Explorations/dp/0262561271" rel="nofollow">http://www.amazon.com/The-Computational-Beauty-Nature-Explor...</a>
dsr_大约 13 年前
Mentioning the pigeonhole principle by name might have been useful. It almost got defined here...
archgoon大约 13 年前
Well, I guess it basically goes to show how hard it is to determine the actual kolmogorov complexity (even once specifying a particular encoding of a Turing Machine) of a string, but the following is a new upper bound on the complexity (in Python) of the second string.<p>print '00011101001000101101001000101111010100000100111101'<p>print bin(128141569638717)[2:]<p>In general, any string of 1s and 0s will be compressible using this method in python once the binary number is greater than 2<i></i>12 (you break even at 2<i></i>11). However, again, I only claim this to be an upperbound ;). (also assuming that invoking built-ins isn't cheating).
评论 #3877120 未加载
评论 #3877596 未加载
strictfp大约 13 年前
With the given definition I assume that it would be impossible to have a Kolmogorov complexity higher than the size of the smallest PRNG program (+seed)? If so it might be of limited usefulness for larger strings.
评论 #3877968 未加载
eevilspock大约 13 年前
I'm no expert, but information theory answers the question of "randomness" more simply, intuitively and rigorously.<p>This notion of Kolmogorov Complexity doesn't seem to add any additional value.