There is a technique called "normalized compression distance" (NCD) that does roughly this. It uses a compressor to estimate how similar one piece of data is to another.<p>For a similar problem, you can follow the approach from this answer: <a href="http://reverseengineering.stackexchange.com/questions/2897/tool-or-data-for-analysis-of-binary-code-to-detect-cpu-architecture/2900#2900" rel="nofollow">http://reverseengineering.stackexchange.com/questions/2897/t...</a>
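For anyone who wants to try it, here is a minimal sketch of normalized compression distance, assuming zlib as the compressor (any real compressor works, and the choice affects the numbers). The helper names `compressed_size` and `ncd` are mine, not from the linked answer. The usual formula is NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a rough stand-in for C(x)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the inputs share a lot of structure;
    values near 1 suggest they share very little.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)
```

To classify an unknown blob, you would compute `ncd(sample, reference)` against a set of known reference samples and pick the smallest distance. Note that zlib only looks back 32 KB, so for larger inputs a compressor with a bigger window is probably a better fit.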
I always thought this idea could be greatly expanded upon.<p>I've seen it used to guess the natural language of a text file based on how the input compresses. I always believed this could be used as a sort of universal translator. You could compress recordings of bird sounds, throw this algorithm at them, and extract meaningful content.
Cantor.Dust - the future was here, but turned out to be vaporware :(<p><a href="https://www.youtube.com/watch?v=4bM3Gut1hIk" rel="nofollow">https://www.youtube.com/watch?v=4bM3Gut1hIk</a>