It seems really weird that every single one of these spells "spears" correctly.<p>Is this supposed to be a representative list of misspellings? Or, given that it's posted under /jobs, is it data for an interview exercise or something like that? (The page doesn't say.)
I remember seeing these stats before. They are a few years old, right? Google lists the publishing date for those stats as 1 April 2002. It seems like a subset too.<p>I really enjoy using Google's spelling correction, but what would really blow my mind is if one day I could search for:<p><pre><code> ntoymru d[rstd
</code></pre>
and get results for "britney spears". It happens every now and then, when I position my index fingers over the wrong keys.
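For what it's worth, that query is "britney spears" typed with both hands shifted one key to the right. A minimal sketch of undoing that kind of shift, assuming a US QWERTY layout and lowercase input (the names `ROWS`, `build_unshift_map`, and `unshift` are made up for illustration, and this is not a claim about how any search engine handles it):

```python
# Rows of a US QWERTY keyboard, left to right (layout is an assumption).
ROWS = ["`1234567890-=", "qwertyuiop[]\\", "asdfghjkl;'", "zxcvbnm,./"]


def build_unshift_map(offset=1):
    """Map each key to the key `offset` positions to its left on the same row."""
    mapping = {}
    for row in ROWS:
        for i in range(offset, len(row)):
            mapping[row[i]] = row[i - offset]
    return mapping


def unshift(text, offset=1):
    """Undo a rightward hand shift; characters not in the map (e.g. space) pass through."""
    mapping = build_unshift_map(offset)
    return "".join(mapping.get(ch, ch) for ch in text)


print(unshift("ntoymru d[rstd"))  # prints: britney spears
```

A real corrector would presumably have to try several offsets and both directions, then check which candidate looks like a known query.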
When I first saw this, I hypothesized that the number of occurrences of a misspelling would go down roughly as its edit distance from the correct spelling goes up. That's not exactly true; see for yourself:<p><a href="https://docs.google.com/spreadsheet/ccc?key=0Ao73DTH98IRgdC1BM1NrTDFJOWFaU0JWemRDeko2OFE&hl=en_US" rel="nofollow">https://docs.google.com/spreadsheet/ccc?key=0Ao73DTH98IRgdC1...</a><p>I used a Perl module (<a href="http://search.cpan.org/dist/Text-Levenshtein/Levenshtein.pm" rel="nofollow">http://search.cpan.org/dist/Text-Levenshtein/Levenshtein.pm</a>) to calculate the edit distances.
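For anyone who wants to reproduce this without Perl, here is a rough Python sketch of the standard Levenshtein distance (insertions, deletions, and substitutions each cost 1). It is a from-scratch dynamic-programming version, not a port of the CPAN module, and the sample queries are only illustrative, not taken from the list:

```python
def levenshtein(a, b):
    """Return the Levenshtein edit distance between strings a and b."""
    # prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from the first i chars of a to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or match
            ))
        prev = curr
    return prev[-1]


target = "britney spears"
# Illustrative misspellings only (assumed for the example).
for query in ["britny spears", "britney speares"]:
    print(query, levenshtein(target, query))
```

Keeping only two rows of the table keeps memory proportional to the length of the shorter loop, which is plenty for short search queries.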
Do they only have those stats for "britney spears"? (And if so...why?)<p>I looked (admittedly not very hard) to see if this was some kind of Google lab where you query with some phrase and see the misspellings, but to no avail. That would be interesting. Does anyone know of any links?
That's interesting, but it might have been better if they'd discounted the phrases where the user clicked "Show results for [misspelled query] instead".<p>Some of those look like legitimate queries.
Actually, it seems these are only the spelling stats for misspelling 'britney' - every single 'spears' is spelt correctly. I was expecting to at least see 'britney speares' listed.