Rather than re-implement this it might be better to use an existing library. In Python, I've used pylevenshtein:<p><a href="http://code.google.com/p/pylevenshtein/" rel="nofollow">http://code.google.com/p/pylevenshtein/</a><p>However, I've found the Jaro-Winkler distance to be more useful than the Levenshtein. You can find good implementations for Jaro out there as well.<p><a href="http://en.wikipedia.org/wiki/Jaro-Winkler_distance" rel="nofollow">http://en.wikipedia.org/wiki/Jaro-Winkler_distance</a>
Clever title. In my experience, in a practical sense, tries tend to be very memory intensive with only a marginal increase in speed over other alternatives like simple hash lookups. However, this is a very interesting and more typical use-case for a trie. I've always like it better than using Hamming distance which still seems to get recommended in undergraduate texts.<p>Oh, and I glossed over the perl code, until I reread it and realized it was sleep code. First time hearing about this language...interesting.<p><a href="http://sleep.dashnine.org/" rel="nofollow">http://sleep.dashnine.org/</a>
Norvig posted a similar algorithm using a hashtable instead of a trie: edits() in ngrams.py at <a href="http://norvig.com/ngrams/" rel="nofollow">http://norvig.com/ngrams/</a><p>You use a table of all prefixes of all dictionary words. This might be more or less efficient than a trie, depending on implementation; but in interpreted Python the built-in hashtables are bound to win.<p>(It's descended from code I sent in response to <a href="http://norvig.com/spell-correct.html" rel="nofollow">http://norvig.com/spell-correct.html</a>, rewritten to return a dict of candidates each paired with a description of how it's different from the original word. There's a bug of sorts in that the result set misses a very few candidates his first article's code finds, unless you extend the edit-distance cutoff; I only noticed the problem after I'd mailed him the code. I'm not sure if the OP's algorithm has the same shortcoming -- I haven't read it closely.)
I've used Burkhard-Keller trees to do something like this:<p><a href="http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK-Trees" rel="nofollow">http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK...</a><p>I don't know off the top of my head which one is more efficient.