科技回声

8 条评论

mrshoe超过 15 年前

Rather than re-implement this it might be better to use an existing library. In Python, I've used pylevenshtein:<a href="http://code.google.com/p/pylevenshtein/" rel="nofollow">http://code.google.com/p/pylevenshtein/</a>However, I've found the Jaro-Winkler distance to be more useful than the Levenshtein. You can find good implementations for Jaro out there as well.<a href="http://en.wikipedia.org/wiki/Jaro-Winkler_distance" rel="nofollow">http://en.wikipedia.org/wiki/Jaro-Winkler_distance</a>

评论 #1087224 未加载

elblanco超过 15 年前

Clever title. In my experience, in a practical sense, tries tend to be very memory intensive with only a marginal increase in speed over other alternatives like simple hash lookups. However, this is a very interesting and more typical use-case for a trie. I've always like it better than using Hamming distance which still seems to get recommended in undergraduate texts.Oh, and I glossed over the perl code, until I reread it and realized it was sleep code. First time hearing about this language...interesting.<a href="http://sleep.dashnine.org/" rel="nofollow">http://sleep.dashnine.org/</a>

评论 #1087186 未加载

abecedarius超过 15 年前

Norvig posted a similar algorithm using a hashtable instead of a trie: edits() in ngrams.py at <a href="http://norvig.com/ngrams/" rel="nofollow">http://norvig.com/ngrams/</a>You use a table of all prefixes of all dictionary words. This might be more or less efficient than a trie, depending on implementation; but in interpreted Python the built-in hashtables are bound to win.(It's descended from code I sent in response to <a href="http://norvig.com/spell-correct.html" rel="nofollow">http://norvig.com/spell-correct.html</a>, rewritten to return a dict of candidates each paired with a description of how it's different from the original word. There's a bug of sorts in that the result set misses a very few candidates his first article's code finds, unless you extend the edit-distance cutoff; I only noticed the problem after I'd mailed him the code. I'm not sure if the OP's algorithm has the same shortcoming -- I haven't read it closely.)

tom_pinckney超过 15 年前

Careful what dictionary you use...my /usr/share/dict/words has things in it that might not be polite/acceptable to suggest.

llimllib超过 15 年前

<a href="http://en.wikipedia.org/wiki/Patricia_tree" rel="nofollow">http://en.wikipedia.org/wiki/Patricia_tree</a> would be an improvement.

评论 #1087190 未加载

yan超过 15 年前

Do people pronounce trie as 'try' or 'tree'? I pronounce it as 'tree,' which is why it took me a second to catch the pun.

评论 #1087335 未加载

评论 #1087656 未加载

评论 #1087275 未加载

评论 #1087408 未加载

indigoviolet超过 15 年前

I've used Burkhard-Keller trees to do something like this:<a href="http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK-Trees" rel="nofollow">http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK...</a>I don't know off the top of my head which one is more efficient.

kristianp超过 15 年前

What is the algorithmic complexity of your trie code compared to Norvig's example corrector?

8 条评论

mrshoe超过 15 年前

评论 #1087224 未加载

elblanco超过 15 年前

评论 #1087186 未加载

abecedarius超过 15 年前

tom_pinckney超过 15 年前

Careful what dictionary you use...my /usr/share/dict/words has things in it that might not be polite/acceptable to suggest.

llimllib超过 15 年前

<a href="http://en.wikipedia.org/wiki/Patricia_tree" rel="nofollow">http://en.wikipedia.org/wiki/Patricia_tree</a> would be an improvement.

评论 #1087190 未加载

yan超过 15 年前

Do people pronounce trie as 'try' or 'tree'? I pronounce it as 'tree,' which is why it took me a second to catch the pun.

评论 #1087335 未加载

评论 #1087656 未加载

评论 #1087275 未加载

评论 #1087408 未加载

indigoviolet超过 15 年前

kristianp超过 15 年前

What is the algorithmic complexity of your trie code compared to Norvig's example corrector?

How I Trie to Make Spelling Suggestions

8 条评论

How I Trie to Make Spelling Suggestions

8 条评论