TechEcho

8 comments

rayvalover 14 years ago

Interesting that the shortest (and among the most readable) is the 15-line version written in old-school AWK: <a href="http://pacman.blog.br/wiki/index.php?title=Um_Corretor_Ortográfico_em_GAWK" rel="nofollow">http://pacman.blog.br/wiki/index.php?title=Um_Corretor_Ortog...</a>A surprise is that the Perl implementation weighs in at 63 lines. I would have expected much less. I expect a much shorter version is possible, relying on idiomatic constructs at the expense of readability.

评论 #2035901 未加载

评论 #2036088 未加载

评论 #2037006 未加载

timrobinsonover 14 years ago

I'm no Python expert, but I liked doing this in Haskell:<a href="http://www.partario.com/blog/2009/10/a-spelling-corrector-in-haskell.html" rel="nofollow">http://www.partario.com/blog/2009/10/a-spelling-corrector-in...</a><a href="https://github.com/timrobinson/spell-correct/blob/master/Correct.hs" rel="nofollow">https://github.com/timrobinson/spell-correct/blob/master/Cor...</a>

评论 #2037674 未加载

clvvover 14 years ago

I have seen this one before somewhere, and what amazes me is that how you can solve problems without a hassle if you get the "trick" right. Another case I read was that Google use(at least used) two vectors(each consists of many 0s and 1s, which in turn represent whether the web page has the keyword or not) to represent web pages, and calculate the angle between the vectors to figure out the similarity(a value) between web pages.

评论 #2036428 未加载

评论 #2035850 未加载

评论 #2035857 未加载

评论 #2036301 未加载

prsover 14 years ago

Scrolling to the bottom of the article gives you a list of similar implementations in other languages.Very interesting to see how other have tackled this programming problem.

评论 #2036375 未加载

tgflynnover 14 years ago

Anyone else having trouble parsing this list comprehension ?<pre><code> e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS</code></pre>

评论 #2035609 未加载

评论 #2035653 未加载

评论 #2036627 未加载

vebover 14 years ago

In PHP, you can use the levenshtein() function.<a href="http://php.net/manual/en/function.levenshtein.php" rel="nofollow">http://php.net/manual/en/function.levenshtein.php</a>

评论 #2036119 未加载

评论 #2035672 未加载

评论 #2035545 未加载

SlyShyover 14 years ago

If anyone is interested in other applications of this technique, this is a Ruby library I wrote to do sentence tokenization: <a href="https://github.com/SlyShy/Tactful_Tokenizer" rel="nofollow">https://github.com/SlyShy/Tactful_Tokenizer</a>

singularover 14 years ago

I found it fascinating that you can actually implement something seemingly so magical in such a small amount of code.I'm down as 22 lines of C#, though to be fair I am cheating vastly and the lines are huge :)<a href="http://www.codegrunt.co.uk/2010/11/02/C-Sharp-Norvig-Spelling-Corrector.html" rel="nofollow">http://www.codegrunt.co.uk/2010/11/02/C-Sharp-Norvig-Spellin...</a>C# does offer some nice features to give you some succinctness but there's no getting away from the verboseness of a java-like language.

评论 #2037141 未加载

8 comments

rayvalover 14 years ago

评论 #2035901 未加载

评论 #2036088 未加载

评论 #2037006 未加载

timrobinsonover 14 years ago

评论 #2037674 未加载

clvvover 14 years ago

评论 #2036428 未加载

评论 #2035850 未加载

评论 #2035857 未加载

评论 #2036301 未加载

prsover 14 years ago

Scrolling to the bottom of the article gives you a list of similar implementations in other languages.Very interesting to see how other have tackled this programming problem.

评论 #2036375 未加载

tgflynnover 14 years ago

Anyone else having trouble parsing this list comprehension ?<pre><code> e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS</code></pre>

评论 #2035609 未加载

评论 #2035653 未加载

评论 #2036627 未加载

vebover 14 years ago

In PHP, you can use the levenshtein() function.<a href="http://php.net/manual/en/function.levenshtein.php" rel="nofollow">http://php.net/manual/en/function.levenshtein.php</a>

Spelling Corrector in 21 lines of Python

8 comments

Spelling Corrector in 21 lines of Python

8 comments