Haskell version of Norvig's spelling corrector

73 points by marcosero, over 10 years ago

6 comments

quchen, over 10 years ago
> I wrote this code putting brevity over readability, which is something I usually never do

Shouldn't the point of such a post be to show interesting code? I'm having trouble reading through the densely packed source.

In addition to tromp's minor nitpick, I have several major ones.

- The code is full of redundant parentheses. HLint can detect those (and many other style errors) automatically. LPaste has HLint installed, so you have a linting pastebin available online. http://lpaste.net/116871

- A lot of the functions are written in a non-idiomatic way. "m >>= return . f" is "fmap", and "(.)" can combine functions much more readably than Lisp-style stacks of parentheses.

- ByteString.Char8 is usually a wrong choice, more on that here: https://github.com/quchen/articles/blob/master/fbut.md#bytestringchar8-is-bad

- If you count to "length x" then often there's a more elegant solution that avoids calculating the length altogether. For example "splits xs = zip (inits xs) (tails xs)".

- Brevity is never better than readability.

- No top-level definitions should lack a type signature. GHC even has warnings for that (I think they start firing with -W).

- A function should do one thing and then be composed with other functions. "lowerWords" converts to words and then maps them all to lower case, for example. These are two completely different operations in one long line.

- In order of increasing generality: foldr union empty = unions = mconcat = fold

- Use pattern matching, avoid "(!!)". transposes w = [ a ++ [b1,b0] ++ bs | (a, b0:b1:bs) <- splits w] - also see https://github.com/quchen/articles/blob/master/fbut.md#head-tail-isjust-isnothing-fromjust-

- For large amounts of words that you split and concatenate again, String is probably not the right type. Text is good for dealing with such things.

- replaces w = [as ++ [c] ++ bs | (as, _:bs) <- splits w , c <- alphabet]

... and so on.
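
Several of the suggestions above (length-free `splits`, pattern matching instead of `(!!)`, type signatures on every top-level definition, one job per function) can be put together in a rough sketch like the one below. It is not the original post's code: `deletes`, `inserts`, `edits1` and `toWords` are names assumed here for illustration (loosely following Norvig's Python version), and the word-splitting rule is a simplification.

```haskell
import Data.Char (isAlpha, toLower)
import Data.List (inits, tails)
import qualified Data.Set as Set

alphabet :: String
alphabet = ['a' .. 'z']

-- Every way to cut a list in two, without ever computing its length.
splits :: [a] -> [([a], [a])]
splits xs = zip (inits xs) (tails xs)

-- Drop one character. Splits whose right half is empty fail the
-- pattern match and are skipped, so no index checks are needed.
deletes :: String -> [String]
deletes w = [as ++ bs | (as, _ : bs) <- splits w]

-- Swap two adjacent characters.
transposes :: String -> [String]
transposes w = [as ++ [b1, b0] ++ bs | (as, b0 : b1 : bs) <- splits w]

-- Replace one character with each letter of the alphabet.
replaces :: String -> [String]
replaces w = [as ++ [c] ++ bs | (as, _ : bs) <- splits w, c <- alphabet]

-- Insert each letter of the alphabet at every position.
inserts :: String -> [String]
inserts w = [as ++ [c] ++ bs | (as, bs) <- splits w, c <- alphabet]

-- All words one edit away (hypothetical name, after Norvig's edits1).
edits1 :: String -> Set.Set String
edits1 w = Set.fromList (concat [deletes w, transposes w, replaces w, inserts w])

-- Splitting into words and lower-casing are separate steps (assumed
-- simplification: any non-letter counts as a separator).
toWords :: String -> [String]
toWords = words . map (\c -> if isAlpha c then c else ' ')

lowerWords :: String -> [String]
lowerWords = map (map toLower) . toWords
```

In GHCi, `transposes "ab"` gives `["ba"]` and `deletes "ab"` gives `["b","a"]`. A real corpus would probably call for `Text` rather than `String`, as noted above.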
tromp, over 10 years ago
Minor nitpick: the first real line of code

    alphabet = "abcdefghijklmnopqrstuvwxyz"

is better written as

    alphabet = ['a'..'z']

This is really syntactic sugar for

    enumFromTo 'a' 'z'

using the function

    enumFromTo :: Enum a => a -> a -> [a]

from the typeclass Enum for enumerable types, and the fact that a string (type String) is just a list of characters (type [Char]).
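
A small sketch of that equivalence, for anyone who wants to try it in GHCi. The names `alphabet'`, `alphabetSize` and `digits` are made up for illustration and do not come from the post.

```haskell
-- Range syntax on characters.
alphabet :: String
alphabet = ['a' .. 'z']

-- The same list written with the Enum method the range desugars to.
alphabet' :: String
alphabet' = enumFromTo 'a' 'z'

-- String is a synonym for [Char], so ordinary list functions apply.
alphabetSize :: Int
alphabetSize = length alphabet

-- The syntax works for any Enum instance, for example Int.
digits :: [Int]
digits = [0 .. 9]
```

Here `alphabet == alphabet'` evaluates to `True` and `alphabetSize` is `26`.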
flaie, over 10 years ago
Interesting read for a Haskell newcomer like me!

Regarding the original webpage of Norvig's spelling corrector, I think it is not up to date, as I remember browsing the web and finding some shorter versions in other languages.

I've shortened the Python version to 14/15 lines using some features of Python 3.
wyager, over 10 years ago
Cool! Since we're suggesting changes, here's what I'd do. (Not that anything is wrong with the OP's code, just that it's good to point out all the different stylistic techniques you can adopt.)

    alphabet = ['a'..'z']                                                         -- line 7
    nWords = B.readFile "big.txt" >>= return . train . lowerWords . B.unpack     -- line 8

or:

    nWords = train . lowerWords . B.unpack <$> B.readFile "big.txt"              -- line 8

Make `splits`, `deletes`, etc. values (not functions). `splits` has access to `w`, so there's no need to pass it as an argument 4 times (or even to pass `w` as an argument to the other functions).

    sortCandidates = (sortBy (flip (comparing snd))) . M.toList                   -- line 27
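
To see that the two spellings of `nWords` agree, here is a minimal self-contained sketch. `train` and `readCorpus` are hypothetical stand-ins (the post's real `train` and the `big.txt` file are not reproduced here), and `sortCandidates'` is an assumed alternative using `sortOn` with `Down`, not something from the thread.

```haskell
import Data.List (sortBy, sortOn)
import qualified Data.Map as M
import Data.Ord (Down (..), comparing)

-- Hypothetical stand-in for the post's `train`: count word occurrences.
train :: [String] -> M.Map String Int
train ws = M.fromListWith (+) [(w, 1) | w <- ws]

-- Hypothetical stand-in for reading "big.txt", so no file is needed.
readCorpus :: IO String
readCorpus = pure "the cat and the hat and the bat"

-- Monadic spelling: bind the result, then return the transformed value.
nWordsBind :: IO (M.Map String Int)
nWordsBind = readCorpus >>= return . train . words

-- Functor spelling: map the pure pipeline over the IO action.
nWordsFmap :: IO (M.Map String Int)
nWordsFmap = train . words <$> readCorpus

-- Most frequent words first, as in the comment above.
sortCandidates :: M.Map String Int -> [(String, Int)]
sortCandidates = sortBy (flip (comparing snd)) . M.toList

-- Assumed alternative: the same ordering expressed with sortOn and Down.
sortCandidates' :: M.Map String Int -> [(String, Int)]
sortCandidates' = sortOn (Down . snd) . M.toList
```

Both `nWordsBind` and `nWordsFmap` produce the same map, and `sortCandidates` puts `("the",3)` first for this sample text.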
bshimmin, over 10 years ago
Not to make any particular point, but mainly just because I fancied a bit of procrastination this afternoon, here's a CoffeeScript version (heavily leaning on Underscore): https://gist.github.com/benshimmin/2ee78c932797faadfc89
dschiptsov, over 10 years ago
Which "proves" again that programming is neither about OO nor about purity..)