Repeating, mutatis mutandis, what I said in an earlier thread (<a href="http://news.ycombinator.com/item?id=2723366" rel="nofollow">http://news.ycombinator.com/item?id=2723366</a>) about the C code on which this is based:<p>It's very nice pattern-matching code, but it isn't a regular expression matcher. The class of patterns it's able to match is much smaller than the class of all regular expressions -- not because of missing "sugar" (character classes, +, non-greedy wildcards, ...) but because it doesn't support groups or the | operator.<p>As a result, the language supported by this matcher has, e.g., no way to match the pattern (ab)* or the pattern a|b. It's much, much less powerful than an actual regular expression matcher would be, and much of what makes it possible to do it in 19 lines of Clojure is that loss of power.<p>(I've written extremely similar code before: this level of functionality -- basically, glob or DOS wildcards -- is pretty useful. I'm not Rob Pike, and my code for similar functionality would probably be longer than his. But my code, or even Rob Pike's code, for even the simplest thing that could honestly be called a regular expression matcher, would be longer than this by a bigger factor.)
I'm no haskell hacker (by a long, long, long shot[0]), and I've played a lot more with Clojure than with Haskell, but I still found this easier to write and read in haskell than in clojure.<p><pre><code> matches :: String -> String -> Bool
matches rs = dm ((fixHead . fixTail) rs) where
fixHead ('^':rs) = rs
fixHead rs = '.':'*':rs
fixTail rs = case (reverse rs) of
('$':rrs) -> reverse rrs
rrs -> reverse ('*':'.':rrs)
dm "" "" = True
dm ('.':'*':rs) (c:cs) = (dm ('.':'*':rs) cs) || (dm rs (c:cs))
dm (r:'*':rs) (c:cs) = (dm rs (c:cs)) || (c == r && (dm (r:'*':rs) cs))
dm ('.':rs) (c:cs) = dm rs cs
dm (r:rs) (c:cs) = r == c && dm rs cs
dm (_:'*':[]) "" = True
dm _ _ = False
</code></pre>
[0] I'm particularly ashamed of fixHead and fixTail and the way I special cased matching a star and nothing else against the empty string<p>edit: couldn't resist doing the golf thing<p><pre><code> matches :: String -> String -> Bool
matches rs = dm (fixHead rs) where
fixHead ('^':rs) = rs
fixHead rs = '.':'*':rs
cm r c = r == '.' || r == c
dm (r:'*':rs) (c:cs) = (dm rs (c:cs)) || ((cm r c) && (dm (r:'*':rs) cs))
dm (_:'*':"") "" = True
dm (r:rs) (c:cs) = cm r c && dm rs cs
dm "" _ = True
dm "$" "" = True
dm _ _ = False</code></pre>
LOC (Lines of Code) is not a good characterization of the quality of code; clarity and understandability is. The Clojure implementation is pretty nice on those counts.<p>The LOC measure may be important when, for example, the goal is to show the number of operations to be as small as possible.
I so understand this <a href="http://www.cs.princeton.edu/courses/archive/spr09/cos333/beautiful.html" rel="nofollow">http://www.cs.princeton.edu/courses/archive/spr09/cos333/bea...</a> way better than clojure. What matters? number of lines or readability? and I am sure the Clojure implementation would be better understood by people who understand lisp. Ofcourse I haven't played with clojure or any lisp dialects and the main reason for that is when I see a lisp code I see only "(" and ")". I still don't know how to get out of that.
Both versions are really nice but I find the C version to be lots easier to understand (I come from a C background, so that's probably not a surprise).<p>What does this look like to an experienced lisp/clojure programmer, do they have the same feeling but reversed?<p>What about programmers familiar with other languages?