科技回声

14 comments

draegtun, more than 10 years ago

Thought this might be of interest; here is how the examples provided would look in Rebol:<pre><code>digits: digit: charset "0123456789"
rule: [ thru "$" some digits "." digit digit ]
parse "$10.00" rule ;; true

pattern: [ some "p" 2 "q" any "q" ]
new-rule: [ 2 pattern ]
parse "pqqpqq" new-rule ;; true
</code></pre>
Rebol doesn't have regular expressions; instead it comes with a parse dialect, which is a TDPL: <a href="http://en.wikipedia.org/wiki/Top-down_parsing_language" rel="nofollow">http://en.wikipedia.org/wiki/Top-down_parsing_language</a>

Some parse refs: <a href="http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse/Parse_expressions" rel="nofollow">http://en.wikibooks.org/wiki/REBOL_Programming/Language_Feat...</a> | <a href="http://www.rebol.net/wiki/Parse_Project" rel="nofollow">http://www.rebol.net/wiki/Parse_Project</a> | <a href="http://www.rebol.com/r3/docs/concepts/parsing-summary.html" rel="nofollow">http://www.rebol.com/r3/docs/concepts/parsing-summary.html</a>

tragomaskhalos, more than 10 years ago

There have been many efforts similar to this in many languages, but most of us seem happy to stick to the more succinct canonical form, supplemented with /x and # comments when things get too hairy.
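For readers who have not used the /x style mentioned above, here is a minimal sketch in Python, where the equivalent flag is `re.VERBOSE`. The dollar-amount pattern and the helper name `is_price` are assumptions chosen for illustration; they are not taken from the thread.

```python
import re

# Verbose mode ignores unescaped whitespace in the pattern and treats
# '#' as the start of a comment, so each piece can be annotated.
# The pattern is an illustrative assumption: a dollar amount like "$10.00".
PRICE = re.compile(r"""
    \$          # literal dollar sign
    \d+         # one or more digits (whole dollars)
    \.          # literal decimal point
    \d{2}       # exactly two digits (cents)
""", re.VERBOSE)


def is_price(text: str) -> bool:
    """Return True when the whole string is a dollar amount."""
    return PRICE.fullmatch(text) is not None
```

The compiled pattern is the same as the one-line form `\$\d+\.\d{2}`; only the source layout differs.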

marktangotango, more than 10 years ago

Generally, I find that if one's regexes are so complex that one needs visualizers or other aids in writing them, one doesn't have a regex problem, but a parsing problem. The method of parsing by recursive descent can often lead to much more understandable (if more verbose) "pattern matching".
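As a rough illustration of the recursive descent point, the sketch below checks a parenthesised group nested to any depth, which is awkward to express as a classic regex. The grammar and the function names `parse_group` and `is_balanced_group` are assumptions made for this example.

```python
def parse_group(text: str, pos: int = 0) -> int:
    """Parse '(' item* ')' starting at pos and return the index just past ')'.

    An item is either a nested group or any single non-parenthesis character.
    Raises ValueError on malformed input.
    """
    if pos >= len(text) or text[pos] != "(":
        raise ValueError(f"expected '(' at index {pos}")
    pos += 1
    while pos < len(text) and text[pos] != ")":
        if text[pos] == "(":
            pos = parse_group(text, pos)  # recurse into the nested group
        else:
            pos += 1                      # consume an ordinary character
    if pos >= len(text):
        raise ValueError("unclosed '('")
    return pos + 1                        # step past the closing ')'


def is_balanced_group(text: str) -> bool:
    """Return True when the whole string is one well-formed group."""
    try:
        return parse_group(text) == len(text)
    except ValueError:
        return False
```

Each grammar rule becomes one function, so the code is longer than a pattern but can be read and debugged step by step.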

UnoriginalGuy, more than 10 years ago

Looks like Linq (from .Net/C#). Pretty sexy way to write Regular Expressions if you ask me.

I've "learned" regular expressions multiple times but it just never sticks, I have no idea why. It certainly doesn't help that there are several different incompatible syntaxes (so what I remember and think "should" work doesn't).

I'd prefer to write regexes in this style, but I would pay attention to performance (not that Regular Expressions are high performance, but I wouldn't want to see a large performance loss either).
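On the performance worry, one possible design is a builder that only assembles a pattern string and then hands it to the ordinary regex engine, so the chaining cost is paid once at construction time. The sketch below assumes that design; `PatternBuilder` and its methods are hypothetical and are not the RegExpBuilder API, and the thread does not say how RegExpBuilder is implemented.

```python
import re


class PatternBuilder:
    """Hypothetical fluent builder that accumulates regex source text."""

    def __init__(self):
        self._parts = []

    def literal(self, text):
        # re.escape makes characters such as '$' and '.' match literally.
        self._parts.append(re.escape(text))
        return self

    def digits(self, minimum=1, maximum=None):
        # Emits \d{min,max}; an empty upper bound means "no limit".
        upper = "" if maximum is None else str(maximum)
        self._parts.append(rf"\d{{{minimum},{upper}}}")
        return self

    def compile(self):
        # After this call, matching runs on a normal compiled pattern.
        return re.compile("".join(self._parts))


price = (
    PatternBuilder()
    .literal("$")
    .digits(1)
    .literal(".")
    .digits(2, 2)
    .compile()
)
```

Here `price` is an ordinary compiled pattern object, so matching with it costs the same as matching with a hand-written pattern of the same text.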

chris-at, more than 10 years ago

Thanks, this is a lot better than writing this (even if the formatting worked here):
```
(?xi)
\b
(                                       # Capture 1: entire matched URL
  (?:
    [a-z][\w-]+:                        # URL protocol and colon
    (?:
      /{1,3}                            # 1-3 slashes
      |                                 #   or
      [a-z0-9%]                         # Single letter or digit or '%'
                                        # (Trying not to match e.g. "URI::Escape")
    )
    |                                   #   or
    www\d{0,3}[.]                       # "www.", "www1.", "www2." … "www999."
    |                                   #   or
    [a-z0-9.\-]+[.][a-z]{2,4}/          # looks like domain name followed by a slash
  )
  (?:                                   # One or more:
    [^\s()<>]+                          # Run of non-space, non-()<>
    |                                   #   or
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
  )+
  (?:                                   # End with:
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
    |                                   #   or
    [^\s`!()\[\]{};:'".,<>?«»“”‘’]      # not a space or one of these punct chars
  )
)
```

jluxenberg, more than 10 years ago

S-expressions are a natural fit for construction of regular expressions, see <a href="http://community.schemewiki.org/?scheme-faq-programming#H-1w56qpn" rel="nofollow">http://community.schemewiki.org/?scheme-faq-programming#H-1w...</a>

e.g.<pre><code>(: (or (in ("az")) (in ("AZ"))) (* (uncase (in ("az09")))))</code></pre>

jgalt212, more than 10 years ago

Definitely a debuggable way to write regexes. Whenever I have to maintain a hairy regex, I like to plot the regex as a railroad diagram.

These web-based tools can do it: <a href="https://www.debuggex.com/" rel="nofollow">https://www.debuggex.com/</a> and <a href="http://jex.im/regulex/" rel="nofollow">http://jex.im/regulex/</a>

dkarapetyan, more than 10 years ago

Generalize just a little bit and you got parser combinators.
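To make the parser combinator remark concrete, here is a small sketch in which a parser is a function from `(text, position)` to a new position, or `None` on failure. All names (`lit`, `seq`, `repeat`, `matches`) are assumptions for this example. It rebuilds the "pqqpqq" rule from the Rebol comment earlier in the thread, and unlike a backtracking regex engine its repetition is greedy and never backtracks.

```python
from typing import Callable, Optional

# A parser takes the text and a start index and returns the index after
# the match, or None when it does not match at that position.
Parser = Callable[[str, int], Optional[int]]


def lit(s: str) -> Parser:
    """Match the literal string s."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Match each parser in order, threading the position through."""
    def parse(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def repeat(p: Parser, minimum: int = 0) -> Parser:
    """Match p greedily as often as possible, at least `minimum` times."""
    def parse(text: str, pos: int) -> Optional[int]:
        count = 0
        while True:
            nxt = p(text, pos)
            if nxt is None or nxt == pos:  # stop on failure or no progress
                break
            pos = nxt
            count += 1
        return pos if count >= minimum else None
    return parse


def matches(p: Parser, text: str) -> bool:
    """Return True when p consumes the entire text."""
    return p(text, 0) == len(text)


# Rebol: pattern: [ some "p" 2 "q" any "q" ]   new-rule: [ 2 pattern ]
pattern = seq(repeat(lit("p"), 1), lit("q"), lit("q"), repeat(lit("q")))
new_rule = seq(pattern, pattern)
```

Because combinators are ordinary functions, `pattern` can be reused inside `new_rule` just as a named sub-rule is reused in the Rebol version.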

zzzcpan, more than 10 years ago

Regexes exist to avoid cumbersome code like this and to make it less error-prone. Makes me sad to see so many upvotes.

I get that some people have a hard time understanding regexes with all the backtracking and greediness. Yes, the syntax is a bit complicated. Maybe a simplified, predictable default mode could help. But there is no problem with a DSL being used as an abstraction. In fact, we need more DSLs, for everything!

psychometry, more than 10 years ago

Now you have three problems.

kazinator, more than 10 years ago

Yes, regexes can have other syntactic representations, like:<pre><code>(compound "$" (1+ :digit) "." :digit :digit)
</code></pre>
Run:<pre><code>$ txr -p "(regex-compile '(compound \"$\" (1+ :digit) \".\" :digit :digit))"
#/$\d+\.\d\d/</code></pre>

epicureanideal, more than 10 years ago

Nice work! I don't know if it'll be ideal for all use cases, but it does add some readability.

otakucode, more than 10 years ago

Now do an example where you create a regex to parse the IMDB movies.list data file!

gcao, more than 10 years ago

Great work! This is very intriguing!

RegExpBuilder – Create regular expressions using chained methods
