RegExpBuilder – Create regular expressions using chained methods

85 points by jrullmann, over 10 years ago

14 comments

draegtun, over 10 years ago
Thought this might be of interest; below is how the examples provided would look in Rebol:

    digits: digit: charset "0123456789"
    rule: [ thru "$" some digits "." digit digit ]
    parse "$10.00" rule  ;; true

    pattern: [ some "p" 2 "q" any "q" ]
    new-rule: [ 2 pattern ]
    parse "pqqpqq" new-rule  ;; true

Rebol doesn't have regular expressions; instead it comes with a parse dialect, which is a TDPL: http://en.wikipedia.org/wiki/Top-down_parsing_language

Some *parse* refs: http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse/Parse_expressions | http://www.rebol.net/wiki/Parse_Project | http://www.rebol.com/r3/docs/concepts/parsing-summary.html
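For readers more at home with regular expressions, the two Rebol rules above can be approximated in Python roughly as follows. This is a hedged translation, not something from the original comment: the pattern strings and the names `dollar_rule` and `pq_rule` are illustrative, and `re.fullmatch` is used because Rebol's `parse` only returns true when the whole input is consumed. The two notations do not backtrack identically in general, so agreement is only assumed for these sample inputs.

```python
import re

# Rough counterpart of: [ thru "$" some digits "." digit digit ]
# `.*?\$` stands in for `thru "$"` (skip ahead to and past the first "$").
dollar_rule = re.compile(r".*?\$\d+\.\d\d", re.DOTALL)

# Rough counterpart of: [ 2 pattern ] with pattern = [ some "p" 2 "q" any "q" ]
pq_rule = re.compile(r"(?:p+q{2}q*){2}")

print(bool(dollar_rule.fullmatch("$10.00")))  # True
print(bool(pq_rule.fullmatch("pqqpqq")))      # True
```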
tragomaskhalos, over 10 years ago
There have been many similar efforts in many languages, but most of us seem happy to stick with the more succinct canonical form, supplemented by /x and # comments when things get too hairy.
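As an illustration of the /x approach mentioned above, here is a minimal Python sketch. Python spells the flag `re.VERBOSE`; the price pattern and the name `price` are made up for this example and loosely follow the dollar-amount example discussed elsewhere in the thread.

```python
import re

# re.VERBOSE is Python's counterpart of the /x flag: whitespace in the
# pattern is ignored and `#` starts a comment.
price = re.compile(
    r"""
    \$          # literal dollar sign
    \d+         # one or more digits (whole dollars)
    \.          # literal decimal point
    \d{2}       # exactly two digits (cents)
    """,
    re.VERBOSE,
)

print(price.search("total: $10.00").group())  # $10.00
```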
marktangotango, over 10 years ago
Generally, I find that if one's regexes are so complex that one needs visualizers or other aids to write them, one doesn't have a regex problem but a parsing problem. Parsing by recursive descent can often lead to much more understandable (if more verbose) "pattern matching".
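To make the recursive-descent point concrete, here is a small Python sketch that checks parentheses nested to any depth, something a fixed regex can only approximate to a bounded number of levels. The grammar (`group := "(" (group | other-char)* ")"`) and the function names are assumptions chosen for illustration, not taken from the comment.

```python
def parse_group(text: str, pos: int = 0) -> int:
    """Parse '(' item* ')' starting at pos; return the index just past ')'."""
    if pos >= len(text) or text[pos] != "(":
        raise ValueError(f"expected '(' at index {pos}")
    pos += 1
    while pos < len(text) and text[pos] != ")":
        if text[pos] == "(":
            pos = parse_group(text, pos)  # recurse into the nested group
        else:
            pos += 1  # any other character is an ordinary item
    if pos >= len(text):
        raise ValueError("unbalanced: missing ')'")
    return pos + 1  # step past the closing ')'


def is_balanced_group(text: str) -> bool:
    """True if the whole string is exactly one balanced group."""
    try:
        return parse_group(text) == len(text)
    except ValueError:
        return False


print(is_balanced_group("(a(b(c))d)"))  # True
print(is_balanced_group("(a(b)"))       # False
```

Each grammar rule becomes one function, which is where the extra verbosity and the extra readability both come from.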
UnoriginalGuy, over 10 years ago
Looks like LINQ (from .NET/C#). Pretty sexy way to write regular expressions if you ask me.

I've "learned" regular expressions multiple times but it just never sticks, and I have no idea why. It certainly doesn't help that there are several different incompatible syntaxes (so what I remember and think "should" work doesn't).

I'd prefer to write regexes in this style; however, I would pay attention to performance (not that regular expressions are high performance, but I wouldn't want to see a large performance loss either).
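On the performance question, one plausible design is that a chained builder only assembles a pattern string and compiles it once, so matching costs the same as a hand-written regex. The Python sketch below illustrates that idea under that assumption. `PatternBuilder` and its methods are hypothetical names invented here; they are not RegExpBuilder's actual API, and whether RegExpBuilder works this way is not confirmed by the thread.

```python
import re


class PatternBuilder:
    """Hypothetical chained builder; NOT RegExpBuilder's real API."""

    def __init__(self) -> None:
        self._parts = []  # regex fragments, joined at compile time

    def literal(self, text: str) -> "PatternBuilder":
        self._parts.append(re.escape(text))  # escape regex metacharacters
        return self

    def digits(self, minimum: int = 1) -> "PatternBuilder":
        self._parts.append(rf"\d{{{minimum},}}")  # e.g. \d{1,}
        return self

    def digit(self) -> "PatternBuilder":
        self._parts.append(r"\d")
        return self

    def compile(self) -> re.Pattern:
        # The chain is only string assembly; matching uses a normal regex.
        return re.compile("".join(self._parts))


price = PatternBuilder().literal("$").digits().literal(".").digit().digit().compile()
print(price.pattern)                      # \$\d{1,}\.\d\d
print(bool(price.fullmatch("$10.00")))    # True
```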
chris-at, over 10 years ago
Thanks, this is a lot better than writing this (even if the formatting worked here):

```
(?xi)
\b
(                               # Capture 1: entire matched URL
  (?:
    [a-z][\w-]+:                # URL protocol and colon
    (?:
      /{1,3}                    # 1-3 slashes
      |                         #   or
      [a-z0-9%]                 # Single letter or digit or '%'
                                # (Trying not to match e.g. "URI::Escape")
    )
    |                           #   or
    www\d{0,3}[.]               # "www.", "www1.", "www2." … "www999."
    |                           #   or
    [a-z0-9.\-]+[.][a-z]{2,4}/  # looks like domain name followed by a slash
  )
  (?:                           # One or more:
    [^\s()<>]+                  # Run of non-space, non-()<>
    |                           #   or
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
  )+
  (?:                           # End with:
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
    |                           #   or
    [^\s`!()\[\]{};:'".,<>?«»“”‘’]      # not a space or one of these punct chars
  )
)
```
jluxenberg, over 10 years ago
S-expressions are a natural fit for constructing regular expressions; see http://community.schemewiki.org/?scheme-faq-programming#H-1w56qpn

e.g.

    (: (or (in ("az")) (in ("AZ"))) (* (uncase (in ("az09")))))
jgalt212, over 10 years ago
Definitely a debuggable way to write regexes. Whenever I have to maintain a hairy regex, I like to plot it as a railroad diagram.

These web-based tools can do it:

https://www.debuggex.com/

http://jex.im/regulex/
dkarapetyan, over 10 years ago
Generalize just a little bit and you got parser combinators.
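A minimal Python sketch of what that generalization can look like: each parser is a function from `(text, position)` to a new position or `None`, and combinators build bigger parsers from smaller ones. All names here (`lit`, `one_of`, `seq`, `many1`, `dollars`) are illustrative choices, not from any particular library, and `many1` is greedy with no backtracking.

```python
from typing import Callable, Optional

# A parser takes (text, pos) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]


def lit(s: str) -> Parser:
    """Match the exact string s."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def one_of(chars: str) -> Parser:
    """Match a single character drawn from chars."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + 1 if pos < len(text) and text[pos] in chars else None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def many1(p: Parser) -> Parser:
    """Apply p one or more times, greedily (p must consume input)."""
    def parse(text: str, pos: int) -> Optional[int]:
        nxt = p(text, pos)
        if nxt is None:
            return None
        while nxt is not None:
            pos, nxt = nxt, p(text, nxt)
        return pos
    return parse


digit = one_of("0123456789")
dollars = seq(lit("$"), many1(digit), lit("."), digit, digit)

print(dollars("$10.00", 0))  # 6
print(dollars("$10.0", 0))   # None
```

Because parsers are ordinary functions, a combinator can refer to itself recursively, which is the step beyond what a chained regex builder offers.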
zzzcpan, over 10 years ago
Regexps exist to avoid cumbersome code like this and to make it less error-prone. Makes me sad to see so many upvotes.

I get that some people have a hard time understanding regexps, with all the backtracking and greediness. Yes, the syntax is a bit complicated. Maybe a simplified, predictable default mode could help. But there is no problem with a DSL being used as an abstraction. In fact, we need more DSLs, for everything!
psychometry, over 10 years ago
Now you have three problems.
kazinator, over 10 years ago
Yes, regexes can have other syntactic representations, like:

    (compound "$" (1+ :digit) "." :digit :digit)

Run:

    $ txr -p "(regex-compile '(compound \"$\" (1+ :digit) \".\" :digit :digit))"
    #/$\d+\.\d\d/
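The same idea, a regex written as a data structure and then compiled to the usual notation, can be sketched in Python with nested tuples. This is an analogy only: `to_regex`, the `DIGIT` marker, and the operator names mirror the example above for readability, but they are assumptions made here and do not reproduce TXR's `regex-compile` or its output format.

```python
import re


def to_regex(node) -> str:
    """Translate a nested-tuple regex tree into Python regex syntax."""
    if isinstance(node, str):  # plain string: match it literally
        return re.escape(node)
    op, *args = node
    if op == "digit":          # character class
        return r"\d"
    if op == "compound":       # concatenation of sub-expressions
        return "".join(to_regex(a) for a in args)
    if op == "1+":             # one or more repetitions
        return f"(?:{to_regex(args[0])})+"
    raise ValueError(f"unknown operator: {op!r}")


DIGIT = ("digit",)
tree = ("compound", "$", ("1+", DIGIT), ".", DIGIT, DIGIT)
pattern = to_regex(tree)

print(pattern)                               # \$(?:\d)+\.\d\d
print(bool(re.fullmatch(pattern, "$10.00"))) # True
```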
epicureanideal, over 10 years ago
Nice work! I don't know if it'll be ideal for all use cases, but it does add some readability.
otakucode, over 10 years ago
Now do an example where you create a regex to parse the IMDB movies.list data file!
gcao, over 10 years ago
Great work! This is very intriguing!