TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Regular Expression Matching Can Be Simple and Fast (2007)

31 点作者 humanarity大约 10 年前

5 条评论

tikhonj大约 10 年前
…if your regexes are, you know, <i>regular</i>. Which Perl-style regular expressions most definitely are not. So it&#x27;s not a matter of just implementing the regexes you know faster, but limiting them to a subset of features first.<p>This is a perfectly valid thing to do, but it <i>is</i> changing the power of regexes significantly. I feel the article makes this fact easy to overlook—I certainly did until I took a class on automata.<p>But yes, presumably Perl could use an NFA-based algorithm if given a regex that is actually regular. One of my professors attributed the fact that it doesn&#x27;t to old patent issues, but I have not been able to find a good source for that. It&#x27;s also possible that it&#x27;s just a case of worse is better: backtracking is fast enough 99% of the time, and if performance really matters, perhaps you shouldn&#x27;t be using regexes anyhow.
评论 #9378566 未加载
评论 #9378690 未加载
评论 #9378074 未加载
pcwalton大约 10 年前
To be honest, I&#x27;ve never been much of a fan of this article, because it neglects the fastest way to implement regexes <i>in practice</i>: use a fast JIT to emit code that performs recursive backtracking. The Thompson NFA is a better algorithm for pathological regexes that people don&#x27;t frequently write (e.g. &quot;(a?)﹡a﹡)&quot;), but the constant factors are much worse for common cases.<p>The Thompson NFA has its place as a fallback to avoid exponential blowup on pathological regexes, but if I were implementing a performance-critical regex engine, I would start with a JIT for the recursive backtracking approach, because in practice constant factors dominate over algorithmic worst-time bounds, and only add the Thompson NFA later.
评论 #9378699 未加载
humanarity大约 10 年前
And Ken Thompson&#x27;s original 1968 paper,<p><a href="http:&#x2F;&#x2F;www.fing.edu.uy&#x2F;inco&#x2F;cursos&#x2F;intropln&#x2F;material&#x2F;p419-thompson.pdf" rel="nofollow">http:&#x2F;&#x2F;www.fing.edu.uy&#x2F;inco&#x2F;cursos&#x2F;intropln&#x2F;material&#x2F;p419-th...</a>
aioprisan大约 10 年前
Any good web-based regex tools that you use daily?
评论 #9378129 未加载
评论 #9383532 未加载
评论 #9378137 未加载
评论 #9388293 未加载
评论 #9379517 未加载
smegel大约 10 年前
And when was the last time anyone used grouping in a regex!
评论 #9378082 未加载
评论 #9377754 未加载