TechEcho

4 comments

ufoover 4 years ago

> For now it remains experimental because Irregexp is orders of magnitude faster than the new engine on most common patternsI'm sure that an important reason for this is that thousands of man-hours have been invested optimizing Irregexp. However, is that the only explanation?In my experience, while an NFA-based engine has better asymptotic complexity, a backtracking-based engine can be hard to beat for patterns that don't do much backtracking. In a textbook NFA automaton, the time spent maintaining a set of parallel states may be "wasted" if it turns out that the first state in the list is already going to succeed anyway.Regex experts of Hacker News, do you know if there are any engines and/or algorithms that try to get the best of both worlds? Fast execution for patterns that don't need much backtracking, but avoiding exponential run time in the worst case?

评论 #25742008 未加载

评论 #25746395 未加载

评论 #25741885 未加载

IlliOnatoover 4 years ago

> Unfortunately, backreferences, lookahead and lookbehind cannot be supported without major changes that alter asymptotic worst-case complexity.I live and die by regexes, and regexes without backreferences and lookahead would be pretty useless to me.I guess such regexes are useful for _something_, but I am not sure what it would be.

评论 #25741436 未加载

评论 #25741414 未加载

评论 #25742110 未加载

recursiveover 4 years ago

> For the fallback mechanism to kick in, the RegExp must: > * ... > * not have the u (Unicode) or i (case insensitive) flags set.I wonder why these flags are significant. Case-insensitivity and unicode support don't seem like they should really affect whatever algorithm they're using.Am I underthinking this, or can you convert /abc/i into /[aA][bB][cC]/ ?

评论 #25741068 未加载

评论 #25742018 未加载

评论 #25746000 未加载

yunruseover 4 years ago

It seems more developer-friendly to me to, at Regex construction, try to parse as linear, and fallback to a backtracking regex if not. The engine is an implementation detail that programmers probably care very little about; a linear engine being default but not raising errors would be the best-of-all-worlds scenario.(It’s a little weird to see the i flag not working with linear regexes, mind. Surely that’s easily implemented when the regex is compiled?)

评论 #25741138 未加载

评论 #25741282 未加载

4 comments

ufoover 4 years ago

评论 #25742008 未加载

评论 #25746395 未加载

评论 #25741885 未加载

IlliOnatoover 4 years ago

评论 #25741436 未加载

评论 #25741414 未加载

评论 #25742110 未加载

recursiveover 4 years ago

评论 #25741068 未加载

评论 #25742018 未加载

评论 #25746000 未加载

yunruseover 4 years ago

评论 #25741138 未加载

评论 #25741282 未加载

An additional non-backtracking RegExp engine

4 comments

An additional non-backtracking RegExp engine

4 comments