> For now it remains experimental because Irregexp is orders of magnitude faster than the new engine on most common patterns

I'm sure that an important reason for this is that thousands of man-hours have been invested in optimizing Irregexp. However, is that the only explanation?

In my experience, while an NFA-based engine has better asymptotic complexity, a backtracking engine can be hard to beat on patterns that don't do much backtracking. In a textbook NFA simulation, the time spent maintaining a set of parallel states may be "wasted" if it turns out that the first state in the list would have succeeded anyway.

Regex experts of Hacker News: do you know of any engines and/or algorithms that try to get the best of both worlds? Fast execution for patterns that don't need much backtracking, but no exponential run time in the worst case?
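To make the "wasted work" concrete, here is a minimal sketch of the textbook simulation in TypeScript, over a toy NFA representation I made up for illustration. Every input character is pushed through every live state, even when the first state alone would carry the match through:

```typescript
// Toy NFA for illustration: character transitions plus epsilon edges.
type StateId = number;

interface NfaState {
  edges: Map<string, StateId[]>; // transitions on input characters
  epsilon: StateId[];            // epsilon transitions
  accepting: boolean;
}

// Grow the set of states by following epsilon edges to a fixed point.
function epsilonClosure(nfa: NfaState[], states: Set<StateId>): Set<StateId> {
  const stack = [...states];
  while (stack.length > 0) {
    const s = stack.pop()!;
    for (const next of nfa[s].epsilon) {
      if (!states.has(next)) {
        states.add(next);
        stack.push(next);
      }
    }
  }
  return states;
}

// Textbook simulation: O(input length * number of states) worst case.
function simulate(nfa: NfaState[], start: StateId, input: string): boolean {
  let current = epsilonClosure(nfa, new Set([start]));
  for (const ch of input) {
    const next = new Set<StateId>();
    // This inner loop is the parallel-state bookkeeping: we advance *every*
    // live state on every character, even if the first one alone already
    // determines the match. A backtracker skips this cost on the happy path.
    for (const s of current) {
      for (const t of nfa[s].edges.get(ch) ?? []) {
        next.add(t);
      }
    }
    current = epsilonClosure(nfa, next);
    if (current.size === 0) return false;
  }
  return [...current].some((s) => nfa[s].accepting);
}
```

A backtracker in the same situation commits to one alternative and only pays for the others when it actually has to backtrack, which is exactly why it wins on patterns that rarely do.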
> Unfortunately, backreferences, lookahead and lookbehind cannot be supported without major changes that alter asymptotic worst-case complexity.

I live and die by regexes, and regexes without backreferences and lookahead would be pretty useless to me.

I guess such regexes are useful for _something_, but I'm not sure what that would be.
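For concreteness, two standard examples (mine, not from the article) of what a pure linear-time engine has to give up:

```typescript
// Backreference: find a doubled word such as "the the". Matching \1 means
// remembering what the group captured, which doesn't fit a finite automaton.
const doubledWord = /\b(\w+)\s+\1\b/;
console.log(doubledWord.test("the the quick fox")); // true

// Lookahead: require a digit somewhere without consuming input, as in a
// password-strength check.
const hasDigit = /^(?=.*\d).+$/;
console.log(hasDigit.test("hunter2")); // true
```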
> For the fallback mechanism to kick in, the RegExp must:
> * ...
> * not have the u (Unicode) or i (case insensitive) flags set.

I wonder why these flags are significant. Case-insensitivity and Unicode support don't seem like they should really affect whatever algorithm they're using.

Am I underthinking this, or can you convert /abc/i into /[aA][bB][cC]/?
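For ASCII it really is that simple. A naive sketch of the desugaring (my own illustration; it only handles bare letter literals, not letters already inside classes or escapes):

```typescript
// Naive ASCII-only desugaring of the i flag: rewrite each bare letter into
// a two-character class. Illustration only; it ignores Unicode case folding,
// which is where the real difficulty lives.
function desugarIgnoreCase(source: string): string {
  return source.replace(
    /[a-zA-Z]/g,
    (ch) => `[${ch.toLowerCase()}${ch.toUpperCase()}]`
  );
}

console.log(desugarIgnoreCase("abc")); // "[aA][bB][cC]"
```

The catch is Unicode case folding under the u flag: some mappings aren't one-to-one (e.g. "ß" uppercases to "SS", and the Kelvin sign U+212A lowercases to plain "k"), so the rewrite stops being a local two-character class. Presumably that is why the flag combinations were deferred rather than being fundamentally impossible.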
It seems more developer-friendly to me to try, at RegExp construction, to parse as linear, and fall back to a backtracking regex if that fails; a sketch of the idea follows below. The engine is an implementation detail that programmers probably care very little about; a linear engine being the default but never raising errors would be the best-of-all-worlds scenario.

(It’s a little weird to see the i flag not working with linear regexes, mind. Surely that’s easily implemented when the regex is compiled?)
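Something close to this is already expressible in userland. A sketch, assuming V8's experimental l (linear) flag behind --enable-experimental-regexp-engine, where patterns the linear engine can't handle make the constructor throw:

```typescript
// Userland version of the proposed construct-time fallback, assuming V8's
// experimental "l" flag. Non-linearizable patterns (and, on engines without
// the flag, the unrecognized flag itself) throw a SyntaxError, so the catch
// falls through to the ordinary backtracking engine either way.
function makeRegExp(source: string, flags = ""): RegExp {
  try {
    return new RegExp(source, flags + "l"); // linear engine if eligible
  } catch {
    return new RegExp(source, flags);       // backtracking fallback
  }
}

const fast = makeRegExp("a+b");           // eligible: linear engine
const slow = makeRegExp("(\\w+)\\s+\\1"); // backreference: falls back
```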