This technique is unreliable in practice, and the author's discussion is confused.<p>First, their explanation doesn't make sense. They suppose that a matcher examines the possible matches in some determinate order. But that's provably not the case: if it were, deterministic and non-deterministic finite automata would be inequivalent.<p>Yet the technique in question does seem to require some determinacy as to which of several alternatives will match against a string. Where does that determinacy come from? The semantics of the alternation operator (the '|'), as usually formulated, don't specify any preference among alternatives. For that reason, POSIX <i>additionally</i> requires that a matcher return the leftmost match and, if several matches start at that position, the longest of them. Where you do find an explicit guarantee about which of several possible ways of matching will be preferred, it's almost certainly because the engine is aiming at POSIX compliance.<p>Such compliance has a significant cost, though, as it requires the matcher to consider <i>all</i> possible matches (in order to find the longest). For that reason, most regex engines forgo strict POSIX compliance and only guarantee that some match will be returned if one exists, not that the match will be the leftmost longest. Some engines offer the option of requesting strict POSIX behavior, but the default is usually to eagerly return the first match encountered (and recall the point above that there provably can't be a guarantee, in general, about the order in which matches are encountered).<p>You should never do this in production code unless you're sure that your matcher is POSIX-compliant.
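<p>As a rough illustration of the gap between eager matching and the longest-match rule described above, here is a minimal sketch. It assumes Python's <code>re</code> module as the example of an eager engine (it tries alternatives in the order they are written), and the helpers <code>eager_match</code> and <code>longest_match</code> are hypothetical names invented for this example. <code>longest_match</code> only emulates the "longest" half of the POSIX rule for literal alternatives anchored at the start of the string; it is not a POSIX matcher.

```python
import re


def eager_match(alternatives, text):
    """Match at the start of text with Python's re (an eager engine).

    The alternatives are literal strings joined with '|'. The result
    depends on the order in which they are written.
    """
    pattern = "|".join(re.escape(a) for a in alternatives)
    m = re.match(pattern, text)
    return m.group(0) if m else None


def longest_match(alternatives, text):
    """Emulate the longest-match rule at position 0 (illustration only).

    Every alternative is considered, and the longest one that matches at
    the start of text is returned, regardless of the order given.
    """
    candidates = [a for a in alternatives if text.startswith(a)]
    return max(candidates, key=len) if candidates else None


if __name__ == "__main__":
    print(eager_match(["a", "ab"], "ab"))    # a
    print(eager_match(["ab", "a"], "ab"))    # ab
    print(longest_match(["a", "ab"], "ab"))  # ab
    print(longest_match(["ab", "a"], "ab"))  # ab
```

<p>With the eager engine, reordering the alternatives changes the answer; under the longest-match rule it doesn't. Code that silently depends on one behavior will break when run on an engine that implements the other.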