TechEcho

1 comment

raiph about 8 years ago

I've always loved that post but the truth is that while you can <i>not</i> parse [X]HTML with what I'll call a "regular expression", by which I mean the formal CS definition [1], you can with a suitable "regex", by which I mean what most folk mean by the term "regex".[2]<p>[1] <a href="https://en.wikipedia.org/wiki/Regular_expression#Formal_definition" rel="nofollow">https://en.wikipedia.org/wiki/Regular_expression#Formal_defi...</a><p>[2] PCRE engines support recursive matching etc. but perhaps the most illuminating example is regex in Perl 6 such as this JSON grammar (a Perl 6 grammar is a class containing Perl 6 named regexes): <a href="https://github.com/moritz/json/blob/master/lib/JSON/Tiny/Grammar.pm" rel="nofollow">https://github.com/moritz/json/blob/master/lib/JSON/Tiny/Gra...</a>
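To make the comment's distinction concrete, here is a minimal sketch of the "named regexes that call each other" idea, written in Python rather than Perl 6. It is not the linked JSON grammar and not PCRE recursion; Python's standard `re` module has no recursive matching, so the recursion lives in an ordinary function instead. The toy tag syntax (lowercase tag names, no attributes) and the names `OPEN`, `CLOSE`, `TEXT`, `parse_nodes`, and `parse` are all assumptions made for illustration.

```python
import re

# Token-level patterns: each one is a plain, non-recursive regex.
OPEN = re.compile(r"<([a-z]+)>")
CLOSE = re.compile(r"</([a-z]+)>")
TEXT = re.compile(r"[^<]+")


def parse_nodes(src, pos=0):
    """Parse text runs and elements starting at pos.

    Returns (nodes, new_pos). Stops at a closing tag or at end of input.
    """
    nodes = []
    while pos < len(src):
        if m := TEXT.match(src, pos):
            nodes.append(m.group())
            pos = m.end()
        elif m := OPEN.match(src, pos):
            name = m.group(1)
            # The recursive step: this is what a formal regular
            # expression cannot express for arbitrary nesting depth.
            children, pos = parse_nodes(src, m.end())
            end = CLOSE.match(src, pos)
            if not end or end.group(1) != name:
                raise ValueError(f"unclosed <{name}> at offset {m.start()}")
            nodes.append((name, children))
            pos = end.end()
        else:
            break  # a closing tag or junk: let the caller decide
    return nodes, pos


def parse(src):
    """Parse the whole string or raise ValueError."""
    nodes, pos = parse_nodes(src)
    if pos != len(src):
        raise ValueError(f"unexpected input at offset {pos}")
    return nodes
```

For example, `parse("<a>hi<b>x</b></a>")` returns a nested structure with `b` inside `a`, while `parse("<a><b></a>")` raises `ValueError` because the tags do not balance. Each pattern on its own stays within what a formal regular expression can do; the nesting is handled entirely by the function calling itself, which is roughly the role recursion plays in the extended "regex" engines the comment mentions.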

You can't parse [X]HTML with regex
