WHATWG's procedural HTML parsing spec sucks big time indeed. Ian Hickson, as the original author, once derived it from how SGML parses HTML: empty elements, inference of omitted end-tags when block-level markup shows up inside span-level markup, attribute short forms and all, modulo the commenting oddities introduced to stop browsers from rendering CSS and JS as content. The spec once captured SGML's behavior exactly, but it wasn't updated consistently as new elements were introduced, precisely because it is presented as an explicit, redundant enumeration of elements that force certain others to close or open, seemingly at random. I have absolutely no idea how one can be motivated to implement HTML parsing from WHATWG's description, or how one can seriously see it as an improvement over SGML.