I’m surprised to see the highlights don’t include another common detail of the parsing algorithm that often trips people up: table rows and cells (tr/th/td) must be in one of thead/tbody/tfoot. If they’re not, they’re implicitly nested into a tbody. As in:<p><pre><code> <table>
<!-- <tbody> -->
<tr>
<th>Column one</th>
<th>Column two</th>
</tr>
<tr>
<td>Row one col one</td>
<td>Row one col two</td>
</tr>
<!-- </tbody> -->
</table>
</code></pre>
I’ve frequently seen it cause a variety of issues with VDOM libraries, and even plain DOM libraries with a notion of declarative templates, ranging from hydration mismatch logs (meh) to actual logic errors (corruption of the real DOM when nodes aren’t where they’re expected to be).<p>Other implied/omitted tags like body can cause similar issues too, but I think that’s become a far less common “mistake” (all of these are totally <i>valid</i> since at least HTML5) in recent years.
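<p>To see the implied tbody from the example above directly, here's a minimal sketch (the variable name is illustrative; any spec-conforming browser should print the same result):<p><pre><code><script>
// Parse a table that has no explicit <tbody>, then serialize it back.
const div = document.createElement('div');
div.innerHTML = '<table><tr><td>cell</td></tr></table>';
// The parser inserted the <tbody> during parsing:
console.log(div.innerHTML);
// logs: <table><tbody><tr><td>cell</td></tr></tbody></table>
</script>
</code></pre>
This is exactly the mismatch a VDOM library hits at hydration time: its template says tr is a child of table, but the real DOM says tr is a child of tbody.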
Perhaps a more intuitive name would be "round-trip serialization HTML". That is, if you use the browser to parse and print some HTML, the output matches the source code.<p>Or in other words, it's formatted the same way the browser would format it. So you use the browser to pretty-print the HTML page and save the result as the source. It's not hard at all and could be done automatically.<p>Round-trip tests are often used to check that a deserialization routine produces data that can be serialized again with nothing lost. They even let you change the serialization format, provided you change the parser and printer to match.<p>I expect this sort of test is a lot more useful with fuzzing, though. Finding one example that works mostly just tells you that the browser's HTML printing code isn't completely broken; a single test of that sort is only useful for catching stupid bugs quickly.
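<p>As a sketch, the round-trip check could be as small as this (whether the serializer prepends the doctype, and how it treats the newline after it, are assumptions about the tooling):<p><pre><code><script>
// A source file is a "fixed point" if parse -> serialize reproduces it.
function isFixedPoint(source) {
  const doc = new DOMParser().parseFromString(source, 'text/html');
  const printed = '<!DOCTYPE html>' + doc.documentElement.outerHTML;
  return printed === source;
}
</script>
</code></pre>
A fuzzer would then push generated documents through the same function and flag any input where a second parse/serialize pass changes the output again, i.e. where the round trip isn't idempotent.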
This is called print-read consistency in the Lisp world: an object is printed in such a way that the syntax can be read to produce a similar object, or else is given a deliberately unreadable notation like #<...>, where the #< combination is required to produce a read error.<p><a href="https://stackoverflow.com/questions/70797208/what-is-print-read-consistency" rel="nofollow">https://stackoverflow.com/questions/70797208/what-is-print-r...</a>
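<p>A rough analogue you can try in the browser (a sketch using JSON in place of Lisp's reader and printer):<p><pre><code><script>
// Plain data is print-read consistent: read(print(x)) equals x.
const x = { a: 1, xs: [1, 2, 3] };
console.assert(JSON.stringify(JSON.parse(JSON.stringify(x))) === JSON.stringify(x));
// Values with no readable notation play the role of #<...>:
// JSON.stringify silently drops a function-valued property.
console.log(JSON.stringify({ f: () => 1 })); // "{}"
</script>
</code></pre>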
> Why write Fixed-Point HTML?<p>> simply the satisfaction of knowing that you and the browser are in total agreement<p>So, just to clarify: there's no technical benefit, correct?
> the real reason to code in Fixed-Point HTML is simply the satisfaction of knowing that you and the browser are in total agreement about the HTML.<p>Interesting idea. I've been trying to achieve something similar but in reverse... rather than make my source match the browser, make the browser match my source by making it <i>not</i> ignore spacing.<p>i.e. the basics are `white-space: pre;` on the body element, plus fixed-width fonts at a fixed size. But I still want an HTML document so I can opt in to HTML where it matters. My reasons are: A) to avoid a pre-processor and build-toolchain complexity and stick to nice simple static files; B) I get something similar to WYSIWYG, but as source code; C) I like fixed-width fonts and plain-text formatting (reducing decisions is helpful for focus).
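<p>A minimal sketch of that setup (the CSS properties are standard; everything else here is illustrative):<p><pre><code><!DOCTYPE html>
<html>
<head>
<style>
  /* Preserve the source's spacing and line breaks so the source
     doubles as the layout; a fixed-width font keeps columns aligned. */
  body { white-space: pre; font-family: monospace; font-size: 16px; }
</style>
</head>
<body>Plain text laid out
        exactly as written,
with inline HTML <b>opted in</b> where it matters.</body>
</html>
</code></pre>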
Before now I've explicitly reduced the size of my HTML docs (nothing critical/production facing, all passion projects) by removing certain HTML tags (e.g DOCTYPE, closing tags, etc) because I know modern browsers will still render them correctly.<p>This means there are miniscule savings from a bandwidth serving perspective. I wonder what the trade off is between the HTTP call and document parse/paint.<p>E.g is it correct to assume the browser will parse/paint the HTML content - fixing incorrectly closed tags on the fly faster than the few milliseconds more it would take to serve fixed-point HTML from the server?
XML-flavored self-closing elements are banished (use <br> instead of <br />)<p>God I hate that. It just doesn’t make sense. Where is the <br> closed?