TechEcho
XML is almost always misused

201 points by mrzool over 5 years ago

31 comments

alkonaut over 5 years ago
I'm going to argue the exact opposite of the article: XML and JSON are both structured data formats useful for tree-like data graphs, such as objects.

Whether that was the intended purpose when XML was designed is irrelevant. It's what XML is used for in almost every case.

The author also doesn't suggest what should be used instead to encode structured data, or perhaps more importantly what *should* have been used to encode graph-like things such as maps/lists/objects in the 2000s. JSON really hasn't been an alternative until quite recently (10 years ago?).

In fact, reading the article carefully, I fail to see the author argue *why* XML shouldn't be used as a data format either.
kbenson over 5 years ago
So, what *is* XML good for? If it's not good for data as everyone says (and I'm not inclined to argue), but it is good for documents, what kind of documents are we referring to? Defined metadata on a text document? A template used with data to generate something else? Is a configuration file a document or data? Where would I want to use XML that something like JSON, a text document, or some combination thereof wouldn't be better?

I'm not being facetious, this is an honest question. Where are the "right" places to use XML?
mojuba over 5 years ago
That's a very useful and sobering view actually. Unfortunately for the XML format, it wasn't designed to prevent its own abuse. But anyway, *XML is for documents* sounds like a good and acceptable paradigm.

That being said, one subtle and important (and often overlooked) difference between XML and, say, JSON is that you can process XML at the application level while it is still streaming in, whereas JSON cannot be handled that way by the application, because its keys may arrive in arbitrary order. (Of course lower-level parsers use streaming anyway, but that's not the point.)

In fact you not only can but should parse XML while streaming it. This is another common abuse: wherever you look you see some high-level function that loads an entire parsed XML structure into memory at once. But once you start asking yourself where the file may be coming from, you realize that your system may be open to denial-of-service attacks. E.g. is your system ready to receive a 16GB XML file?
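A minimal sketch of the streaming approach this comment describes, using `iterparse` from Python's standard `xml.etree.ElementTree`. The `<feed>`/`<record>` element names and the `count_records` helper are assumptions made up for illustration; they do not come from the thread.

```python
import io
import xml.etree.ElementTree as ET


def count_records(stream, tag="record"):
    """Count matching elements while the input is still being read."""
    count = 0
    # iterparse yields each element as soon as its closing tag has been read.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Release the element's text and children once it has been handled.
            elem.clear()
    return count


# Hypothetical sample input; a real feed would be a file or socket stream.
sample = io.BytesIO(b"<feed><record>a</record><record>b</record></feed>")
print(count_records(sample))
```

Note that `elem.clear()` only empties each element; the emptied elements still hang off the root, so a version meant for truly huge inputs would probably also prune them and enforce a size limit on the stream.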
unilynx over 5 years ago
XML is actually a wonderful format for data and especially extensible configuration if you combine it with XML Schema and CSS selectors...

- XML schemas give you a ready-to-use format to describe, restrict and document available configuration settings. The unique keys help, and libxml2 gives ready-to-use validation, even if you may need to 'translate' its error messages before showing them to end users
- XML schemas also support other annotations, so you can further generalise your configuration readers by recording the necessary bindings in the XML schema itself, allowing you to use it, e.g., to define application user interfaces.
- Almost any text editor can do basic syntax validation, preventing most typographic errors, and even better if it can read the schema
- XML schemas are extensible using <import>s, but namespaces still enforce some separation. You can define explicit points where plugins extend your configuration format using <any>
- Human editable: closing tags are noisy but more readable than }],{}] when non-programmers may have to edit these files just to add a few extra text fields to a UI.
- Better datatype support, e.g. datetimes, by using XML schemas. JSON's type support is too limited
- Support for comments!
- And once you've verified the schema... CSS selectors and DOM APIs to actually process the XML documents.

YAML fixed quite a few things, but still has no datetimes or, as far as I know, standardised approaches to defining schemas. And I've lost count of how many attempts exist to add schema information or namespacing to JSON...

But for markup... we may be better off just using Markdown inside CDATA blocks
zippergz over 5 years ago
If a piece of technology is "almost always" misused, is that the fault of the users, or the technology?
CobrastanJorji over 5 years ago
> Here are some very frequently occurring examples of bad schema design:

(4 lines, 75 characters)

> Here's the right way:

(10 lines, 133 characters)

I have a suspicion as to what went wrong.
theamk over 5 years ago
One thing this misses in the "dictionary" example is that tools (like XPath) push you towards "key in attribute" selection. One of the most common operations we do with dictionaries is lookup by known key, and storing the key in attributes makes it much easier.
altmind over 5 years ago
Can I have my gripe with Apple plists?

    <key>CFBundleDisplayName</key>
    <string>TextEdit</string>
    <key>NSHumanReadableCopyright</key>
    <string>Copyright 2019</string>

This may be perfectly parsable by a SAX parser storing some state, but it's totally not processable by XSLT.
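To illustrate why this layout needs positional state, here is a hedged sketch that pairs each `<key>` with the sibling that follows it, using Python's standard `xml.etree.ElementTree`. The `plist_dict_to_python` helper is hypothetical, and it only handles plain string values; for real property lists, the standard-library `plistlib` module is the more sensible route.

```python
import xml.etree.ElementTree as ET

# The fragment from the comment above, wrapped in the <dict> element plists use.
PLIST_DICT = """<dict>
  <key>CFBundleDisplayName</key>
  <string>TextEdit</string>
  <key>NSHumanReadableCopyright</key>
  <string>Copyright 2019</string>
</dict>"""


def plist_dict_to_python(dict_elem):
    """Pair each <key> with the sibling element that follows it."""
    children = list(dict_elem)
    result = {}
    # Children alternate key, value, key, value, so walk them two at a time.
    for key_elem, value_elem in zip(children[0::2], children[1::2]):
        if key_elem.tag != "key":
            raise ValueError(f"expected <key>, got <{key_elem.tag}>")
        result[key_elem.text] = value_elem.text
    return result


settings = plist_dict_to_python(ET.fromstring(PLIST_DICT))
print(settings["CFBundleDisplayName"])
```

The association between a key and its value exists only through document order, which is the property the comment is complaining about.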
robofanatic over 5 years ago
I guess the correct answer depends upon the requirement.

I like this way:

    <root>
      <item key="name">John</item>
      <item key="city">London</item>
    </root>

So I can use this XPath to get the person's name:

    //root/item[@key="name"]/text()

Not sure what the XPath to get the name would be if the XML was:

    <root>
      <item>
        <key>Name</key>
        <value>John</value>
      </item>
      <item>
        <key>City</key>
        <value>London</value>
      </item>
    </root>

This is a better example:

    <employees>
      <employee id="1">
        <field name="name">John</field>
        <field name="city">London</field>
      </employee>
      <employee id="2">
        <field name="name">Jack</field>
        <field name="city">Boston</field>
      </employee>
    </employees>
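For the key-as-child-element layout in this comment, the lookup can filter on the text of the `<key>` child rather than on an attribute; in XPath 1.0 that would be `//root/item[key="Name"]/value/text()`. The sketch below tries both layouts with the limited XPath subset in Python's standard `xml.etree.ElementTree`; the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATTR_STYLE = '<root><item key="name">John</item><item key="city">London</item></root>'
CHILD_STYLE = (
    "<root>"
    "<item><key>Name</key><value>John</value></item>"
    "<item><key>City</key><value>London</value></item>"
    "</root>"
)

# Key stored in an attribute: filter on @key.
attr_root = ET.fromstring(ATTR_STYLE)
name_from_attr = attr_root.find("./item[@key='name']").text

# Key stored in a child element: filter on the text of the <key> child,
# then step down to the <value> child of the same <item>.
child_root = ET.fromstring(CHILD_STYLE)
name_from_child = child_root.find("./item[key='Name']/value").text

print(name_from_attr, name_from_child)
```

Both lookups are one expression each, so the choice between the two layouts seems to be more about readability and schema design than about what XPath can express.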
mickduprez over 5 years ago
XML is/can be much more than a markup language, and yes, it can be used very badly, but this is usually by inexperienced 'data wranglers' who don't understand the difference between data attributes and data proper.

While XML can seem cumbersome (compared to JSON, say), it is a very good 'data transport' tool when used correctly with a sensible schema (XSD).

For example, we use XML as a 'vendor neutral' data format to export/import CAD geometry and associated data for town utilities such as buildings, pipes, roads etc. All this data has to be validated against the schema to ensure its correctness. Using a schema like this enables the city council to import this XML into the GIS system to be used for asset management, financial planning etc.

A good schema can be key to sharing XML effectively between departments/applications, and since XML is a markup language, this data can also be viewed independently using XSLT.
just_myles over 5 years ago
I agree with this portion 100%:

The correct way to express a dictionary in XML is something like this:

    <root>
      <item>
        <key>Name</key>
        <value>John</value>
      </item>
      <item>
        <key>City</key>
        <value>London</value>
      </item>
    </root>

In the past I used to create scripts that exported XML from relational data but didn't really understand the right way to build and structure them.
sosuke over 5 years ago
The larger your XML file is, the more accurately you're using it. Less " and more <>. I made these mistakes, using XML like I was writing an HTML doc.
tehjoker over 5 years ago
It is difficult for me to see what the real issue is with the examples given. It seems to be more an aesthetic preference of the author than a technical argument. People can use formats for whatever they want. :P

If you told me that the transmission and parsing rate is too slow for their application, that's a real dig at it.
LameRubberDucky over 5 years ago
For those wondering what you do with XML as a document markup language, see the XML document that is the specification for XML. I had to look at the page source to determine it really is an XML document. Looks like an HTML document.

https://www.w3.org/TR/xml/REC-xml-20081126.xml
deanCommie over 5 years ago
I think that Software Engineers should take influence from Authors (after all, are we not all craftsmen/artisans?) and incorporate the philosophies from https://en.wikipedia.org/wiki/The_Death_of_the_Author

The idea, for those not familiar, is that once a work of art is published (a novel, a poem, a song, a painting), it speaks for itself, and authorial intent no longer matters.

That is, meaning and purpose are in the eye of the beholder/consumer. And there is no right or wrong way to "interpret" art. If someone finds meaning that the author did not intend, it is just as valid as a deeply hidden allegory the author intentionally placed in the work while writing it.

The relevance to software is that it applies to APIs, specifications, standards and formats.

There is no such thing as users using your software or specification "wrong": if they insist on doing so, the meaning has evolved. Evolve with it or die.
tannhaeuser over 5 years ago
> *In 1996, XML was invented.*

XML wasn't an original invention; it is specified as a proper SGML subset. From the XML spec:

> *The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document.*

Now I totally agree that SGML and XML aren't for service payloads and config files. The sole purpose of markup languages is representing *structured text*. And arguably, SGML fills this role much more adequately than XML today, as it can represent (via the SHORTREF mechanism) custom Wiki syntaxes such as Markdown and others, and in contrast to XML, can deal with the largest corpus of markup out there, e.g. it can parse HTML with all its minimization features such as omitted tags, enumerated and unquoted attributes, etc. See [1] for a practical introduction (disclaimer: link to a tutorial I held last month at ACM DocEng).

[1]: http://sgmljs.net/docs/sgml-html-tutorial.html
foolfoolz over 5 years ago
Turns out a well-specified format that has a lot of parsers available is useful for more than just a markup language. XML is great at data formatting: a little more verbose than the alternatives, but also a lot more feature-rich.
h2odragon over 5 years ago
I present, in the spirit of 'worst XML ever', the docs for ScriptXML:

https://www.egosoft.com:8444/confluence/display/XRWIKI/Mission+Director+Guide

This is the language used in the game X4 Foundations (and others in the series). An example of its use (mine, I claim no grace in it):

https://github.com/h2odragon/dragoncommands/blob/master/aiscripts/deployglobe.xml

... XML is not a great format for an extension language, I have to say.
bullen over 5 years ago
I would love to know what people think of my XML node-graph/tree editor I made before JSON became mainstream (my excuse): http://rupy.se/logic.jar

It basically names the tag what you name the node. :S

- You link/unlink nodes (I called them entities! Xo) by right-click-dragging between them.
- You copy stuff by right-click-dragging to an empty space.
- You delete by grabbing something by left-click-holding and pressing the delete key.
- Oh, and nodes are completely tree-structure expandable: just drag-drop attributes on nodes and nodes inside nodes.

The editor uses lightweight rendering so you can have a ton of elements with good performance.

(I know, not super intuitive; but very handy once you know about these.)
commandlinefan over 5 years ago
The first rule of XML is: whatever you're doing with it, that's not what it was for.
hashberry over 5 years ago
> a simple test for determining if an XML schema is well designed: remove all tags and attributes from it ... If what you have left over does not make sense ... you shouldn't be using XML at all.

Magento 2 (acquired by Adobe for $1.68bn) uses XML to render its layouts. Here's some fun XML for the checkout page:

https://github.com/magento/magento2/blob/2.3-develop/app/code/Magento/Checkout/view/frontend/layout/checkout_index_index.xml
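One way to try the quoted test mechanically is to drop all markup and look at what text survives. This is only a sketch with Python's standard `xml.etree.ElementTree`; the `text_without_markup` helper and both sample inputs are hypothetical, and deciding whether the leftover text "makes sense" remains a human judgment.

```python
import xml.etree.ElementTree as ET


def text_without_markup(xml_source):
    """Return only the character data, with whitespace runs collapsed."""
    root = ET.fromstring(xml_source)
    return " ".join("".join(root.itertext()).split())


# Hypothetical examples: one document-like, one data-like.
document_like = "<p>XML is for <em>documents</em>, the article argues.</p>"
data_like = '<item key="name" value="John"/>'

print(repr(text_without_markup(document_like)))
print(repr(text_without_markup(data_like)))
```

The first input keeps a readable sentence, while the second leaves an empty string because all of its information lives in attributes.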
Quarrelsome over 5 years ago
I once had to write a data layer in XML, edited in situ with a lifespan of up to hours, as more data was appended to it. It was an invalid XML document that you couldn't load in many XML APIs for 99.9% of its lifespan. I begged and pleaded with the lead architect to use an SQLite DB for the elements of the data until the transaction was complete and then merely produce the XML file at the end, but no.

It had to survive power outages too.
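A rough sketch of the alternative the commenter asked for: append records to SQLite as they arrive and build the XML once at the end. It uses Python's standard `sqlite3` and `xml.etree.ElementTree`; the `readings` table, its columns, and the function names are all invented for illustration and say nothing about the original system.

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_store(path=":memory:"):
    """Open the working store; a file path instead of :memory: would persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(id INTEGER PRIMARY KEY, name TEXT, value TEXT)"
    )
    return conn


def append_reading(conn, name, value):
    """Append one record inside its own transaction."""
    with conn:  # commits on success, rolls back if the insert raises
        conn.execute(
            "INSERT INTO readings (name, value) VALUES (?, ?)", (name, value)
        )


def export_xml(conn):
    """Build the complete XML document once, after all data has arrived."""
    root = ET.Element("readings")
    rows = conn.execute("SELECT name, value FROM readings ORDER BY id")
    for name, value in rows:
        ET.SubElement(root, "reading", name=name).text = value
    return ET.tostring(root, encoding="unicode")


conn = open_store()
append_reading(conn, "temperature", "21.5")
append_reading(conn, "pressure", "1013")
print(export_xml(conn))
```

With this split, the XML file only ever exists in a complete, well-formed state, and the intermediate state lives in a store designed for incremental writes.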
benibela over 5 years ago
The worst XML use I have ever seen is the lists generated by Lazarus. Every list. For example, in the project files you have:

    <RequiredPackages Count="5">
      <Item1>
        <PackageName Value="LazUtils"/>
      </Item1>
      <Item2>
        <PackageName Value="treelistviewpackage"/>
      </Item2>
      <Item3>
        <PackageName Value="internettools"/>
      </Item3>
      <Item4>
        <PackageName Value="LCLBase"/>
        <MinVersion Major="1" Release="1" Valid="True"/>
      </Item4>
      <Item5>
        <PackageName Value="LCL"/>
      </Item5>
    </RequiredPackages>
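To show what this layout costs a reader of the file, here is a small sketch that walks `Item1` through `ItemN` by rebuilding each element name from the `Count` attribute. It uses Python's standard `xml.etree.ElementTree`; `read_numbered_items` is a hypothetical helper and the fragment is a shortened copy of the one above, not how Lazarus itself reads its project files.

```python
import xml.etree.ElementTree as ET

# Shortened version of the project-file fragment quoted above.
LPI_FRAGMENT = """<RequiredPackages Count="2">
  <Item1><PackageName Value="LazUtils"/></Item1>
  <Item2><PackageName Value="LCL"/></Item2>
</RequiredPackages>"""


def read_numbered_items(parent):
    """Collect Item1..ItemN, where N comes from the Count attribute."""
    count = int(parent.get("Count", "0"))
    names = []
    for index in range(1, count + 1):
        # The element name changes per entry, so it has to be rebuilt each time.
        item = parent.find(f"Item{index}")
        if item is None:
            raise ValueError(f"missing <Item{index}>")
        names.append(item.find("PackageName").get("Value"))
    return names


print(read_numbered_items(ET.fromstring(LPI_FRAGMENT)))
```

With a repeated `<Item>` element, the same loop would be a single `findall("Item")` and the `Count` attribute would be redundant.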
mpweiher over 5 years ago
Hmm... I somewhat disagree with the "correct" way to express a dictionary. I prefer:

    <root>
      <Name>John</Name>
      <City>London</City>
    </root>

Removes one level of indirection; XML already has keys.
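A short sketch of what this layout looks like from code, using Python's standard `xml.etree.ElementTree`. The `to_dict` and `from_dict` helpers are made up for illustration and assume flat string values with unique keys.

```python
import xml.etree.ElementTree as ET


def to_dict(xml_source):
    """Treat each child element's tag as the key and its text as the value."""
    return {child.tag: child.text for child in ET.fromstring(xml_source)}


def from_dict(mapping):
    """Inverse direction; keys must already be valid XML element names."""
    root = ET.Element("root")
    for key, value in mapping.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


person = to_dict("<root><Name>John</Name><City>London</City></root>")
print(person)
print(from_dict(person))
```

The trade-off is that keys are limited to what XML allows as element names, so a key containing a space or starting with a digit cannot be represented this way, whereas the `<key>`/`<value>` layout accepts arbitrary strings.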
scarygliders over 5 years ago
I have always thought that using XML as a format for storing and retrieving configuration files was complete insanity.

Which is why I still use the simple, effective INI format for configuration files for applications I write.

XML for config files is madness personified.
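For comparison, a minimal sketch of reading an INI-style configuration with Python's standard `configparser`. The section and option names are invented for illustration and are not from the comment.

```python
import configparser

# Hypothetical settings; section and option names are made up for illustration.
INI_TEXT = """
[server]
host = localhost
port = 8080
debug = yes
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

# Values are stored as strings; typed getters convert them on request.
host = config["server"]["host"]
port = config.getint("server", "port")
debug = config.getboolean("server", "debug")

print(host, port, debug)
```

The format stays flat, with sections and key/value pairs only, which is where its simplicity comes from and also why nested or repeated structures tend to push people towards other formats.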
billsix over 5 years ago
RIP Erik Naggum

http://www.schnada.de/grapt/eriknaggum-xmlrant.html
davidw over 5 years ago
I love the quote I originally saw on the Nokogiri (Ruby XML lib) site:

"XML is like violence – if it doesn’t solve your problems, you are not using enough of it."
miggol over 5 years ago
The worst example of this that I deal with regularly has to be the libvirt domain XML.

https://libvirt.org/formatdomain.html

It does occasionally put information outside of the tags, but because there's no logic to when, it's nearly worse.
lmilcin over 5 years ago
And JavaScript was never meant to be used to build applications...

There are many, many more inventions that are used for purposes other than the ones they were meant for.

The Internet was created so that the US could withstand a nuclear attack, and it was never meant to be primarily used to spread advertisements.

Get over it.
micimize over 5 years ago
I can't find much to corroborate this article's take. RDF is a stark counter-example: a standard from the W3C. It has endorsement from Tim Bray, one of XML's co-authors.
billpg over 5 years ago
XML with only tags and attributes (nothing but ignored spaces between '>' and '<') is a reasonable structured data format.
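The restriction described in this comment can be checked mechanically. Below is a hedged sketch with Python's standard `xml.etree.ElementTree`; `is_markup_only` and the sample inputs are hypothetical names and data chosen for illustration.

```python
import xml.etree.ElementTree as ET


def is_markup_only(xml_source):
    """True if no element carries text other than whitespace."""
    root = ET.fromstring(xml_source)
    for elem in root.iter():
        # .text is the content before the first child; .tail follows the element.
        if (elem.text or "").strip() or (elem.tail or "").strip():
            return False
    return True


print(is_markup_only('<cfg>\n  <user name="John" city="London"/>\n</cfg>'))
print(is_markup_only("<cfg><user>John</user></cfg>"))
```

The first input passes because the only character data is indentation; the second fails because `John` sits between the tags.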