Don't Invent XML Languages (2006)

48 points by MrVandemar about 1 year ago

13 comments

account-5 about 1 year ago

This is less about hating on XML and more about not reinventing the wheel. I quite like XML. Things like XPath make working with it, or getting data out of it, much easier than with JSON; though I love jq syntax and can't wait until it starts being incorporated into languages. I don't even mind XSLT, provided it's not being overused.
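
To make the XPath point concrete, here is a minimal sketch in Python. The sample documents and function names are invented for illustration, and `xml.etree.ElementTree` supports only a subset of XPath, so treat this as a rough comparison rather than a full XPath or jq demonstration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for illustration only.
XML_DOC = """
<library>
  <book year="2006"><title>Alpha</title></book>
  <book year="2024"><title>Beta</title></book>
</library>
"""

JSON_DOC = '{"library": [{"year": 2006, "title": "Alpha"}, {"year": 2024, "title": "Beta"}]}'


def titles_from_xml(doc: str, year: str) -> list[str]:
    """Select titles with one path expression (ElementTree's XPath subset)."""
    root = ET.fromstring(doc)
    return [t.text for t in root.findall(f"./book[@year='{year}']/title")]


def titles_from_json(doc: str, year: int) -> list[str]:
    """The same query over JSON, written as ordinary traversal code."""
    data = json.loads(doc)
    return [b["title"] for b in data["library"] if b["year"] == year]
```

With XML the query is a declarative path string; with plain JSON the equivalent is hand-written traversal unless a query tool such as jq is available.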

userbinator about 1 year ago

2006, well into the era of XML being the trendy fad that every piece of Serious Business software was supposed to use.

Now 18 years later, JSON seems to have displaced it.

Personally, I've never found text-based formats to be a good choice for data that humans will rarely need to read or write; I much prefer simple and efficient binary formats, which can be just as extensible without the additional inefficiency and needlessly-introduced-failures of string handling.

lpapez about 1 year ago

Thankfully we've moved on and no longer invent XML languages.

We use YAML now which is obviously much better.

simpaticoder about 1 year ago

If you squint at XML, JSON, or YAML you see a kind of lispy data-structure shape, an n-ary tree. The reader has a context stack that they are pushing and popping from as they read. The real problem is that every problem space is isomorphic to one that has successively tighter context. And a format that is applicable to every problem is one that is applicable to no problem. I believe that computer languages must get worse at some things to get better at others in a zero-sum way. Any attempt to avoid this trade-off leaves you with a very powerful mush.
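
The "context stack" reading can be sketched roughly as follows. This is only an illustration in Python: `walk` is a made-up helper, and it treats a parsed JSON value as an n-ary tree, pushing and popping (path, node) pairs explicitly instead of recursing.

```python
import json
from typing import Any, Iterator


def walk(value: Any) -> Iterator[tuple[tuple, Any]]:
    """Yield (context_path, leaf) pairs using an explicit context stack."""
    stack = [((), value)]
    while stack:
        path, node = stack.pop()
        if isinstance(node, dict):
            # Push children in reverse so they are popped in document order.
            for key in reversed(list(node)):
                stack.append((path + (key,), node[key]))
        elif isinstance(node, list):
            for i in reversed(range(len(node))):
                stack.append((path + (i,), node[i]))
        else:
            # A leaf: the path is the full context needed to interpret it.
            yield path, node
```

Each leaf only makes sense together with its path, which is the "successively tighter context" the comment describes; the same walk would apply to a parsed XML or YAML tree.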

435345345 about 1 year ago

I don't even know why we have any *ML. Everything they can do can be done better with Lisp syntax.

dkersten about 1 year ago

I still see people, in 2024, writing new software and using XML as the data format. I don't have an example offhand, but I recently saw a hobby game engine using XML to store its engine-specific game object/scene data.

Personally, I like to use TOML for anything that is likely to also be edited by humans and JSON or binary for something that will only ever be used by machines.

greenyoda about 1 year ago

Another standardized XML language that the article doesn't mention is RDF: https://en.wikipedia.org/wiki/Resource_Description_Framework

oliviergg about 1 year ago

I still remember working briefly in 2008 with a thing called Magic (now uniPaas, I think): XML all the way down. Even the code: each line was XML. You were supposed to code with a weird GUI. I still have nightmares.

joshlk about 1 year ago

Side question: when did XML start to lose favour to JSON? Did this happen because of something in particular or was it a gradual transition?

jiggawatts about 1 year ago

I love listening to young developers guess at the history of XML, and why it was "complex" (it wasn't), and then turn around and reinvent that wheel, with every bit of complexity that they just said they didn't like... because it's necessary.

So a bit of history from someone who was already developing for over a decade when XML was the new hotness:

The before times were bad. Really bad. Everybody and everything had their own text-based formats.[1] I don't just mean a few minor variants of INI files. I mean wildly different formats in different character encodings, which were literally never provided. Niceties like UTF-8 weren't even dreamt of yet.

Literally every application interpreted its config files differently, generated output logs differently, and spoke "text" over the network or the pipeline differently.

If you needed to read, write, send, or receive N different text formats, you needed at least N parsers and N serializers.

Those parsers and serializers didn't exist.

They just didn't. The formats were not formally specified, they were just "whatever some program does"... "on some machine". Yup. They output different text encodings on different machines. Or the same machine even! Seriously, if two users had different regional options, they might not be able to share files generated by the same application on the same box.

Basically, you either had a programming "library" available so that you could completely sidestep the issue and avoid the text, or you'd have to write your own parser, personally, by hand. I loooved the early versions of ANTLR because they made this at least tolerable. Either way, good luck handling all the corner-cases of escaping control characters inside a quoted string that also supports macro escapes, embedded sub-expressions, or whatever. Fun times.

Then XML came along.

It precisely specified the syntax, and there were off-the-shelf parsers and generators for it in multiple programming languages! You could generate an XML file on one platform and read it in a different language on another by including a standardised library that you could just download instead of typing in a parser by hand like an animal. It even specified the text encoding so you wouldn't have to guess.

It was glorious.

Microsoft especially embraced it and to this day you can see a lot of that history in Visual Studio project files, ASP.NET web config files, and the like.

The reasons JSON slowly overtook XML are manifold, but the key reason is simple: it was easier to parse JSON into JavaScript objects in the browser, and the browser was taking off as an application developer platform exponentially. JavaScript programmers outnumbered everyone else combined.

Notably, the early versions of JSON were typically read using just the "eval()" function.[2] It wasn't an encoding per se, but just a subset of JavaScript. Compared to having to have an XML parser in JavaScript, it was very lightweight. In fact, zero weight, because if JavaScript was available, then by definition, JSON was available.

The timeline is important here. An in-browser XML parser was available before JSON was a thing, but only for IE 5 on Windows. JSON was invented in 2001, and XMLHttpRequest became consistently available in other browsers after 2005 and was only a standard in 2006. Truly universal adoption took a few more years after that.

XML was only "complex" because it's not an object notation like JSON is. It's a document markup language, much like HTML. Both trace their roots back to SGML, which dates back to 1986. These types of languages were used in places like Boeing for record keeping, such as tracking complex structured and semi-structured information about aircraft parts over decades. That kind of problem has an essential complexity that can't be wished away.

JSON is simpler for data exchange because it maps nicely to how object-oriented languages store pure data, but it can't be readily used to represent human-readable documents the way XML can.

The other simplification was that JSON did away with schemas and the like, and was commonly used with dynamic languages. Developers got into the habit of reading JSON by shoving it into an object, and then interpreting it directly without any kind of parsing or decoding layer. This works kinda-sorta in languages like Python or JavaScript, but is horrific when used at scale.

I'm a developer used to simply clicking a button in Visual Studio to have it instantly bulk-generate entire API client libraries from a WSDL XML API schema, documentation and all. So when I hear REST people talk about how much simpler JSON is, I have no idea what they're talking about.

So now, slowly, the wheel is being reinvented to avoid the manual labour of REST and return to the machine automation we had with WS-*. There are JSON API schemas (multiple!), written in JSON (of course), so documentation can't be expressed in-line (because JSON is not a markup language). I'm seeing declarative languages like workflow engines and API management expressions written in JSON gibberish now, same as we did with XML twenty years ago.

Mark my words, it's just a matter of time until someone invents JSON namespaces...

[1] Most of the older Linux applications still do, which makes it ever so much fun to robustly modify config files programmatically.

[2] Sure, these days JSON is "parsed" even by browsers instead of sent to eval(), for security reasons, but that's not how things started out.
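
For readers wondering what a "decoding layer" for JSON might look like, here is a minimal sketch in Python. The `Part` record and its fields are hypothetical, invented for this example; real projects would more likely reach for a schema or validation library, which is not shown here.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """Hypothetical record type, invented for this sketch."""
    part_number: str
    quantity: int


def decode_part(text: str) -> Part:
    """Parse JSON text and validate it into a Part, failing early on bad input."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object")
    part_number = raw.get("part_number")
    quantity = raw.get("quantity")
    if not isinstance(part_number, str):
        raise ValueError("part_number must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        raise ValueError("quantity must be an integer")
    return Part(part_number=part_number, quantity=quantity)
```

The contrast is with passing the result of `json.loads` around as an untyped dict, where a missing or mistyped field only surfaces far from the place the data entered the program.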

maxrecursion about 1 year ago

I used OVAL (the "Open Vulnerability and Assessment Language"), written in XML, daily to automate STIGs. Finding documentation for it was awful, but once I knew the syntax, development was a breeze. Most chill job I ever had. A job like that is my retirement plan once I have enough money that salary no longer matters.

megaperplex about 1 year ago

The author makes an argument against designing new XML languages. I think his arguments are weak. This does not mean I think we should design more XML languages, but that the arguments this particular author brings against it are weak. That having been said, the mid section with the tooling suggestions by use case is neat.

One thing he condemns such endeavors for is that they are unpleasant and somehow "political". I can see what he means, but this has nothing to do with "overdoing the extensibility" of XML. As Aaron Swartz put it:

"Instead of the "let's just build something that works" attitude that made the Web (and the Internet) such a roaring success, they brought the formalizing mindset of mathematicians and the institutional structures of academics and defense contractors. They formed committees to form working groups to write drafts of ontologies that carefully listed (in 100-page Word documents) all possible things in the universe and the various properties they could have, and they spent hours in Talmudic debates over whether a washing machine was a kitchen appliance or a household cleaning device." [https://www.cs.rpi.edu/~hendler/ProgrammableWebSwartz2009.html]

It is true that similar endeavors are prone to looking for an Absolute Cosmic Eternal Perfect Ontological Structure (credit: Lion Kimbro). If you drop that idea in any office, you will get as many proposals for entities as there are anuses, as if anyone is entitled to an ontology.

Don't get me wrong, anyone might be entitled to submit an entity or criticize a hierarchy, but I think this is meaningful mostly in the context of targeted audience research and agile development practices. All in all, I think that the problem here is not with the 'X' in XML, but with poor organization-level practices.

Furthermore, I did follow the link and surveyed the XML languages. I did not see the apparently self-evident truth the writer sees in there. Sure, there are many of them, but how is this even an argument? Some of the listed languages seem quite cool to me, especially the science ones. And the next person might dig the legal ones. If the argument here is that "there are so many of these languages, they just can't all be important" (or "real"), it does not sit well with me. There are tons of different programming languages, web frameworks, Linux distributions, not to mention the incomprehensible multitude in other domains, such as car maker models or, well, birds.

It is just simplistic to disparage any number of things because they are too many to readily make sense of, and this is a cognitive stance I can't endorse. Look at Medical Subject Headings, or the Dewey Decimal or the Library of Congress cataloging systems. There is just a ton of things out there, and for each one of those, there is a person who has more expertise in it than you. These taxonomies might be important to them, what are you gonna do? Stop them?

A bird's-eye-view exasperation at the sheer number of things is the hallmark of a small-town mentality that is untenable for the hacktivist mindset. The response here is, I guess, reusability of existing standards, and agile practices involving the user in the development process. But the author did not bring up any of these.

calvinmorrison about 1 year ago

I hate xml.