Coordinated disclosure of XML roundtrip vulnerabilities in Go’s standard library

223 points by jupenur over 4 years ago

11 comments

tedunangst over 4 years ago

I've never liked (nor understood the popularity of) signature schemes that require parsing before verification. This has also led to problems with X.509. And DKIM. And plists. And package managers. And more.

It's much simpler to sign the entire message, unparsed, and it's immune to these issues.

We went through a decade of debate before deciding that "encrypt then MAC" is the only right way to do things. That knowledge hasn't trickled down to other domains.
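
As a rough illustration of signing the whole message unparsed, here is a minimal Go sketch. It assumes a detached Ed25519 signature delivered alongside the raw bytes; the `Assertion` type and the `signedExample` and `verifyThenParse` helpers are hypothetical names invented for this example, not part of SAML or of any library discussed in the thread.

```go
package main

import (
	"crypto/ed25519"
	"encoding/xml"
	"errors"
	"fmt"
)

// Assertion is a hypothetical payload type used only for this illustration.
type Assertion struct {
	User string `xml:"user"`
}

// verifyThenParse checks a detached signature over the raw bytes and only
// then hands those same bytes to the XML parser.
func verifyThenParse(pub ed25519.PublicKey, raw, sig []byte) (Assertion, error) {
	var a Assertion
	if !ed25519.Verify(pub, raw, sig) {
		return a, errors.New("signature does not match raw message")
	}
	err := xml.Unmarshal(raw, &a)
	return a, err
}

// signedExample builds a throwaway key pair and a signed message for the demo.
func signedExample() (pub ed25519.PublicKey, raw, sig []byte) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		panic(err)
	}
	raw = []byte(`<assertion><user>alice</user></assertion>`)
	return pub, raw, ed25519.Sign(priv, raw)
}

func main() {
	pub, raw, sig := signedExample()

	a, err := verifyThenParse(pub, raw, sig)
	fmt.Println(a.User, err)

	// Changed bytes are rejected before the parser ever runs.
	tampered := []byte(`<assertion><user>mallory</user></assertion>`)
	_, err = verifyThenParse(pub, tampered, sig)
	fmt.Println(err)
}
```

Because the signature covers bytes rather than a parsed tree, no parser behaviour (namespaces, whitespace, re-serialization) can influence what counts as "signed".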

russell_h over 4 years ago

I'm the maintainer of one of the affected SAML libraries.

People need to stop using SAML. This needs to be a priority. A little background, for those who haven't had the displeasure of working with it:

When a user wants to log into an application (the "Service Provider") and is required to SSO against an "Identity Provider", the Identity Provider basically generates an XML document with information about the user, then signs that document using a thing known as an XML Digital Signature, or XMLDSIG.

When you think of "signing" a document, normally you would serialize that document out to bytes, apply your signature scheme over the bytes, then send along both the bytes and the signature. But for reasons which are irrelevant to modern implementations, XMLDSIG prefers to stuff the signature metadata back inside the XML document that was just signed. Obviously this invalidates the signature, so you also inject some metadata instructing receivers on how to put the document back how it was. There are several algorithms available for this. Then you ship around that XML document.

This basically means that when the Service Provider receives one of these documents it needs to:

1. Parse the XML document (which cannot yet be trusted)
2. Find the signature inside the document
3. Find the metadata about what algorithm(s) to use to restore the document
4. Run the document through whatever transforms are described in that metadata (keep in mind that up to this point the document might well have been supplied by an attacker)
5. Serialize the transformed document back out to bytes, being careful not to touch any whitespace, etc.
6. Verify the signature over the re-serialized document

If all of this succeeds and was implemented perfectly, you can trust the output of step 5. Ideally you should re-parse it. A common failure mode is trusting the original input instead, so be careful about that.

Obviously this is a crazy approach to one of the most security-critical parts of an application on the internet, and it breaks all the time.

Unfortunately people persist in using this fundamentally broken protocol, so a huge thank you to the team at Mattermost for their research in this area.
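
One way to guard against the "trusting the original input" failure mode is to make the verifier return only the bytes that were actually signed, and have everything downstream accept nothing else. The Go sketch below is an illustration under heavy assumptions: `splitFunc` is a hypothetical placeholder for steps 1 through 5, `toySplit` is a deliberately trivial stand-in that is NOT XMLDSIG, and Ed25519 is used only because it is in the standard library.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"encoding/xml"
	"errors"
	"fmt"
)

// splitFunc is a hypothetical stand-in for steps 1-5: given untrusted input,
// it returns the exact bytes the signature covers plus the signature itself.
type splitFunc func(untrusted []byte) (signed, sig []byte, err error)

// verifiedBytes is only ever constructed by verify. In a real program it
// would live in its own package so callers cannot build one by hand.
type verifiedBytes struct{ b []byte }

// verify performs step 6 and hands back the signed bytes, not the input.
func verify(pub ed25519.PublicKey, untrusted []byte, split splitFunc) (verifiedBytes, error) {
	signed, sig, err := split(untrusted)
	if err != nil {
		return verifiedBytes{}, err
	}
	if !ed25519.Verify(pub, signed, sig) {
		return verifiedBytes{}, errors.New("signature check failed")
	}
	return verifiedBytes{b: signed}, nil
}

// userFrom re-parses the verified bytes; it never sees the untrusted input.
func userFrom(v verifiedBytes) (string, error) {
	var doc struct {
		User string `xml:"user"`
	}
	err := xml.Unmarshal(v.b, &doc)
	return doc.User, err
}

// toySplit treats everything before the first newline as the signed document
// and the rest as the raw signature. It is a placeholder, not XMLDSIG.
func toySplit(in []byte) ([]byte, []byte, error) {
	doc, sig, ok := bytes.Cut(in, []byte("\n"))
	if !ok {
		return nil, nil, errors.New("no signature found")
	}
	return doc, sig, nil
}

// toyEnvelope builds a signed demo input in the format toySplit expects.
func toyEnvelope() (ed25519.PublicKey, []byte) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		panic(err)
	}
	doc := []byte(`<doc><user>alice</user></doc>`)
	env := append(append([]byte{}, doc...), '\n')
	return pub, append(env, ed25519.Sign(priv, doc)...)
}

func main() {
	pub, env := toyEnvelope()
	v, err := verify(pub, env, toySplit)
	if err != nil {
		panic(err)
	}
	fmt.Println(userFrom(v))
}
```

The design choice is that `userFrom` takes a `verifiedBytes` rather than a `[]byte`, so passing the original input by mistake becomes a compile-time error instead of a silent bug.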

tannhaeuser over 4 years ago

XML namespaces were controversial when introduced, and their implementation as privileged "xmlns:..." attributes with complex scoping, layering, and defaulting rules has been criticized many times; see [1] for a reflection from 2010 by an insider admitting to the fact that "every step on the process that led to the current situation with XML Namespaces seems reasonable".

When, in 1996-98, the W3C's SGML Editorial Review Board subset XML from SGML to define a generic markup convention for use with the expected wealth of upcoming vocabularies on the web, the issue of name collisions between elements (and attributes) from different vocabularies was deemed significant. Of course, in hindsight, with only SVG and MathML (and rarely HTML 5 in XHTML serialization) left on the web and having been incorporated as foreign elements directly into HTML, this seems overkill (even though there are actually collisions between, e.g., the title element in SVG vs. HTML).

There's an alternative (and saner, IMHO) approach for dealing with XML namespaces in ISO/IEC 19757-9 [2]: a parser API just presents a canonical (i.e. always the same) namespace prefix as part of an element name to an app, guided by processing instructions for binding canonical namespace prefixes to namespace URLs, which might also help enterprise-y XML with lots of XML Schema use. Of course, this doesn't help with roundtripping xmlns bindings (e.g. with their exact ordering, possible redundancy, temporary/insignificant namespace prefixes, re-binding in document fragments, etc.) through DOM representations, which seems to be the problem here.

[1]: https://blog.jclark.com/2010/01/xml-namespaces.html
[2]: https://www.iso.org/obp/ui/#iso:std:iso-iec:19757:-9:ed-1:v1:en
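
The canonical-prefix idea can be approximated in Go as a rough sketch. Note the assumptions: the `canonical` table is a hard-coded map invented for this example, whereas the approach described above binds prefixes through processing instructions; `canonicalNames` is likewise a hypothetical helper. The sketch relies on `encoding/xml`'s `Decoder.Token` resolving each element's prefix to its namespace URL.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// canonical maps namespace URLs to the one prefix the application wants to
// see, whatever prefix the document itself happened to use (assumed table).
var canonical = map[string]string{
	"http://www.w3.org/1999/xhtml": "xhtml",
	"http://www.w3.org/2000/svg":   "svg",
}

// canonicalNames returns "prefix:local" for every start tag in doc, using
// only the canonical prefixes from the table above.
func canonicalNames(doc string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var names []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok {
			continue
		}
		prefix, known := canonical[start.Name.Space]
		if !known {
			return nil, fmt.Errorf("no canonical prefix for namespace %q", start.Name.Space)
		}
		names = append(names, prefix+":"+start.Name.Local)
	}
}

func main() {
	// The document uses "h", "s", and a default namespace; the application
	// still sees only "xhtml:" and "svg:".
	doc := `<h:html xmlns:h="http://www.w3.org/1999/xhtml"><h:body>` +
		`<s:svg xmlns:s="http://www.w3.org/2000/svg"/>` +
		`<svg xmlns="http://www.w3.org/2000/svg"/>` +
		`</h:body></h:html>`
	names, err := canonicalNames(doc)
	fmt.Println(names, err)
}
```

This covers only the reading side; as the comment notes, it does nothing to preserve the original xmlns bindings on the way back out.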

warp over 4 years ago

> Despite significant efforts by the Go security team, it has not been possible to patch the vulnerabilities discussed in this blog post.

Well, that is not something you want to see in a public disclosure.

jarym over 4 years ago

Glad this got found. I remember that when XML was being widely adopted, vulnerabilities were frequently found in Java-based parsers.

A large part of this stems from how complicated XML can get. If it were only elements and attributes it might have been fine. Namespaces made it a bit more complicated. Processing instructions made it hideous.

chrsig over 4 years ago

It's worth noting that the Go 1 compatibility promise [1] allows for breaking compatibility in the name of security:

> Security. A security issue in the specification or implementation may come to light whose resolution requires breaking compatibility. We reserve the right to address such security issues.

Whether the Go team decides that this issue is worth a breaking change is another question entirely.

[1] https://golang.org/doc/go1compat

jerf over 4 years ago

Anyone have examples of XML that can be mutated? My guess is that it wouldn't take much.

I expect that a similar problem would be found in many other libraries if the XML were publicized. XML namespaces made a critical... "mistake" is probably too strong, but "design choice that deviated too far from people's mental model" is about right... that has prevented them from being anywhere near as useful or safe as they could be. In an XML document using XML namespaces, "ns1:tagname" may not equal "ns1:tagname", and "ns1:tagname" can be equal to "ns2:tagname". This breaks people's mental models of how XML works, and correspondingly, breaks people's code that manipulates XML.

(I actually used the Go XML library as an SVG validator in the ~1.8 timeframe and had to fork it to fix namespaces well enough to serve in that role. I didn't know how to exploit it in a specific XML protocol, but I've known about the issues for a while. "Why didn't you upstream it then?" Well, as this security bulletin implies, the data structures in encoding/xml are fundamentally wrong for namespaced XML to be round-tripped and there is no backwards-compatible solution to the problem, so it was obvious to me without even trying that it would be rejected. This has also been discussed on a number of tickets over the years, so the fact that XML namespace handling is weak in the standard library is not news to the Go developers. Note also that it's "round-tripping" that is the problem; if you parse and consume you can write correct code, it's the sending it back out that can be problematic.)

Namespaces fundamentally rewrite the nature of XML tag and attribute names. No longer are they just strings; now they are tuples of the form (namespace URL, tag name)... and the namespace URL is NOT the prefix that shows up before the colon! The prefix is an abbreviation for a namespace declared on that tag or an enclosing one. So in the XML

<pre><code>
<tag xmlns="https://sample.com/1" xmlns:example1="https://blah.org/1">
  <example1:tag xmlns:example2="https://blah.org/2">
    <example2:tag xmlns:example1="https://anewsite.com/xmlns">
      <example1:tag />
    </example2:tag>
  </example1:tag>
</tag>
</code></pre>

not a SINGLE ONE of those "tag"s is the same! They are, respectively, actually (https://sample.com/1, tag), (https://blah.org/1, tag), (https://blah.org/2, tag), and (https://anewsite.com/xmlns, tag). There's a ton of code, and indeed, even quite a few standards, that will get that wrong. (Note the redefinition of 'example1' in there; that is perfectly legal.) Even more excitingly,

<pre><code>
<tag xmlns="https://sample.com/1" xmlns:example1="https://sample.com/1">
  <example1:tag/>
  <example2:tag xmlns:example2="https://sample.com/1" />
</tag>
</code></pre>

ARE all the exact same tag and should be treated as such, despite the different "tag names" appearing.

Reserializing these can be exciting, because A: your XML library, in principle, ought to be presenting you the (XMLNS, tagname) tuple with the abbreviation stripped away, to discourage you from paying too much attention to the abbreviation, but B: humans in general and a lot of code expect the namespace abbreviations to stay the same in a round trip, and may even standardize on what the abbreviations should be. There's a LOT of code out there in the world looking for "'p' or 'xhtml:p'" as the tag name and not ("http://www.w3.org/1999/xhtml", "p").

In general, to maintain roundtrip equality, you have to either A: maintain a table of the abbreviations you see, where they were introduced, and also which one was used, or B: just use the (XMLNS, tagname) tuple and ensure while outputting that the relevant namespaces have always been declared. Generally I go for option B, as it's easier to get correct, and I pair it with a table of the most common namespaces for what I'm working in, so that, for example, XHTML gets a hard-coded "xhtml:" prefix. It is very easy, if you try to implement A, to screw it up in a way that can corrupt the namespaces on some input.

(Option B has its own pathologies. Consider:

<pre><code>
<tag xmlns:sample="https://example.com/1">
  <sample:tag1 />
  <sample:tag2 />
</tag>
</code></pre>

It's really easy to write code that will drop the xmlns declaration on "tag" itself, since the namespace isn't used there, and if your code throws away where the XMLNS was declared and just looks at whether the NS is currently declared, it'll emit a new declaration of the "sample" namespace on every usage among the children. Technically correct if the downstream code handles namespaces correctly (big if!), but visually unappealing.)

Not defending Go here, except inasmuch as it's such a common error to make that I have a hard time naming libraries and standards that get namespaces completely correct, for as simple as they are in principle. (I think SVG and XHTML have it right. XMPP is very, very close, but still has a few places where the "stream" tag is placed in different namespaces and you're just supposed to know to handle it the same in all the namespaces it appears in... which most people do only because it doesn't occur to them that technically these are separate tags, so it all kinda works out in the end.... libxml2 is correct, but I've seen a lot of things that build on top of it and they almost all screw up namespaces.)
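
The (namespace URL, tag name) tuples described above can be observed directly with Go's `encoding/xml`, whose `Decoder.Token` sets `Name.Space` to the resolved namespace URL rather than the prefix. The following is a small sketch; `expandedNames` and the two constants are names made up for this example, and it only demonstrates the reading side, not the round-trip problem itself.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// expandedNames decodes doc and returns the (namespace URL, local name) pair
// of every start tag, in document order.
func expandedNames(doc string) ([]xml.Name, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var names []xml.Name
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		if start, ok := tok.(xml.StartElement); ok {
			names = append(names, start.Name)
		}
	}
}

// nested is the first example: four "tag" elements in four namespaces.
const nested = `<tag xmlns="https://sample.com/1" xmlns:example1="https://blah.org/1">
  <example1:tag xmlns:example2="https://blah.org/2">
    <example2:tag xmlns:example1="https://anewsite.com/xmlns">
      <example1:tag />
    </example2:tag>
  </example1:tag>
</tag>`

// aliased is the second example: three spellings of one and the same name.
const aliased = `<tag xmlns="https://sample.com/1" xmlns:example1="https://sample.com/1">
  <example1:tag/>
  <example2:tag xmlns:example2="https://sample.com/1" />
</tag>`

func main() {
	for _, doc := range []string{nested, aliased} {
		names, err := expandedNames(doc)
		if err != nil {
			panic(err)
		}
		for _, n := range names {
			fmt.Printf("(%s, %s)\n", n.Space, n.Local)
		}
		fmt.Println("---")
	}
}
```

Comparing `xml.Name` values (for example with `==`) rather than prefixed strings is what makes the second document's three elements compare equal and the first document's four compare different.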

GauntletWizard over 4 years ago

I'd like to ask everyone here who's familiar with SAML to take a look at SPIFFE [1], which underlies Istio.

I'm biased in this regard, but I view SPIFFE's inclusion of JWTs as an authentication method as fundamentally flawed: by allowing bearer tokens, you are no longer verifying identity, but passing identity around. JWT has also been susceptible in the past [2] to the same kind of problem seen here: poorly defined verification semantics.

I suspect that buried in the semantics around SPIFFE's SPIRE Server and Agent are a number of vulnerabilities or other ways that trust doesn't mean quite what you think it means. I'd love for someone with interest to take a look. Besides the obvious downsides fundamental to Istio's MITM proxy architecture, I think there's more lurking on that edge.

[1] https://spiffe.io/
[2] https://auth0.com/blog/critical-vulnerabilities-in-json-web-token-libraries/

nimish over 4 years ago

`encoding/xml` has had broken handling of namespaces for a long time. It’s possible to hack it on, but the only reasonable choice is to use a libxml2 binding, which also gets you canonicalization, another can of worms.

Unsurprised it can cause security issues, especially in XML-DSig, which is a nightmare to handle correctly.

random5634 over 4 years ago

XML. I'm amazed people can get it as right as they do half the time. I do think Go will get fixed eventually; it would just be too weird if they couldn't fix the core issue. But I've never used XML if I can help it, so I'm absolutely no expert on what would make it impossible to fix something like this.