Nobody expects CDATA sections in XML

219 points by dmit over 10 years ago

14 comments

NelsonMinar over 10 years ago
The irony is that CDATA isn't even very useful; there's no way to escape the ]]> closing delimiter, so you still have to invent some special escaping mechanism to use it.

Nobody expects entity definitions in XML either, and yet about once a year some new service or software is found vulnerable to XXE attacks. (Summary: a lot of XML parsers can be made to open arbitrary files or network sockets and sometimes return the content.)

XML is a ridiculously complex document format designed for editing text documents. It is not a suitable data interchange format. Fortunately we have JSON now.
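On the `]]>` point: a minimal sketch of one common workaround, which splits the text across two CDATA sections wherever `]]>` occurs. The helper name `wrap_cdata` is hypothetical, and Python's standard `xml.etree.ElementTree` is used only to check that the text survives a round trip.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any ']]>' across two sections.

    Hypothetical helper for illustration; not part of any library.
    """
    # ']]>' cannot appear inside a CDATA section, so close the section
    # after ']]' and open a new one before '>'.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


payload = "if (a[b[0]]> 1) { ... }"
document = "<code>" + wrap_cdata(payload) + "</code>"

# The parser joins the adjacent sections back into one text node.
parsed = ET.fromstring(document)
```

This is exactly the kind of extra mechanism the comment complains about: the producer has to know about the split, even though the consumer sees ordinary text.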
0x0 over 10 years ago
I've been following posts about this tool for a few weeks, and it is really remarkable how many interesting results are already popping out, particularly since static analyzers have been around for years and years.

I'm assuming afl-fuzz is particularly CPU-bound, and it would be interesting to see some numbers about how many CPU-years are being dedicated to it at the moment, and whether we would see even more interesting stuff if a larger compute cluster were made available.

It's also super scary how "effortlessly" these bugs appear to be uncovered, even in "well-aged" software like "strings".
xendo over 10 years ago
Recently I find it harder and harder to believe that lcamtuf is just one person.
al2o3cr over 10 years ago
Heads-up to the "comment without reading the article" crowd: the title is *not* bemoaning a lack of handling for CDATA in existing parsers. It's discussing an interesting behavior of the AFL fuzzer when used with formats that require fixed strings in particular places...

Related: NOBODY EXPECTS THE SPANISH INQUISITION, either. :)
adnam over 10 years ago
This is completely tangential, but I'm waiting for someone to create a breakfast cereal called Funroll Loops. You know, for the kids.
seba_dos1 over 10 years ago
How long till afl-fuzz reaches consciousness?
serve_yay over 10 years ago
Wow, what an enjoyable read. I recommend the story about randomly generating JPG files too.
mikeknoop over 10 years ago
This thread reminded me of a draft post I've been sitting on for a while, related to ENTITY tags in XML and XXE exploits.

Basically, it's really easy to leave default XML parsing settings (for things like consuming RSS feeds) and accidentally open yourself up to reading files off the filesystem.

I did a full write-up and POC here: http://mikeknoop.com/lxml-xxe-exploit
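As a hedged illustration of the defensive side (this is not the linked proof of concept, and it uses Python's standard library rather than lxml): one conservative option is to refuse any feed that carries a DOCTYPE, since that is where entity definitions live. The names `DTDForbidden` and `parse_without_dtd` are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document declares a DOCTYPE (hypothetical name)."""


def parse_without_dtd(xml_text: str) -> ET.Element:
    """Parse xml_text, rejecting it first if it contains a DOCTYPE.

    Sketch only: a pre-check pass with expat, then a normal parse.
    """
    checker = expat.ParserCreate()

    def reject(name, system_id, public_id, has_internal_subset):
        # Entity definitions can only appear in a DTD, so refusing the
        # DOCTYPE outright also refuses every custom entity.
        raise DTDForbidden("DOCTYPE declarations are not accepted: " + name)

    checker.StartDoctypeDeclHandler = reject
    checker.Parse(xml_text, True)
    return ET.fromstring(xml_text)


feed = "<rss><channel><title>ok</title></channel></rss>"
root = parse_without_dtd(feed)
```

Rejecting every DOCTYPE is stricter than disabling entity resolution, and it will turn away some legitimate documents; for feeds that rarely need a DTD, that trade-off is often acceptable.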
userbinator over 10 years ago
I'm actually not so surprised, given what the fuzzer does: mutating input to make forward progress in the code. Incremental string comparisons definitely fall under this category since they have a very straightforward definition of "forward progress"; either the byte is correct and we can enter a previously unvisited state, or it's incorrect and execution flows down the unsuccessful path. It's somewhat like the infinite monkey theorem, except the random stream is being filtered such that only a correct subsequence is needed to advance.

On the other hand, I'd be astonished if it managed to fuzz its way through a hash-based comparison (especially one involving crypto like SHA1 or MD5).
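A toy sketch of that intuition, and explicitly not AFL's actual algorithm: a deterministic search that keeps any single-byte change that increases a "progress" score. With a byte-by-byte comparison the score climbs one byte at a time; with a digest comparison it never moves. All names here (`incremental_progress`, `hash_progress`, `guided_search`) are hypothetical, and the score stands in for coverage feedback.

```python
import hashlib

MAGIC = b"<![CDATA["


def incremental_progress(data: bytes) -> int:
    """Count leading bytes that match MAGIC (stand-in for new coverage)."""
    matched = 0
    for got, want in zip(data, MAGIC):
        if got != want:
            break
        matched += 1
    return matched


def hash_progress(data: bytes) -> int:
    """All-or-nothing score: a digest comparison shows no partial progress."""
    return int(hashlib.sha1(data).digest() == hashlib.sha1(MAGIC).digest())


def guided_search(progress, length: int, max_execs: int):
    """Try every byte value at each position, keeping changes that score higher."""
    best = bytearray(length)
    best_score = progress(bytes(best))
    execs = 0
    for pos in range(length):
        for value in range(256):
            if execs >= max_execs:
                return bytes(best), best_score, execs
            candidate = bytearray(best)
            candidate[pos] = value
            execs += 1
            score = progress(bytes(candidate))
            if score > best_score:
                # Forward progress: keep this byte and move to the next one.
                best, best_score = candidate, score
                break
    return bytes(best), best_score, execs


found, score, execs = guided_search(incremental_progress, len(MAGIC), 10_000)
_, hash_score, hash_execs = guided_search(hash_progress, len(MAGIC), 10_000)
```

The incremental target needs at most 256 tries per byte, so the whole string costs at most 9 × 256 executions; the hashed target spends the same budget and learns nothing, which is the asymmetry the comment describes.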
backspaces over 10 years ago
But of course no one uses either when there's Atom/GitHub's favorite: CSON. https://github.com/bevry/cson
nickbauman over 10 years ago
Tell the people that created the webservice I have to consume this!
bostonpete over 10 years ago
I didn't expect a kind of Spanish Inquisition...
pjmlp over 10 years ago
Maybe C-based XML parsers don't, but JVM- and .NET-based XML parsers don't have any issues with CDATA sections.

Time to upgrade to more modern tools?
brabbit over 10 years ago
I am not sure, but what is the actual harm of it?