XSS war: a powerful Java HTML sanitizer

13 points by robicch, over 15 years ago

7 comments

simonw, over 15 years ago
On further inspection, I don't trust your implementation at all. You're blacklisting CSS rules and attributes rather than whitelisting them. This means you wouldn't catch attacks like this one, for example:

http://www.davidpashley.com/blog/computing/livejournal-mozilla-bug.html

I also don't think you're doing the right thing when you DO find something that's not on the whitelist: you should be escaping it rather than stripping it (we couldn't have a discussion about your code using your system, since our XSS examples would be stripped).

I'd suggest re-engineering to use a whitelist for everything.
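To illustrate the "escape rather than strip" suggestion, here is a minimal Java sketch. It is not the sanitizer under discussion: the class name, the tag list, and the choice to recognise only attribute-free tags are all assumptions made for the example, and it requires Java 9+ for `Set.of`. Anything that is not an exact match for a whitelisted bare tag is HTML-escaped, so it shows up as text instead of disappearing.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EscapingTagFilter {
    // Hypothetical whitelist, for illustration only.
    private static final Set<String> ALLOWED_TAGS = Set.of("b", "i", "em", "strong", "p");

    // Recognises only a bare opening or closing tag without attributes, e.g. <b> or </b>.
    private static final Pattern BARE_TAG = Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)>");

    // Escapes the five characters that are significant in HTML text and attribute values.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Emits whitelisted bare tags unchanged and escapes everything else.
    static String sanitize(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = BARE_TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            // Text between tags is always escaped.
            out.append(escape(input.substring(last, m.start())));
            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (ALLOWED_TAGS.contains(name)) {
                out.append('<').append(m.group(1)).append(name).append('>');
            } else {
                // Not on the whitelist: show it as text rather than dropping it.
                out.append(escape(m.group()));
            }
            last = m.end();
        }
        out.append(escape(input.substring(last)));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("<b>hi</b><script>alert('xss')</script>"));
        System.out.println(sanitize("<s c r i p t>alert('xss')</ s c r i p t>"));
    }
}
```

Because malformed input such as a spaced-out script tag never matches the strict tag pattern, it falls through to the escaping path by default. The sketch does not balance tags or handle attributes, both of which a real filter would need.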
simonw, over 15 years ago
Allowing any CSS at all is very risky indeed. There was a brilliant phishing attack on MySpace a few years ago where the attacker constructed their own "log in" link and used CSS absolute positioning to overlay it across the real "log in" link in the global navigation. They stole 30,000+ accounts.

Even if you filter out "position: absolute", there's a chance people might figure out a way to do something similar using enormous padding values or negative margins.

Your general approach (tokenise the HTML and use a whitelist) is an OK start, but you should be whitelisting attributes as well. You should also have an ENORMOUS set of unit tests.

You allow object and embed, which is very worrying - the allowScriptAccess attribute can allow Flash to make JavaScript calls to the parent page, for example.

Also remember this: you're not dealing with valid HTML, you're dealing with malicious HTML that might be designed to evade your filters but still be handled by browsers' built-in error correction code. Since the most widely used HTML engine is closed source, there's no telling what kind of weird constructs might be error-corrected and rendered by IE.

HTML cleansing is a minefield.
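The attribute whitelisting suggested above could look roughly like the following Java sketch. The class name and the per-tag attribute lists are hypothetical, and it assumes a tokeniser has already produced the tag name and an attribute map (Java 9+ for `Map.of` and `Set.of`). Tags with no entry, such as object and embed, keep no attributes at all.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class AttributeWhitelist {
    // Hypothetical per-tag whitelist; attributes not listed here are dropped.
    private static final Map<String, Set<String>> ALLOWED = Map.of(
            "a", Set.of("href", "title"),
            "img", Set.of("src", "alt", "width", "height"));

    // Returns only the attributes that are explicitly allowed for the given tag.
    static Map<String, String> filter(String tag, Map<String, String> attrs) {
        Set<String> allowed = ALLOWED.getOrDefault(tag.toLowerCase(Locale.ROOT), Set.of());
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            String name = e.getKey().toLowerCase(Locale.ROOT);
            if (allowed.contains(name)) {
                kept.put(name, e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // style and onclick are not listed for <a>, so only href survives.
        System.out.println(filter("a", Map.of(
                "href", "http://example.com/",
                "style", "position: absolute",
                "onclick", "steal()")));
        // embed has no entry, so allowScriptAccess is dropped.
        System.out.println(filter("embed", Map.of("allowScriptAccess", "always")));
    }
}
```

This only decides which attribute names survive. The values of kept attributes (an href, for example) would still need their own validation, which the sketch leaves out.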
bensummers, over 15 years ago
This is important reading:

http://www.feedparser.org/docs/html-sanitization.html

I'm not convinced it's possible to stop a browser executing code, because there are so many possible ways a browser can be given code to execute. Not only do you have to read all the specs for all the versions of the browsers, you've got to find all the bugs in them too.

Case in point: An earlier version of this code didn't remove JavaScript from CSS expressions, making it possible to get past it in IE6 and 7.

I gave up trying to sanitize HTML, and instead used a library to render it to plain text and stuck it in a <pre> element with the usual HTML escaping. But I need to take a very paranoid approach in my app.

EDIT to add: I think this is probably one of the better Java implementations, and it has a good whitelisting approach to HTML. However, it's let down by taking a blacklist approach to CSS.
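The paranoid fallback described above (plain text inside a `<pre>` element with HTML escaping) can be sketched in Java as follows. The comment does not name the library used for the HTML-to-text step, so this sketch assumes that step has already happened and starts from a plain-text string; the class and method names are made up for the example.

```java
final class PreRenderer {
    // Escapes the characters that are significant in HTML so the text cannot form markup.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Wraps already-extracted plain text in a <pre> element after escaping it.
    static String toPre(String plainText) {
        return "<pre>" + escapeHtml(plainText) + "</pre>";
    }

    public static void main(String[] args) {
        System.out.println(toPre("<script>alert('xss')</script>"));
    }
}
```

The trade-off is that all formatting is lost, but nothing in the user's input is ever interpreted as markup.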
simonw, over 15 years ago
This works:

<s c r i p t>alert('xss')</ s c r i p t>

I think older versions of IE might execute this - the LiveJournal XSS filter has defence against this one. At the very least though it should be escaped rather than being allowed through on to the page.
RyanMcGreal, over 15 years ago
> Our approach is to remove unwanted tags and properties

I'm no security guru, but isn't the best practice to have a whitelist of accepted elements rather than a blacklist of prohibited elements?
robicch, over 15 years ago
I know that the blacklist approach to CSS is dangerous, but has anyone actually seen the alert on the page http://patapage.com/applications/pataPage/site/test/testSanitize.jsp? And if so, how?
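For comparison with the blacklist approach to CSS mentioned here, a whitelist for inline styles might be sketched like this in Java. It is not the sanitizer's actual code: the class name, the property list, and the value pattern are assumptions chosen for the example (Java 9+ for `Set.of`). A declaration is kept only if both its property and its value pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

final class StyleWhitelist {
    // Hypothetical list of harmless properties; position, margin, padding etc. are absent.
    private static final Set<String> ALLOWED_PROPERTIES =
            Set.of("color", "font-weight", "font-style", "text-align");

    // Values are limited to a plain keyword or a hex color, which rules out
    // url(...), expression(...), escapes, and anything else unexpected.
    private static final Pattern SAFE_VALUE = Pattern.compile("[a-zA-Z]+|#[0-9a-fA-F]{3,6}");

    // Keeps only declarations whose property and value are both whitelisted.
    static String filter(String style) {
        List<String> kept = new ArrayList<>();
        for (String decl : style.split(";")) {
            int colon = decl.indexOf(':');
            if (colon < 0) {
                continue; // not a property: value pair
            }
            String prop = decl.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = decl.substring(colon + 1).trim();
            if (ALLOWED_PROPERTIES.contains(prop) && SAFE_VALUE.matcher(value).matches()) {
                kept.add(prop + ": " + value);
            }
        }
        return String.join("; ", kept);
    }

    public static void main(String[] args) {
        System.out.println(filter("color: red; position: absolute; width: expression(alert(1))"));
    }
}
```

Splitting on semicolons is deliberately naive. Because the value pattern is so strict, a declaration that was split in an odd place simply fails the check and is dropped.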
tdoggette, over 15 years ago
Try middle-clicking a link on that page in Chrome. It opens in the same tab (for me at least), but I get normal behavior in IE7.