TechEcho
Open-Sourcing ClusterFuzz

292 points by markoa over 6 years ago

9 comments

metzmanj over 6 years ago
I work on this. Happy to answer questions if people have any.
boulos over 6 years ago
Disclosure: I work on Google Cloud.

I'm super pleased to see this! Abhishek and the ClusterFuzz team were among our initial customers for Preemptible VMs, still are, and make for a great example. Congrats to the team!
guidovranken over 6 years ago
I don't want to hijack the thread, but here are my thoughts on the usefulness of fuzzing safe languages.

Even in the absence of memory corruption bugs, there is a class of bugs that can emerge in any general-purpose language: slowness/hangs, assert failures, panics, and excessive resource consumption.

Beyond those, you can detect invariant violations, (de)serialization inconsistencies (e.g. deserialize(serialize(input)) != input, see [1]), and differing behavior across multiple libraries whose semantics must be identical. Cryptocurrency implementations are notable in this regard, as deviation from the spec or canonical implementation in the execution of scripts or smart contracts can lead to chain splits.

With some effort you can do differential 64-bit/32-bit fuzzing on the same machine. I've found interesting discrepancies between the interpretation of numeric values in JSON parsers, which makes sense if you think about it (size_t and float have a different size on each architecture, causing the 32-bit parser to truncate values). This might apply to every language that does not guarantee type sizes across architectures, like Go (not sure?), but I haven't tested that yet.

You can detect path escape/traversal (which is entirely language-agnostic but potentially severe) by asserting that any absolute path ever accessed within an app is a legal path, or by fuzzing a path sanitizer specifically.

And so on.

Code coverage is the primary metric used in fuzzing, but other metrics can be useful as well. I've experimented extensively with metrics such as allocation, code intensity (number of basic blocks executed), and stack depth; see also [2]. The code intensity metric helped me prove that V8's WASM JIT compiler can be subjected to inputs of average size that take >20 seconds to compile.

Any quantity can be used as a fuzzing metric, for example the largest difference between two variables in your program.

Let's say you have a decompression algorithm that takes C as input and outputs D. Calculate R = len(D) / len(C), so that R is the ratio of decompressed output size to compressed input size. Use R as a fuzzing metric and the fuzzer will tend to generate inputs with a high decompressed-to-compressed size ratio, possibly leading to the discovery of decompression bombs [3].

On that note, I believe libFuzzer now also natively supports custom counters [4].

Based on Rody Kersten's work I implemented libFuzzer-based fuzzing of Java applications supporting code coverage, intensity and allocation metrics [5], and it should not be difficult to plug this into ClusterFuzz/oss-fuzz.

Feel free to get in touch if you have any questions or need help.

[1] https://github.com/nlohmann/json/blob/develop/test/src/fuzzer-parse_json.cpp
[2] https://github.com/guidovranken/libfuzzer-gv
[3] https://en.wikipedia.org/wiki/Zip_bomb
[4] https://llvm.org/docs/doxygen/FuzzerExtraCounters_8cpp_source.html
[5] https://github.com/guidovranken/libfuzzer-java
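
To make two of the ideas in the comment above concrete, here is a minimal Python sketch: a round-trip check of the form deserialize(serialize(x)) == x, and the ratio R = len(D) / len(C) used as a feedback signal. Everything in it is an illustrative assumption rather than anything from the comment or from libFuzzer: the function names are hypothetical, `json` and `zlib` stand in for whichever serializer and decompressor you are testing, and the hill-climbing loop is a toy substitute for a real coverage-guided fuzzer.

```python
import json
import random
import zlib


def roundtrip_consistent(value):
    """Check the invariant deserialize(serialize(x)) == x, here for JSON."""
    return json.loads(json.dumps(value)) == value


def expansion_ratio(compressed: bytes, limit: int = 1 << 20) -> float:
    """Return R = len(D) / len(C), or 0.0 if C is empty or not valid zlib data.

    Output is capped at `limit` bytes so the harness itself cannot be bombed.
    """
    if not compressed:
        return 0.0
    try:
        decompressed = zlib.decompressobj().decompress(compressed, limit)
    except zlib.error:
        return 0.0
    return len(decompressed) / len(compressed)


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte (a deliberately naive mutator)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def search(seed: bytes, rounds: int = 2000, rng_seed: int = 0) -> bytes:
    """Hill-climb on R: keep a mutant only when it raises the ratio."""
    rng = random.Random(rng_seed)
    best, best_r = seed, expansion_ratio(seed)
    for _ in range(rounds):
        candidate = mutate(best, rng)
        r = expansion_ratio(candidate)
        if r > best_r:
            best, best_r = candidate, r
    return best
```

Calling `roundtrip_consistent({1: "x"})` shows the kind of inconsistency such a check surfaces: JSON object keys come back as strings, so the round trip does not reproduce the original dict. A real fuzzer would combine a signal like `expansion_ratio` with code coverage instead of replacing it.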
rarecoil over 6 years ago
Thank you for open sourcing this. For those interested in trying multiple cluster-based fuzzing solutions, I'd also like to point at yahoo/yfuzz [1], which is k8s-backed.

[1] https://github.com/yahoo/yfuzz
bobwaycott over 6 years ago
For those interested in the repo: https://github.com/google/clusterfuzz
polskibus over 6 years ago
Is there a fuzzing tool oriented towards web applications? Something that could generate loads of Selenium cases automatically and verify whether the application crashes, logs an exception, or continues to work smoothly?
Insanity over 6 years ago
Perhaps a newbie question, but it mentions C/C++ specifically. How does this hold up for Go, where you have pointers but no pointer arithmetic?
painful over 6 years ago
How about people stop using unsafe languages such as C and C++?
syastrov over 6 years ago
Makes you think twice about choosing to write software in C / C++ / other non-memory-safe languages when you need 25000 cores churning away to ensure you don't make mistakes that could cause serious security issues.

It makes me wonder why Google wouldn't put their efforts into using Rust, for example.

Of course, server power is cheap, but not for our planet.