I was working on Distributed Services for AIX in 1986 and 1987, a distributed file system built to compete with NFS. As the dev team built it, a colleague and I pondered how to test the system we had architected.

There are so many possible states a file system can be in. Were conventional tests going to catch subtle bugs? Here's an example of an unusual but not unheard-of situation: on a Unix system, a file can be unlinked, and hence deleted from all directories, while remaining open by one or more processes. Such a file can still be written and read by multiple processes until it is finally closed by every process holding an open file descriptor, at which point it disappears from the system. Does the distributed file system model this correctly? Many other strange combinations of system calls might be applied to the file system. Would the tests exercise them?

It occurred to me that the "correct" behavior for any sequence of system calls could be defined by simply running the sequence on a local file system and comparing the results with an identical sequence of system calls run against the distributed file system.

I built a system that generated random sequences of file-related system calls and ran them against both a local file system and the remote distributed file system under test. As soon as the outcomes diverged, the test would halt and save a log of the sequence of operations.

My experience with this test rig was interesting. At first, discrepancies happened right away. As the dev team fixed each bug, we would restart the stochastic testing, and a new bug would be found. Over time the test would run for a few minutes before failing, then a few minutes longer, and finally for hours and hours. It was a really interesting and effective way to find some of the more subtle bugs in the code. I don't recall whether I published this information internally or not.
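For readers unfamiliar with the unlink-while-open behavior mentioned above, here is a minimal POSIX demonstration, sketched in Python (the filename "scratch" is arbitrary; this won't behave the same on Windows):

```python
import os

# Create and open a file, then unlink it: the name disappears from the
# directory, but the inode lives on while any file descriptor stays open.
fd = os.open("scratch", os.O_CREAT | os.O_RDWR, 0o644)
os.unlink("scratch")             # no longer reachable by name...
os.write(fd, b"still alive")     # ...yet still writable via the fd
os.lseek(fd, 0, os.SEEK_SET)
assert os.read(fd, 11) == b"still alive"
os.close(fd)                     # last close: storage is finally reclaimed
```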
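And a minimal sketch of the differential test rig described above, assuming Python and only a handful of operations; the original rig predates Python and exercised far more system calls. `local_root` and `dfs_root` are hypothetical mount points for the reference file system and the file system under test:

```python
import os
import random

OPS = ["create", "write", "read", "unlink"]

def apply_op(root, op, name, payload):
    """Apply one operation under 'root' and return an observable outcome."""
    path = os.path.join(root, name)
    try:
        if op == "create":
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
            os.close(fd)
            return "created"
        elif op == "write":
            with open(path, "ab") as f:   # creates the file if missing
                f.write(payload)
            return "wrote"
        elif op == "read":
            with open(path, "rb") as f:
                return f.read()           # contents must match on both sides
        elif op == "unlink":
            os.unlink(path)
            return "unlinked"
    except OSError as e:
        return ("errno", e.errno)         # errors must match too

def run(local_root, dfs_root, seed=0, steps=100_000):
    rng = random.Random(seed)             # seeded, so any failure is replayable
    log = []
    for _ in range(steps):
        op = rng.choice(OPS)
        name = "f%d" % rng.randrange(4)   # small name pool -> more collisions
        payload = bytes([rng.randrange(256)])
        log.append((op, name, payload))
        a = apply_op(local_root, op, name, payload)
        b = apply_op(dfs_root, op, name, payload)
        if a != b:
            return log                    # divergence: halt and save the trace
    return None                           # no discrepancy found this run
```

Keeping the name pool small forces operations to collide on the same files, which is where the interesting divergences tend to show up.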
For those of you working on FDA-regulated devices: my clients have started receiving FDA NSE (Not Substantially Equivalent) letters for not performing fuzz testing. For example: "Though you have provided penetration testing, it does not appear that you have addressed the other items identified such as static and dynamic code analysis, malformed input (fuzz) testing, or vulnerability scanning. This testing is necessary to assess the effectiveness of the cybersecurity controls implemented and to determine whether the residual risk of your device is acceptable."
The relative rarity of input (pseudo-)randomization in software testing is nearly inexplicable to me, except that the vendor pays only a very low cost for all but the most commonly reproduced bugs.
Hello, I'm Jaromir, one of the core engineers on the QuestDB team. I just noticed this blog post is trending! Andrei, the author, lives in Bulgaria and is probably already asleep. Happy to answer any questions the post left unanswered.
(Sorry for hijacking the thread with a general fuzzing question.)

I want to do fuzz testing on a library/framework written in C++. The actual target is a simulator that takes input from a socket according to a network protocol. The simulator builds on both Linux and Windows.

What fuzzing frameworks would you recommend for this? There are quite a few, and it's not always easy for me (as a fuzzing beginner) to understand the differences between them. Some of them seem to be abandoned projects as well.