This is INSANELY COOL.<p>If it's smart enough to learn how to build a JPEG in a day, use it with netcat and it could probably send quite a lot of things down in flames.<p>Who needs static analysis :) ?
I remember a very similar technique being used successfully for automatically cracking software (registration keys/keyfiles, serial numbers) before Internet-based validation and stronger crypto became common; the difference is that that method didn't require access to any source code or recompiling the target, as it just traced execution and "evolved" itself toward inputs producing longer and wider (i.e. covering more locations in the binary) traces.
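For anyone who hasn't seen the "evolve toward wider traces" idea before, here is a rough Python sketch of it. This is a toy under loud assumptions, not how afl-fuzz or those old cracking tools were actually implemented: the `check_serial` target, its hand-placed trace points, the "KEY!" serial and the mutation strategy are all made up for illustration.

```python
import random

def check_serial(data: bytes, trace: set) -> bool:
    """Hypothetical target: records each checkpoint the input reaches."""
    trace.add("entry")
    if len(data) < 4:
        return False
    trace.add("len_ok")
    # Each byte check is a separate branch, so partial progress is visible.
    for i, want in enumerate(b"KEY!"):
        if data[i] != want:
            return False
        trace.add(f"byte{i}_ok")
    return True

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random insert, overwrite, or delete."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 or not buf:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif choice == 1:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)

def evolve(seed: bytes, max_iters: int = 500_000, rng_seed: int = 0):
    """Keep any mutant that reaches a checkpoint no earlier input reached."""
    rng = random.Random(rng_seed)
    corpus = [seed]
    seen = set()
    check_serial(seed, seen)
    for _ in range(max_iters):
        candidate = mutate(rng.choice(corpus), rng)
        trace = set()
        if check_serial(candidate, trace):
            return candidate          # accepted serial found
        if not trace <= seen:         # new location in the trace
            seen |= trace
            corpus.append(candidate)
    return None                       # gave up within the iteration budget

if __name__ == "__main__":
    print(evolve(b"AAAA"))
```

The point of the sketch is only that keeping inputs with "wider" traces turns one 1-in-2^32 guess into four much easier one-byte guesses.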
The author of this article is a hacker from the time when the word hacker meant something different than it does today. I remember his website from my early teens when I started using the internet via a dial-up connection back in 1998. Lcamtuf, glad to see you're still around. Your fellow countryman.
Potential instructions for trying this on Mac (I was unable to make it work, perhaps we can build upon this):<p>curl -LO <a href="http://lcamtuf.coredump.cx/afl.tgz" rel="nofollow">http://lcamtuf.coredump.cx/afl.tgz</a><p>tar zxvf afl.tgz<p>rm afl.tgz<p>cd afl*<p>make afl-gcc<p>make afl-fuzz<p>mkdir in_dir<p>echo 'hello' >in_dir/hello<p># there is a glitch with the libjpeg-turbo-1.3.1 configure file that makes it difficult to compile on Mac, so I tried regular libjpeg:<p>curl -LO <a href="http://www.ijg.org/files/jpegsrc.v8c.tar.gz" rel="nofollow">http://www.ijg.org/files/jpegsrc.v8c.tar.gz</a><p>tar zxvf jpegsrc.v8c.tar.gz<p>cd jpeg-8c/<p>CC=../afl-gcc ./configure<p>make<p># error: C compiler cannot create executables<p># if the above command worked to build an instrumented djpeg, then this should work<p>cd ..<p>./afl-fuzz -i in_dir -o out_dir ./jpeg-8c/djpeg
Regarding<p>>if (strcmp(header.magic_password, "h4ck3d by p1gZ")) goto terminate_now;<p>How impossible would it be to look at the branching instruction, perform a taint analysis on its input, and see if there is any part of the input we can tweak to make it branch/not branch?
Like, we jumped because the zero flag was set. And the zero flag was set because these two bytes were equal. Hmm, that byte is hardcoded. This other byte was mov'd here from that memory address. That memory address was set by this call to fread... hey, it comes from this byte in the input file.
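The chain of reasoning above (branch, compared byte, memory address, fread, file offset) can be mimicked in a few lines of Python. This is only a sketch of the idea, assuming a toy model where every input byte carries its file offset along with it; `TaintedByte`, `read_input`, `check_magic` and `solve` are hypothetical names, and a real tool would have to track taint through machine instructions rather than Python objects.

```python
from dataclasses import dataclass

MAGIC = b"h4ck3d by p1gZ"

@dataclass(frozen=True)
class TaintedByte:
    value: int
    offset: int  # which byte of the input file this value came from

def read_input(data: bytes) -> list:
    """Stand-in for fread(): tag every byte with its file offset."""
    return [TaintedByte(b, i) for i, b in enumerate(data)]

def check_magic(header: list, log: list) -> bool:
    """Toy version of the strcmp() branch.

    Unlike real strcmp(), this keeps going after the first mismatch and
    logs (input offset, hardcoded byte) for every comparison that failed.
    """
    if len(header) < len(MAGIC):
        return False
    ok = True
    for got, want in zip(header, MAGIC):
        if got.value != want:
            log.append((got.offset, want))
            ok = False
    return ok

def solve(data: bytes) -> bytes:
    """Patch exactly the input bytes that the failed comparisons blame."""
    data = data.ljust(len(MAGIC), b"\0")  # make room for the magic string
    log = []
    if check_magic(read_input(data), log):
        return data
    patched = bytearray(data)
    for offset, want in log:
        patched[offset] = want
    return bytes(patched)

if __name__ == "__main__":
    print(solve(b"not the password, rest of file"))
```

The rest of the input is left untouched, which is the appeal: only the bytes that provably feed the branch get rewritten.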
See also: Microsoft Code Digger [1], which generates inputs using symbolic execution for .NET code, and EvoSuite [2], which uses a genetic algorithm to do the same for Java.<p>[1] : <a href="http://blogs.msdn.com/b/nikolait/archive/2013/04/23/introducing-code-digger-an-extension-for-vs2012.aspx" rel="nofollow">http://blogs.msdn.com/b/nikolait/archive/2013/04/23/introduc...</a><p>[2] : <a href="http://www.evosuite.org" rel="nofollow">http://www.evosuite.org</a>
I like to imagine that, given enough time, it eventually generates the Lenna [1] JPEG and exits.<p>[1] : <a href="https://en.wikipedia.org/wiki/Lenna" rel="nofollow">https://en.wikipedia.org/wiki/Lenna</a>
I had a brief 'It's alive :O' moment when reading this. Imagine seeing a face looking at you in one of those pics :)<p>Nice article; the concept of fuzzers was new to me.
Wow, two awesome ideas in a week. Reminds me of this posted just a couple days ago <a href="http://reverseocr.tumblr.com/" rel="nofollow">http://reverseocr.tumblr.com/</a>
Now to try this with midi...<p>But what to feed it into? I could make some musical analysis stuff, but do I need to write it in C to avoid accidentally fuzzing my interpreter?
You can throw afl-fuzz at many other types of parsers with similar results: with bash, it will write valid scripts;<p>^ that seems fun, I just don't think I would run it on my machine for fear of what it might create (oh.. rm -rf * ok!)
This is totally amazing! Wondering if it would be possible to go the other way around: from generated JPG to a string. If yes, what a cool way to send your password as a... JPG over email.