FormatFuzzer is a framework for high-efficiency, high-quality generation and parsing of binary inputs.
by Andreas Zeller
So you have written a program that processes GIFs. How do you test it? One common way would be to collect a set of GIFs from the Internet (where there’s no shortage of GIFs) and test your program on these. Obviously, if your program fails to process one of these GIFs, you had better fix it. Here’s a neat GIF from Wikipedia; you can use it to test your program.
Unfortunately, the GIFs that are present on the Internet may not encompass the full range of features that GIFs (or your program) have to offer. As the Wikipedia article on GIFs will be happy to lay out, GIFs can have several obscure features such as tiles, layers, CLEAR codes, and lots of metadata. If your program is subject to third-party inputs (say, when someone uploads a GIF to your application), it had better handle all of these features – or at least not crash or hang. So, you need more GIFs. Where do you get them from?
In recent years, fuzzers have become one of the most important tools for generating test inputs – in particular, for finding vulnerabilities and other robustness issues. Popular fuzzers such as AFL mutate a set of inputs again and again, guided by the goal of maximizing coverage of the program under test.
Could you thus use such a fuzzer to test your program? The answer is: yes and no. Since a fuzzer mutates given GIF files, what it will generate first and foremost is plenty of invalid GIF files. You will thus thoroughly test the parser that reads in a file, as well as error handling. So, yes. However, it is unlikely that a mutation will actually create a new GIF feature, unless it is already contained in one of the given GIFs. So, no.
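To get a feel for why mutation mostly yields invalid files, here is a minimal Python sketch of byte-level mutation. It is not AFL's actual algorithm (AFL uses many mutation operators and coverage feedback); the seed bytes, the `mutate` helper, and the deliberately shallow `looks_like_gif` check are all assumptions made for illustration.

```python
import random

# Tiny GIF-like seed (illustrative): 6-byte header, 7-byte logical screen
# descriptor for a 1x1 screen, and the trailer byte ';' (0x3B).
SEED = b"GIF89a" + bytes([1, 0, 1, 0, 0, 0, 0]) + b";"


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Return a copy of data with one randomly chosen byte replaced."""
    out = bytearray(data)
    pos = rng.randrange(len(out))
    out[pos] = rng.randrange(256)
    return bytes(out)


def looks_like_gif(data: bytes) -> bool:
    """Shallow check of signature and trailer only; this is not a real parser."""
    return data[:6] in (b"GIF87a", b"GIF89a") and data.endswith(b";")


if __name__ == "__main__":
    rng = random.Random(0)
    mutants = [mutate(SEED, rng) for _ in range(1000)]
    survivors = sum(looks_like_gif(m) for m in mutants)
    print(f"{survivors} of {len(mutants)} mutants still pass the shallow check")
```

Note that a mutant can only rearrange or corrupt what the seed already contains: a feature such as animation that is missing from the seed is very unlikely to appear by flipping bytes.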
This is where language-based fuzzers come in. A language-based fuzzer such as FormatFuzzer uses a format description called a binary template to generate millions of inputs that adhere to this very format. Using a binary template for GIFs, for instance, FormatFuzzer can generate millions of GIFs, all valid, and including all features that the GIF format has to offer. A simple
$ make gif-fuzzer
suffices, and you get a super-efficient GIF generator gif-fuzzer, which you can invoke as
$ ./gif-fuzzer fuzz foo.gif
to create a GIF file foo.gif.
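If you want more than one file, you can wrap that invocation in a small driver. The following is a minimal Python sketch, assuming only the `./gif-fuzzer fuzz <file>` command form shown above; the function names, the output directory, and the `fuzz-00000.gif` naming scheme are hypothetical.

```python
import subprocess
from pathlib import Path


def fuzz_command(fuzzer, out_file):
    """Build the command line `<fuzzer> fuzz <out_file>` as an argument list."""
    return [fuzzer, "fuzz", str(out_file)]


def generate_corpus(count, out_dir, fuzzer="./gif-fuzzer", run=subprocess.run):
    """Invoke the generator `count` times, writing one GIF per invocation.

    `run` defaults to subprocess.run; it is a parameter so that the driver
    can be exercised without the real gif-fuzzer binary.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        out_file = out_dir / f"fuzz-{i:05d}.gif"  # hypothetical naming scheme
        run(fuzz_command(fuzzer, out_file), check=True)
        paths.append(out_file)
    return paths
```

Calling `generate_corpus(1000, "corpus")` from the directory that contains `gif-fuzzer` would then request a thousand files, one per invocation.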
Here’s one of the GIFs produced by FormatFuzzer – six-rectangles.gif, an animated series of six black rectangles. It may look unspectacular, but it covers plenty of GIF features, including animation. You can create a large number of these, and put your program to the test.
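Putting your program to the test can be as simple as running it on every generated file and watching for crashes and hangs. Here is a minimal Python sketch of such a harness; it assumes that your program takes the GIF path as its only argument, and the function names and the five-second timeout are hypothetical choices.

```python
import subprocess
import sys


def run_case(command, timeout=5.0):
    """Run one command and classify the outcome as ok, error, crash, or hang."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "hang"
    if result.returncode < 0:
        return "crash"  # on POSIX, a negative code means killed by a signal
    return "ok" if result.returncode == 0 else "error"


def triage(program, gif_paths, timeout=5.0):
    """Return {path: verdict} for every input whose verdict is not 'ok'."""
    verdicts = {}
    for path in gif_paths:
        verdict = run_case([program, str(path)], timeout)
        if verdict != "ok":
            verdicts[str(path)] = verdict
    return verdicts
```

A nonzero exit code is reported as "error" rather than "crash", since a program may legitimately reject an input; whether that counts as a bug depends on whether the input was valid.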
Interestingly, six-rectangles.gif renders differently in different browsers. On Safari 15.1, it renders as a big black rectangle:
On Firefox 94.0, it renders as a small black rectangle:
And on Google Chrome 95.0, it renders as white space:
So, with the help of FormatFuzzer, we already detected an inconsistency in how GIF files are processed. Which one is the correct behavior?
Once set up for a particular format, tools like FormatFuzzer can be tremendously useful. However, someone has to write these binary templates in the first place, which means digging through file format specifications and encoding their rules such that FormatFuzzer can process them. We are currently exploring ways to convert more existing formats, and also to mine such formats from existing programs. However, many common formats are already encoded as binary templates (including GIFs!), and it takes only a little effort to make them suitable for high-quality production of inputs.
GIF is one of the formats that is already fully supported by FormatFuzzer. Hence, you can happily produce as many GIFs as you like, and test your program against them. Happy fuzzing!