The Future of Fuzzing (from Fuzzing and Code Coverage)
kowsik guruswami sent a message today to dd about using code coverage to help build better fuzzers.
i have many thoughts on this subject. here is my reply email:
on mon, 26 mar 2007, kowsik wrote:
> we just released rcov-0.1, an interactive/incremental code coverage
> tool to assist in building effective fuzzers.
> quick summary:
> – it’s a webrick browser-based application (ruby)
> – uses gcov’s notes/data files to get at blocks and function summaries
> – interactively/incrementally shows the coverage information while fuzzing
> – uses ctags to cross reference functions/prototypes/definitions/macros
hi kowsik, thanks for this.
i have a few notes though, as i believe this can be taken much further (at least my studies so far show that).
we have three levels or layers (depends on approach):
1. building better fuzzers (which you cover).
2. helping the fuzzing process, fuzzing better.
3. making it easier to find the actual vulnerability once an indication is found (a successful test case, or as they say in qa, a passing one).
several folks in the past few months have said that fuzzing isn’t new and has been done for years – that much is true.
some folks also said that fuzzing is as simple as it gets and has nowhere left to evolve. that is indeed very much false.
code coverage, static analysis, run-time analysis, etc. all have a place in the future of fuzzing.
i see fuzzer development in the coming years as changing the term “dumb fuzzing” to mean today’s protocol-based smart fuzzing, and “smart fuzzing” being about what interactive changes are happening as you fuzz.
the most that we see today (in most cases) is the engine running undisturbed, while the monitor (if one even exists) is a simple debugger.
evolving host and network monitoring to use profiling technologies, map functions and paths, watch for memory issues, etc. is fast coming.
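to make the coverage idea concrete, here is a minimal sketch of a fuzzing loop that keeps an input only when the monitor reports a block it has not seen before. everything in it is an assumption for illustration: toy_target stands in for an instrumented program that reports the blocks it executed, and the block names, the one-byte mutation and the non-empty seed are hypothetical. it is not how rcov or any particular fuzzer works.

```python
import random


def toy_target(data: bytes) -> set:
    """hypothetical instrumented target: returns the ids of the blocks it ran."""
    blocks = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        blocks.add("magic_1")
        if len(data) > 1 and data[1] == ord("U"):
            blocks.add("magic_2")
            if len(data) > 2 and data[2] == ord("Z"):
                blocks.add("magic_3")
    return blocks


def mutate(data: bytes, rng: random.Random) -> bytes:
    """replace one randomly chosen byte with a random value (input must be non-empty)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz_with_feedback(seed: bytes, rounds: int, rng: random.Random):
    """keep only the inputs that reach at least one block not seen so far."""
    corpus = [seed]
    seen = set(toy_target(seed))
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        covered = toy_target(candidate)
        if not covered <= seen:  # the monitor saw a new block
            seen |= covered
            corpus.append(candidate)
    return corpus, seen
```

calling fuzz_with_feedback(b"AAAA", 2000, random.Random(0)) returns the inputs that were kept and the set of blocks reached. the point of the sketch is only that the monitor's output changes what the engine sends next.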
today, changing the action of a fuzzer as it is running is difficult (there is no real driver, just an engine). a simple example of this evolution could be watching for cpu usage. if the cpu usage spikes it could mean:
1. we are sending too many requests per second – we should slow down the engine.
2. (if for the thread itself) we are on to something, we should explore this attack (likely one of the 10000 “attacks” we went through) or adjust to a different fuzzing engine to explore that particular section of the program (as we mapped it – code coverage again).
the two don’t easily work together, not to mention even stopping a fuzzer, rewinding it or god forbid running a different one at the same time (on the same instance anyway).
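the cpu example above can be sketched as a small driver sitting between the monitor and the engine. this is a sketch under stated assumptions, not an existing tool: the Engine class, the two thresholds, the halving of the request rate and the choice to check the target thread before the host are all hypothetical, and the cpu samples are passed in as plain numbers rather than read from a real monitor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Engine:
    """hypothetical engine handle: only the knobs this driver touches."""
    requests_per_second: float
    focus_case: Optional[int] = None  # test case to explore around, if any


def on_cpu_sample(engine: Engine, host_cpu: float, thread_cpu: float,
                  current_case: int,
                  host_limit: float = 90.0,     # assumed threshold, percent
                  thread_limit: float = 80.0    # assumed threshold, percent
                  ) -> str:
    """decide what the engine should do after one cpu sample."""
    # the target's own thread is spiking: we may be on to something,
    # so remember the current test case and explore around it
    if thread_cpu >= thread_limit:
        engine.focus_case = current_case
        return "explore"
    # the host as a whole is saturated: we are probably sending too fast
    if host_cpu >= host_limit:
        engine.requests_per_second = max(1.0, engine.requests_per_second / 2)
        return "slow_down"
    return "continue"
```

for example, with engine = Engine(requests_per_second=100.0), a sample of on_cpu_sample(engine, 95.0, 10.0, 42) halves the rate, while on_cpu_sample(engine, 50.0, 85.0, 43) marks case 43 for exploration. the hard part in practice is the one described above: getting a real engine to accept these decisions while it is running.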
which brings us to distributed fuzzing… but that’s a whole different subject yet again.
fuzzing has a long way to go, and we haven’t even really started to explore full integration with static analysis tools (other than with results).
we had a discussion on the fuzzing mailing list recently about genetic fuzzing, but i am not really a math geek. jared can explain that one better… and so on.
all that before we explore uses for fuzzing outside of the development cycle (mostly security qa) and vulnerability research, which is with client-side testing. perhaps fuzzers will help us force the hand of software vendors to develop more robust and secure code.
working for a fuzzing vendor i am only too familiar with the turing halting problem and seeking reality in the midst of eternal runs, but the most interesting thing i found in the past few months (which wasn’t technical) is the clash of cultures between qa engineers and security professionals. it will be very interesting to see where we end up.