PH.D DEFENCE - PUBLIC SEMINAR

Enhancing Directed Search in Black-box, Grey-box and White-box Fuzz Testing

Speaker
Mr Pham Van Thuan
Advisor
Dr Abhik Roychoudhury, Professor, School of Computing


09 Jun 2017 Friday, 04:00 PM to 05:30 PM

Video Conference Room, COM1-02-13

Abstract:

Security bugs can exist in any software system, and software testing aims to find them before attackers do. Attackers hunt for zero-day bugs and exploit them for profit (e.g., by stealing users' credentials such as credit card information) or to cause serious damage (e.g., by attacking critical systems such as nuclear power plants). Fuzz testing (or fuzzing) techniques, which include (model-based) black-box, coverage-based grey-box and white-box approaches, have become prominent in software testing. However, given an inadequate test suite, they struggle to direct the exploration towards given target locations and to expose bugs in large program binaries that take highly structured inputs. We observe that these limitations can be overcome by improving the directedness of fuzzing approaches.

In this thesis, we first design a directed search algorithm in Hercules, a symbolic-execution-based white-box fuzzing engine that works directly on large multi-module (stripped) program binaries. Hercules owes its directedness to its ability to steer the exploration towards target locations using the module dependency graph and the control flow graph lifted directly from application binaries. Moreover, by exploiting the results produced by an SMT constraint solver (e.g., minimal unsatisfiable cores), Hercules systematically navigates the search between non-crashing paths and crashing ones. White-box fuzzing tools like Hercules excel at reasoning about the values of data fields, but they can easily get "stuck" when they must synthesize a whole (optional) data block (a.k.a. data chunk) that may not exist in an inadequate test suite of highly structured inputs such as PNG, WAV and PDF files. To tackle this problem, we develop MoWF, a novel combination of model-based black-box fuzzing (as embodied by the Peach fuzzer) and directed white-box fuzzing (as embodied by Hercules). In this combined setup, Hercules informs Peach about where it gets stuck. Peach takes this information and leverages the input model to generate the missing data block and transfer it to the current input of Hercules, helping Hercules get unstuck and continue its directed exploration. Apart from expensive approaches based on symbolic analysis, directedness can also be achieved by augmenting coverage-based grey-box fuzzing (CGF), a more lightweight technique. We build AFLGo, a directed CGF, by integrating the simulated annealing global search algorithm into the fuzzing process so that testing is steered towards target locations with a higher probability than towards other locations.
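To illustrate how simulated annealing can steer a grey-box fuzzer, the following is a minimal sketch rather than AFLGo's actual implementation. It assumes that each seed already has a normalized distance to the targets in [0, 1] (0 = closest) and that the temperature cools exponentially over time. The function names, the cooling base of 20, and the `time_to_exploit` parameter are illustrative assumptions that do not come from this abstract.

```python
def temperature(elapsed: float, time_to_exploit: float) -> float:
    """Assumed exponential cooling schedule.

    Starts at 1.0 (pure exploration) and drops to 0.05 once
    `elapsed` reaches `time_to_exploit` (hypothetical parameter).
    """
    if time_to_exploit <= 0:
        raise ValueError("time_to_exploit must be positive")
    return 20.0 ** (-elapsed / time_to_exploit)


def seed_energy(normalized_distance: float,
                elapsed: float,
                time_to_exploit: float) -> float:
    """Relative fuzzing effort to spend on a seed, in [0, 1].

    While the temperature is high, every seed gets roughly the same
    energy (0.5), so the search explores broadly. As the temperature
    falls, seeds closer to the targets get more energy and seeds
    farther away get less.
    """
    if not 0.0 <= normalized_distance <= 1.0:
        raise ValueError("normalized_distance must lie in [0, 1]")
    t = temperature(elapsed, time_to_exploit)
    return (1.0 - normalized_distance) * (1.0 - t) + 0.5 * t
```

Under these assumptions, a near seed and a far seed receive the same energy at the start of a run, and the gap between them widens as time passes, which matches the intended shift from exploration to exploitation.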

Experimental evaluations on two applications of directed fuzzing (crash reproduction, and patch testing for vulnerability detection) show that Hercules, MoWF and AFLGo effectively guide the search and successfully reproduce crashes in large real-world (binary) programs (e.g., Adobe Reader, Windows Media Player, OpenSSL) that take highly structured file formats (e.g., PNG, WAV, PDF). Notably, AFLGo can expose the famous Heartbleed vulnerability almost four (4) times faster than the state-of-the-art AFL fuzzer. In addition, AFLGo has discovered 14 zero-day vulnerabilities in the widely used Binutils toolset. All of these vulnerabilities have been confirmed and fixed by the Binutils maintainers, and five (5) CVEs have been assigned to the most critical ones.