OLD | NEW |
(Empty) | |
| 1 ========================================================= |
| 2 High-performance binary-only instrumentation for afl-fuzz |
| 3 ========================================================= |
| 4 |
| 5 (See ../docs/README for the general instruction manual.) |
| 6 |
| 7 1) Introduction |
| 8 --------------- |
| 9 |
| 10 The code in this directory allows you to build a standalone feature that |
| 11 leverages the QEMU "user emulation" mode and allows callers to obtain |
| 12 instrumentation output for black-box, closed-source binaries. This mechanism |
| 13 can then be used by afl-fuzz to stress-test targets that couldn't be built |
| 14 with afl-gcc. |
| 15 |
| 16 The usual performance cost is 2-5x, which is considerably better than the |
| 17 overhead seen so far in experiments with tools such as DynamoRIO and PIN. |
| 18 |
| 19 The idea and much of the implementation come from Andrew Griffiths. |
| 20 |
| 21 2) How to use |
| 22 ------------- |
| 23 |
| 24 The feature is implemented with a fairly simple patch to QEMU 2.3.0. The |
| 25 simplest way to build it is to run ./build_qemu_support.sh. The script will |
| 26 download, configure, and compile the QEMU binary for you. |
| 27 |
| 28 QEMU is a big project, so this will take a while, and you may have to |
| 29 resolve a couple of dependencies (most notably, you will definitely need |
| 30 libtool and glib2-devel). |
| 31 |
| 32 Once the binaries are compiled, you can leverage the QEMU tool by calling |
| 33 afl-fuzz and all the related utilities with -Q in the command line. |
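
As a rough sketch of the steps above, a build followed by a first fuzzing run
might look like this. The directory names (testcases, findings) and the target
path are hypothetical placeholders, not part of AFL. The script only prints the
commands unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute your own paths.
IN_DIR="testcases"       # directory holding the seed inputs
OUT_DIR="findings"       # directory where afl-fuzz writes its results
TARGET="./target_bin"    # closed-source binary to fuzz

# The build script is meant to be run from this directory.
BUILD_CMD="./build_qemu_support.sh"
# -Q selects QEMU mode; @@ is replaced by afl-fuzz with the input file name.
FUZZ_CMD="afl-fuzz -Q -i $IN_DIR -o $OUT_DIR $TARGET @@"

# Dry run by default: print the commands, execute only when RUN=1.
echo "$BUILD_CMD"
echo "$FUZZ_CMD"
if [ "${RUN:-0}" = "1" ]; then
  $BUILD_CMD && $FUZZ_CMD
fi
```

Drop the trailing @@ if the target reads its input from stdin instead of a file.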
| 34 |
| 35 Note that QEMU requires a generous memory limit to run; somewhere around |
| 36 200 MB is a good starting point, but considerably more may be needed for |
| 37 more complex programs. The default -m limit will be automatically bumped up |
| 38 to 200 MB when specifying -Q to afl-fuzz; be careful when overriding this. |
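
If the target needs more headroom, the limit can be raised explicitly with -m,
which takes a value in megabytes. The sketch below assumes 500 MB purely for
illustration; the paths are hypothetical placeholders. It prints the command
and runs it only when RUN=1 is set.

```shell
#!/bin/sh
# Assumed value for illustration; tune it to the target's real needs.
MEM_LIMIT_MB=500

# Hypothetical input/output directories and target binary.
FUZZ_CMD="afl-fuzz -Q -m $MEM_LIMIT_MB -i testcases -o findings ./target_bin @@"

echo "$FUZZ_CMD"
if [ "${RUN:-0}" = "1" ]; then
  $FUZZ_CMD
fi
```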
| 39 |
| 40 In principle, if you set CPU_TARGET before calling ./build_qemu_support.sh, |
| 41 you should get a build capable of running non-native binaries (say, you |
| 42 can try CPU_TARGET=arm). This is also necessary for running 32-bit binaries |
| 43 on a 64-bit system (CPU_TARGET=i386). |
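
For instance, a build for 32-bit x86 targets could be requested roughly as
follows. This is a minimal sketch that only uses the CPU_TARGET variable and
the script named above; it prints the command and runs it only when RUN=1 is
set.

```shell
#!/bin/sh
# i386 is the value the text gives for 32-bit binaries on a 64-bit host;
# use arm or another QEMU target name for non-native binaries.
CPU_TARGET="i386"

# env passes the variable to the build script for this one invocation.
BUILD_CMD="env CPU_TARGET=$CPU_TARGET ./build_qemu_support.sh"

echo "$BUILD_CMD"
if [ "${RUN:-0}" = "1" ]; then
  $BUILD_CMD
fi
```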
| 44 |
| 45 Note: if you want the QEMU helper to be installed on your system for all |
| 46 users, you need to build it before issuing 'make install' in the parent |
| 47 directory. |
| 48 |
| 49 3) Notes on linking |
| 50 ------------------- |
| 51 |
| 52 The feature is supported only on Linux. Supporting BSD may amount to porting |
| 53 the changes made to linux-user/elfload.c and applying them to |
| 54 bsd-user/elfload.c, but I have not looked into this yet. |
| 55 |
| 56 The instrumentation follows only the .text section of the first ELF binary |
| 57 encountered in the linking process. It does not trace shared libraries. In |
| 58 practice, this means two things: |
| 59 |
| 60 - Any libraries you want to analyze *must* be linked statically into the |
| 61 executed ELF file (this will usually be the case for closed-source |
| 62 apps). |
| 63 |
| 64 - Standard C libraries and other stuff that is wasteful to instrument |
| 65 should be linked dynamically - otherwise, AFL will have no way to avoid |
| 66 peeking into them. |
| 67 |
| 68 Setting AFL_INST_LIBS=1 can be used to circumvent the .text detection logic |
| 69 and instrument every basic block encountered. |
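
A sketch of enabling this for a single run is shown below. The directories and
the target path are hypothetical placeholders; the command is printed and only
executed when RUN=1 is set.

```shell
#!/bin/sh
# AFL_INST_LIBS=1 asks the QEMU helper to instrument every basic block,
# including code in shared libraries, rather than only the main .text section.
FUZZ_CMD="env AFL_INST_LIBS=1 afl-fuzz -Q -i testcases -o findings ./target_bin @@"

echo "$FUZZ_CMD"
if [ "${RUN:-0}" = "1" ]; then
  $FUZZ_CMD
fi
```

Expect slower runs with this setting, since library code is traced as well.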
| 70 |
| 71 4) Benchmarking |
| 72 --------------- |
| 73 |
| 74 If you want to compare the performance of the QEMU instrumentation with that of |
| 75 afl-gcc compiled code against the same target, you need to build the |
| 76 non-instrumented binary with the same optimization flags that are normally |
| 77 injected by afl-gcc, and make sure that the bits to be tested are statically |
| 78 linked into the binary. A common way to do this would be: |
| 79 |
| 80 $ CFLAGS="-O3 -funroll-loops" ./configure --disable-shared |
| 81 $ make clean all |
| 82 |
| 83 Comparative measurements of execution speed or instrumentation coverage will be |
| 84 fairly meaningless if the optimization levels or instrumentation scopes don't |
| 85 match. |
| 86 |
| 87 5) Gotchas, feedback, bugs |
| 88 -------------------------- |
| 89 |
| 90 If you need to fix up checksums or do other cleanup on mutated test cases, see |
| 91 experimental/post_library/ for a viable solution. |
| 92 |
| 93 Do not mix QEMU mode with ASAN, MSAN, or the like; QEMU doesn't appreciate |
| 94 the "shadow VM" trick employed by the sanitizers and will probably just |
| 95 run out of memory. |
| 96 |
| 97 Compared to fully-fledged virtualization, the user emulation mode is *NOT* a |
| 98 security boundary. The binaries can freely interact with the host OS. If you |
| 99 somehow need to fuzz an untrusted binary, put everything in a sandbox first. |
| 100 |
| 101 Beyond that, this is an early-stage mechanism, so field reports are welcome. |
| 102 You can send them to <afl-users@googlegroups.com>. |
| 103 |
| 104 6) Alternatives: static rewriting |
| 105 --------------------------------- |
| 106 |
| 107 Statically rewriting binaries just once, instead of attempting to translate |
| 108 them at run time, can be a faster alternative. That said, static rewriting is |
| 109 fraught with peril, because it depends on being able to properly and fully model |
| 110 program control flow without actually executing each and every code path. |
| 111 |
| 112 If you want to experiment with this mode of operation, there is a module |
| 113 contributed by Aleksandar Nikolich: |
| 114 |
| 115 https://github.com/vrtadmin/moflow/tree/master/afl-dyninst |
| 116 https://groups.google.com/forum/#!topic/afl-users/HlSQdbOTlpg |
| 117 |
| 118 At this point, the author reports the possibility of hiccups with stripped |
| 119 binaries. That said, if we can get it to be comparably reliable to QEMU, we may |
| 120 decide to switch to this mode, but I have not had time to play with it yet. |