================
Historical notes
================

This doc talks about the rationale of some of the high-level design decisions
for American Fuzzy Lop. It's adapted from a discussion with Rob Graham.
See README for the general instruction manual, and technical_details.txt for
additional implementation-level insights.

1) Influences
-------------

In short, afl-fuzz is inspired chiefly by the work done by Tavis Ormandy back
in 2007. Tavis did some very persuasive experiments using gcov block coverage
to select optimal test cases out of a large corpus of data, and then using
them as a starting point for traditional fuzzing workflows.

(By "persuasive", I mean: netting a significant number of interesting
vulnerabilities.)

In parallel to this, both Tavis and I were interested in evolutionary fuzzing.
Tavis had his experiments, and I was working on a tool called bunny-the-fuzzer,
released sometime in 2007.

Bunny used a generational algorithm not much different from afl-fuzz, but
also tried to reason about the relationship between various input bits and
the internal state of the program, with hopes of deriving some additional value
from that. The reasoning / correlation part was probably in part inspired by
other projects done around the same time by Will Drewry and Chris Evans.

The state correlation approach sounded very sexy on paper, but ultimately, made
the fuzzer complicated, brittle, and cumbersome to use; every other target
program would require a tweak or two. Because Bunny didn't fare a whole lot
better than less sophisticated brute-force tools, I eventually decided to write
it off. You can still find its original documentation at:

  https://code.google.com/p/bunny-the-fuzzer/wiki/BunnyDoc

There has been a fair amount of independent work, too. Most notably, a few
weeks earlier that year, Jared DeMott had a Defcon presentation about a
fuzzer that relied on code coverage as a fitness function.

Jared's approach was by no means identical to what afl-fuzz does, but it was in
the same ballpark. His fuzzer tried to explicitly solve for the maximum coverage
with a single input file; in comparison, afl simply selects for cases that do
something new (which yields better results - see technical_details.txt).

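To make the contrast concrete, below is a minimal sketch of that "select for
anything new" rule in C. This is a hypothetical illustration, not the actual
afl-fuzz code; the real logic (has_new_bits() in afl-fuzz.c) follows the same
core idea, but additionally buckets edge hit counts into coarse ranges:

  #include <stdint.h>

  #define MAP_SIZE 65536            /* Size of the coverage bitmap.      */

  static uint8_t seen[MAP_SIZE];    /* Edges hit by any earlier input.   */

  /* Returns 1 if trace[], the coverage map from a single execution of
     the target, touches any edge not seen before - and records it. A
     mutated test case is added to the queue iff this returns 1; there
     is no attempt to solve for globally maximal coverage. */

  static int is_interesting(const uint8_t* trace) {

    int ret = 0;
    uint32_t i;

    for (i = 0; i < MAP_SIZE; i++)
      if (trace[i] & ~seen[i]) { seen[i] |= trace[i]; ret = 1; }

    return ret;

  }

In other words, every queue entry earns its keep by contributing at least one
previously unseen edge; no global optimization over the corpus is attempted.
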
A few years later, Gabriel Campana released fuzzgrind, a tool that relied purely
on Valgrind and a constraint solver to maximize coverage without any brute-force
bits; and Microsoft Research folks talked extensively about their still
non-public, solver-based SAGE framework.

In the past six years or so, I've also seen a fair number of academic papers
that dealt with smart fuzzing (focusing chiefly on symbolic execution) and a
couple of papers that discussed proof-of-concept applications of genetic
algorithms with the same goals in mind. I'm unsure how practical most of
these experiments were; I suspect that many of them suffer from
bunny-the-fuzzer's curse of being cool on paper and in carefully designed
experiments, but failing the ultimate test of being able to find new,
worthwhile security bugs in otherwise well-fuzzed, real-world software.

In some ways, the baseline that the "cool" solutions have to compete against is
a lot more impressive than it may seem, making it difficult for competitors to
stand out. For a singular example, check out the work by Gynvael and Mateusz
Jurczyk, applying "dumb" fuzzing to ffmpeg, a prominent and security-critical
component of modern browsers and media players:

  http://googleonlinesecurity.blogspot.com/2014/01/ffmpeg-and-thousand-fixes.html

Effortlessly getting comparable results with state-of-the-art symbolic execution
in equally complex software still seems fairly unlikely, and hasn't been
demonstrated in practice so far.

But I digress; ultimately, attribution is hard, and glorifying the fundamental
concepts behind AFL is probably a waste of time. The devil is very much in the
often-overlooked details, which brings us to...

2) Design goals for afl-fuzz
----------------------------

In short, I believe that the current implementation of afl-fuzz takes care of
several itches that seemed impossible to scratch with other tools:

1) Speed. It's genuinely hard to compete with brute force when your "smart"
   approach is resource-intensive. If your instrumentation makes it 10x more
   likely to find a bug, but runs 100x slower, your users are getting a bad
   deal.

   To avoid starting with a handicap, afl-fuzz is meant to let you fuzz most of
   the intended targets at roughly their native speed - so even if it doesn't
   add value, you do not lose much.

   On top of this, the tool leverages instrumentation to actually reduce the
   amount of work in a couple of ways: for example, by carefully trimming the
   corpus or skipping non-functional but non-trimmable regions in the input
   files.

2) Rock-solid reliability. It's hard to compete with brute force if your
   approach is brittle and fails unexpectedly. Automated testing is attractive
   because it's simple to use and scalable; anything that goes against these
   principles is an unwelcome trade-off and means that your tool will be used
   less often and with less consistent results.

   Most of the approaches based on symbolic execution, taint tracking, or
   complex syntax-aware instrumentation are currently fairly unreliable with
   real-world targets. Perhaps more importantly, their failure modes can render
   them strictly worse than "dumb" tools, and such degradation can be difficult
   for less experienced users to notice and correct.

   In contrast, afl-fuzz is designed to be rock solid, chiefly by keeping it
   simple. In fact, at its core, it's designed to be just a very good
   traditional fuzzer with a wide range of interesting, well-researched
   strategies to go by. The fancy parts just help it focus the effort in
   places where it matters the most.

3) Simplicity. The author of a testing framework is probably the only person
   who truly understands the impact of all the settings offered by the tool -
   and who can dial them in just right. Yet, even the most rudimentary fuzzer
   frameworks often come with countless knobs and fuzzing ratios that need to
   be guessed by the operator ahead of time. This can do more harm than
   good.

   AFL is designed to avoid this as much as possible. The three knobs you
   can play with are the output file, the memory limit, and the ability to
   override the default, auto-calibrated timeout; an example invocation is
   shown after this list. The rest is just supposed to work. When it
   doesn't, user-friendly error messages outline the probable causes and
   workarounds, and get you back on track right away.

4) Chainability. Most general-purpose fuzzers can't be easily employed
   against resource-hungry or interaction-heavy tools, necessitating the
   creation of custom in-process fuzzers or the investment of massive CPU
   power (most of which is wasted on tasks not directly related to the code
   we actually want to test).

   AFL tries to scratch this itch by allowing users to use more lightweight
   targets (e.g., standalone image parsing libraries) to create small
   corpora of interesting test cases that can be fed into a manual testing
   process or a UI harness later on.

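To ground the simplicity point, here is what a typical session looks like;
the directory names and target binary are made up for the example, and -m /
-t are the memory-limit and timeout overrides mentioned in point 3:

  $ ./afl-fuzz -i testcases/ -o findings/ -m 100 -t 500 ./target_binary @@

(The @@ token marks where the name of the current input file is substituted
on the target's command line; everything else is calibrated automatically.)
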
As mentioned in technical_details.txt, AFL does all this not by systematically
applying a single overarching CS concept, but by experimenting with a variety
of small, complementary methods that were shown to reliably yield results
better than chance. The use of instrumentation is a part of that toolkit, but is
far from being the most important one.

Ultimately, what matters is that afl-fuzz is designed to find cool bugs - and
has a pretty robust track record of doing just that.