OLD | NEW |
1 =================================== | 1 =================================== |
2 Technical "whitepaper" for afl-fuzz | 2 Technical "whitepaper" for afl-fuzz |
3 =================================== | 3 =================================== |
4 | 4 |
5 This document provides a quick overview of the guts of American Fuzzy Lop. | 5 This document provides a quick overview of the guts of American Fuzzy Lop. |
6 See README for the general instruction manual; and for a discussion of | 6 See README for the general instruction manual; and for a discussion of |
7 motivations and design goals behind AFL, see historical_notes.txt. | 7 motivations and design goals behind AFL, see historical_notes.txt. |
8 | 8 |
9 0) Design statement | 9 0) Design statement |
10 ------------------- | 10 ------------------- |
(...skipping 67 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
78 2) Detecting new behaviors | 78 2) Detecting new behaviors |
79 -------------------------- | 79 -------------------------- |
80 | 80 |
81 The fuzzer maintains a global map of tuples seen in previous executions; this | 81 The fuzzer maintains a global map of tuples seen in previous executions; this |
82 data can be rapidly compared with individual traces and updated in just a couple | 82 data can be rapidly compared with individual traces and updated in just a couple |
83 of dword- or qword-wide instructions and a simple loop. | 83 of dword- or qword-wide instructions and a simple loop. |
84 | 84 |
85 When a mutated input produces an execution trace containing new tuples, the | 85 When a mutated input produces an execution trace containing new tuples, the |
86 corresponding input file is preserved and routed for additional processing | 86 corresponding input file is preserved and routed for additional processing |
87 later on (see section #3). Inputs that do not trigger new local-scale state | 87 later on (see section #3). Inputs that do not trigger new local-scale state |
88 transitions in the execution trace are discarded, even if their overall | 88 transitions in the execution trace (i.e., produce no new tuples) are discarded, |
89 instrumentation output pattern is unique. | 89 even if their overall control flow sequence is unique. |
90 | 90 |
91 This approach allows for a very fine-grained and long-term exploration of | 91 This approach allows for a very fine-grained and long-term exploration of |
92 program state while not having to perform any computationally intensive and | 92 program state while not having to perform any computationally intensive and |
93 fragile global comparisons of complex execution traces, and while avoiding the | 93 fragile global comparisons of complex execution traces, and while avoiding the |
94 scourge of path explosion. | 94 scourge of path explosion. |
95 | 95 |
96 To illustrate the properties of the algorithm, consider that the second trace | 96 To illustrate the properties of the algorithm, consider that the second trace |
97 shown below would be considered substantially new because of the presence of | 97 shown below would be considered substantially new because of the presence of |
98 new tuples (CA, AE): | 98 new tuples (CA, AE): |
99 | 99 |
100 #1: A -> B -> C -> D -> E | 100 #1: A -> B -> C -> D -> E |
101 #2: A -> B -> C -> A -> E | 101 #2: A -> B -> C -> A -> E |
102 | 102 |
103 At the same time, with #2 processed, the following pattern will not be seen | 103 At the same time, with #2 processed, the following pattern will not be seen |
104 as unique, despite having a markedly different execution path: | 104 as unique, despite having a markedly different overall execution path: |
105 | 105 |
106 #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E | 106 #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E |
107 | 107 |
108 In addition to detecting new tuples, the fuzzer also considers coarse tuple | 108 In addition to detecting new tuples, the fuzzer also considers coarse tuple |
109 hit counts. These are divided into several buckets: | 109 hit counts. These are divided into several buckets: |
110 | 110 |
111 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+ | 111 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+ |
112 | 112 |
113 To some extent, the number of buckets is an implementation artifact: it allows | 113 To some extent, the number of buckets is an implementation artifact: it allows |
114 an in-place mapping of an 8-bit counter generated by the instrumentation to | 114 an in-place mapping of an 8-bit counter generated by the instrumentation to |
(...skipping 20 matching lines...) Expand all Loading... |
135 the same code. Empirical testing strongly suggests that more generous time | 135 the same code. Empirical testing strongly suggests that more generous time |
136 limits are not worth the cost. | 136 limits are not worth the cost. |
137 | 137 |
138 3) Evolving the input queue | 138 3) Evolving the input queue |
139 --------------------------- | 139 --------------------------- |
140 | 140 |
141 Mutated test cases that produced new state transitions within the program are | 141 Mutated test cases that produced new state transitions within the program are |
142 added to the input queue and used as a starting point for future rounds of | 142 added to the input queue and used as a starting point for future rounds of |
143 fuzzing. They supplement, but do not automatically replace, existing finds. | 143 fuzzing. They supplement, but do not automatically replace, existing finds. |
144 | 144 |
145 This approach allows the tool to progressively explore various disjoint and | 145 In contrast to more greedy genetic algorithms, this approach allows the tool |
146 possibly mutually incompatible features of the underlying data format, as | 146 to progressively explore various disjoint and possibly mutually incompatible |
147 shown in this image: | 147 features of the underlying data format, as shown in this image: |
148 | 148 |
149 http://lcamtuf.coredump.cx/afl/afl_gzip.png | 149 http://lcamtuf.coredump.cx/afl/afl_gzip.png |
150 | 150 |
151 Several practical examples of the results of this algorithm are discussed | 151 Several practical examples of the results of this algorithm are discussed |
152 here: | 152 here: |
153 | 153 |
154 http://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html | 154 http://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html |
155 http://lcamtuf.blogspot.com/2014/11/afl-fuzz-nobody-expects-cdata-sections.htm
l | 155 http://lcamtuf.blogspot.com/2014/11/afl-fuzz-nobody-expects-cdata-sections.htm
l |
156 | 156 |
157 The synthetic corpus produced by this process is essentially a compact | 157 The synthetic corpus produced by this process is essentially a compact |
(...skipping 36 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
194 Queue extension | Blocks | Edges | Edge hit | Number of unique | 194 Queue extension | Blocks | Edges | Edge hit | Number of unique |
195 strategy used | reached | reached | cnt var | crashes found | 195 strategy used | reached | reached | cnt var | crashes found |
196 ------------------+---------+---------+----------+------------------ | 196 ------------------+---------+---------+----------+------------------ |
197 (Initial file) | 624 | 717 | 1.00 | - | 197 (Initial file) | 624 | 717 | 1.00 | - |
198 | | | | | 198 | | | | |
199 Blind fuzzing | 1,101 | 1,409 | 1.60 | 0 | 199 Blind fuzzing | 1,101 | 1,409 | 1.60 | 0 |
200 Block coverage | 1,255 | 1,649 | 1.48 | 0 | 200 Block coverage | 1,255 | 1,649 | 1.48 | 0 |
201 Edge coverage | 1,259 | 1,734 | 1.72 | 0 | 201 Edge coverage | 1,259 | 1,734 | 1.72 | 0 |
202 AFL model | 1,452 | 2,040 | 3.16 | 1 | 202 AFL model | 1,452 | 2,040 | 3.16 | 1 |
203 | 203 |
204 Some of the earlier work on evolutionary fuzzing suggested maintaining just a | 204 At noted earlier on, some of the prior work on genetic fuzzing relied on |
205 single test case and selecting for mutations that improve coverage. At least | 205 maintaining a single test case and evolving it to maximize coverage. At least |
206 in the tests described above, this "greedy" method appeared to offer no | 206 in the tests described above, this "greedy" approach appears to confer no |
207 substantial benefits over blind fuzzing. | 207 substantial benefits over blind fuzzing strategies. |
208 | 208 |
209 4) Culling the corpus | 209 4) Culling the corpus |
210 --------------------- | 210 --------------------- |
211 | 211 |
212 The progressive state exploration approach outlined above means that some of | 212 The progressive state exploration approach outlined above means that some of |
213 the test cases synthesized later on in the game may have edge coverage that | 213 the test cases synthesized later on in the game may have edge coverage that |
214 is a strict superset of the coverage provided by their ancestors. | 214 is a strict superset of the coverage provided by their ancestors. |
215 | 215 |
216 To optimize the fuzzing effort, AFL periodically re-evaluates the queue using a | 216 To optimize the fuzzing effort, AFL periodically re-evaluates the queue using a |
217 fast algorithm that selects a smaller subset of test cases that still cover | 217 fast algorithm that selects a smaller subset of test cases that still cover |
(...skipping 38 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
256 external tools. | 256 external tools. |
257 | 257 |
258 5) Trimming input files | 258 5) Trimming input files |
259 ----------------------- | 259 ----------------------- |
260 | 260 |
261 File size has a dramatic impact on fuzzing performance, both because large | 261 File size has a dramatic impact on fuzzing performance, both because large |
262 files make the target binary slower, and because they reduce the likelihood | 262 files make the target binary slower, and because they reduce the likelihood |
263 that a mutation would touch important format control structures, rather than | 263 that a mutation would touch important format control structures, rather than |
264 redundant data blocks. This is discussed in more detail in perf_tips.txt. | 264 redundant data blocks. This is discussed in more detail in perf_tips.txt. |
265 | 265 |
266 The possibility of a bad starting corpus provided by the user aside, some | 266 The possibility that the user will provide a low-quality starting corpus aside, |
267 types of mutations can have the effect of iteratively increasing the size of | 267 some types of mutations can have the effect of iteratively increasing the size |
268 the generated files, so it is important to counter this trend. | 268 of the generated files, so it is important to counter this trend. |
269 | 269 |
270 Luckily, the instrumentation feedback provides a simple way to automatically | 270 Luckily, the instrumentation feedback provides a simple way to automatically |
271 trim down input files while ensuring that the changes made to the files have no | 271 trim down input files while ensuring that the changes made to the files have no |
272 impact on the execution path. | 272 impact on the execution path. |
273 | 273 |
274 The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data | 274 The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data |
275 with variable length and stepover; any deletion that doesn't affect the checksum | 275 with variable length and stepover; any deletion that doesn't affect the checksum |
276 of the trace map is committed to disk. The trimmer is not designed to be | 276 of the trace map is committed to disk. The trimmer is not designed to be |
277 particularly thorough; instead, it tries to strike a balance between precision | 277 particularly thorough; instead, it tries to strike a balance between precision |
278 and the number of execve() calls spent on the process. The average per-file | 278 and the number of execve() calls spent on the process, selecting the block size |
279 gains are around 5-20%. | 279 and stepover to match. The average per-file gains are around 5-20%. |
280 | 280 |
281 The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and | 281 The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and |
282 also attempts to perform alphabet normalization on the trimmed files. | 282 also attempts to perform alphabet normalization on the trimmed files. |
283 | 283 |
284 6) Fuzzing strategies | 284 6) Fuzzing strategies |
285 --------------------- | 285 --------------------- |
286 | 286 |
287 The feedback provided by the instrumentation makes it easy to understand the | 287 The feedback provided by the instrumentation makes it easy to understand the |
288 value of various fuzzing strategies and optimize their parameters so that they | 288 value of various fuzzing strategies and optimize their parameters so that they |
289 work equally well across a wide range of file types. The strategies used by | 289 work equally well across a wide range of file types. The strategies used by |
290 afl-fuzz are generally format-agnostic and are discussed in more detail here: | 290 afl-fuzz are generally format-agnostic and are discussed in more detail here: |
291 | 291 |
292 http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html | 292 http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html |
293 | 293 |
294 It is somewhat notable that especially early on, most of the work done by | 294 It is somewhat notable that especially early on, most of the work done by |
295 afl-fuzz is actually highly deterministic, and progresses to random stacked | 295 afl-fuzz is actually highly deterministic, and progresses to random stacked |
296 modifications and test case splicing only at a later stage. The deterministic | 296 modifications and test case splicing only at a later stage. The deterministic |
297 strategies include: | 297 strategies include: |
298 | 298 |
299 - Sequential bit flips with varying lengths and stepovers, | 299 - Sequential bit flips with varying lengths and stepovers, |
300 | 300 |
301 - Sequential addition and subtraction of small integers, | 301 - Sequential addition and subtraction of small integers, |
302 | 302 |
303 - Sequential insertion of known interesting integers (0, 1, INT_MAX, etc), | 303 - Sequential insertion of known interesting integers (0, 1, INT_MAX, etc), |
304 | 304 |
305 The non-deterministic steps include stacked bit flips, insertions, deletions, | 305 The purpose of opening with deterministic steps is related to their tendency to |
306 arithmetics, and splicing of different test cases. | 306 produce compact test cases and small diffs between the non-crashing and crashing |
| 307 inputs. |
307 | 308 |
308 Their relative yields and execve() costs have been investigated and are | 309 With deterministic fuzzing out of the way, the non-deterministic steps include |
309 discussed in the aforementioned blog post. | 310 stacked bit flips, insertions, deletions, arithmetics, and splicing of different |
| 311 test cases. |
| 312 |
| 313 The relative yields and execve() costs of all these strategies have been |
| 314 investigated and are discussed in the aforementioned blog post. |
310 | 315 |
311 For the reasons discussed in historical_notes.txt (chiefly, performance, | 316 For the reasons discussed in historical_notes.txt (chiefly, performance, |
312 simplicity, and reliability), AFL generally does not try to reason about the | 317 simplicity, and reliability), AFL generally does not try to reason about the |
313 relationship between specific mutations and program states; the fuzzing steps | 318 relationship between specific mutations and program states; the fuzzing steps |
314 are nominally blind, and are guided only by the evolutionary design of the | 319 are nominally blind, and are guided only by the evolutionary design of the |
315 input queue. | 320 input queue. |
316 | 321 |
317 That said, there is one (trivial) exception to this rule: when a new queue | 322 That said, there is one (trivial) exception to this rule: when a new queue |
318 entry goes through the initial set of deterministic fuzzing steps, and some | 323 entry goes through the initial set of deterministic fuzzing steps, and tweaks to |
319 regions in the file are observed to have no effect on the checksum of the | 324 some regions in the file are observed to have no effect on the checksum of the |
320 execution path, they may be excluded from the remaining phases of | 325 execution path, they may be excluded from the remaining phases of |
321 deterministic fuzzing - and proceed straight to random tweaks. Especially for | 326 deterministic fuzzing - and the fuzzer may proceed straight to random tweaks. |
322 verbose, human-readable data formats, this can reduce the number of execs by | 327 Especially for verbose, human-readable data formats, this can reduce the number |
323 10-40% or so without an appreciable drop in coverage. In extreme cases, such | 328 of execs by 10-40% or so without an appreciable drop in coverage. In extreme |
324 as normally block-aligned tar archives, the gains can be as high as 90%. | 329 cases, such as normally block-aligned tar archives, the gains can be as high as |
| 330 90%. |
325 | 331 |
326 Because the underlying "effector maps" are local every queue entry and remain | 332 Because the underlying "effector maps" are local every queue entry and remain |
327 in force only during deterministic stages that do not alter the size or the | 333 in force only during deterministic stages that do not alter the size or the |
328 general layout of the underlying file, this mechanism appears to work very | 334 general layout of the underlying file, this mechanism appears to work very |
329 reliably and proved to be simple to implement. | 335 reliably and proved to be simple to implement. |
330 | 336 |
331 7) Dictionaries | 337 7) Dictionaries |
332 --------------- | 338 --------------- |
333 | 339 |
334 The feedback provided by the instrumentation makes it easy to automatically | 340 The feedback provided by the instrumentation makes it easy to automatically |
(...skipping 11 matching lines...) Expand all Loading... |
346 design of the queue together provide a feedback mechanism to differentiate | 352 design of the queue together provide a feedback mechanism to differentiate |
347 between meaningless mutations and ones that trigger new behaviors in the | 353 between meaningless mutations and ones that trigger new behaviors in the |
348 instrumented code - and to incrementally build more complex syntax on top of | 354 instrumented code - and to incrementally build more complex syntax on top of |
349 this discovery. | 355 this discovery. |
350 | 356 |
351 The dictionaries have been shown to enable the fuzzer to rapidly reconstruct | 357 The dictionaries have been shown to enable the fuzzer to rapidly reconstruct |
352 the grammar of highly verbose and complex languages such as JavaScript, SQL, | 358 the grammar of highly verbose and complex languages such as JavaScript, SQL, |
353 or XML; several examples of generated SQL statements are given in the blog | 359 or XML; several examples of generated SQL statements are given in the blog |
354 post mentioned above. | 360 post mentioned above. |
355 | 361 |
| 362 Interestingly, the AFL instrumentation also allows the fuzzer to automatically |
| 363 isolate syntax tokens already present in an input file. It can do so by looking |
| 364 for run of bytes that, when flipped, produce a consistent change to the |
| 365 program's execution path; this is suggestive of an underlying atomic comparison |
| 366 to a predefined value baked into the code. The fuzzer relies on this signal |
| 367 to build compact "auto dictionaries" that are then used in conjunction with |
| 368 other fuzzing strategies. |
| 369 |
356 8) De-duping crashes | 370 8) De-duping crashes |
357 -------------------- | 371 -------------------- |
358 | 372 |
359 De-duplication of crashes is one of the more important problems for any | 373 De-duplication of crashes is one of the more important problems for any |
360 competent fuzzing tool. Many of the naive approaches run into problems; in | 374 competent fuzzing tool. Many of the naive approaches run into problems; in |
361 particular, looking just at the faulting address may lead to completely | 375 particular, looking just at the faulting address may lead to completely |
362 unrelated issues being clustered together if the fault happens in a common | 376 unrelated issues being clustered together if the fault happens in a common |
363 library function (say, strcmp, strcpy); while checksumming call stack | 377 library function (say, strcmp, strcpy); while checksumming call stack |
364 backtraces can lead to extreme crash count inflation if the fault can be | 378 backtraces can lead to extreme crash count inflation if the fault can be |
365 reached through a number of different, possibly recursive code paths. | 379 reached through a number of different, possibly recursive code paths. |
(...skipping 43 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
409 http://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html | 423 http://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html |
410 | 424 |
411 The fork server is an integral aspect of the injected instrumentation and | 425 The fork server is an integral aspect of the injected instrumentation and |
412 simply stops at the first instrumented function to await commands from | 426 simply stops at the first instrumented function to await commands from |
413 afl-fuzz. | 427 afl-fuzz. |
414 | 428 |
415 With fast targets, the fork server can offer considerable performance gains, | 429 With fast targets, the fork server can offer considerable performance gains, |
416 usually between 1.5x and 2x. It is also possible to: | 430 usually between 1.5x and 2x. It is also possible to: |
417 | 431 |
418 - Use the fork server in manual ("deferred") mode, skipping over larger, | 432 - Use the fork server in manual ("deferred") mode, skipping over larger, |
419 user-selected chunks of initialization code. With some targets, this can | 433 user-selected chunks of initialization code. It requires very modest |
| 434 code changes to the targeted program, and With some targets, can |
420 produce 10x+ performance gains. | 435 produce 10x+ performance gains. |
421 | 436 |
422 - Enable "persistent" mode, where a single process is used to try out | 437 - Enable "persistent" mode, where a single process is used to try out |
423 multiple inputs, greatly limiting the overhead of repetitive fork() | 438 multiple inputs, greatly limiting the overhead of repetitive fork() |
424 calls. As with the previous mode, this requires custom modifications, | 439 calls. This generally requires some code changes to the targeted program, |
425 but can improve the performance of fast targets by a factor of 5 or more | 440 but can improve the performance of fast targets by a factor of 5 or more |
426 - approximating the benefits of in-process fuzzing jobs. | 441 - approximating the benefits of in-process fuzzing jobs while still |
| 442 maintaining very robust isolation between the fuzzer process and the |
| 443 targeted binary. |
427 | 444 |
428 11) Parallelization | 445 11) Parallelization |
429 ------------------- | 446 ------------------- |
430 | 447 |
431 The parallelization mechanism relies on periodically examining the queues | 448 The parallelization mechanism relies on periodically examining the queues |
432 produced by independently-running instances on other CPU cores or on remote | 449 produced by independently-running instances on other CPU cores or on remote |
433 machines, and then selectively pulling in the test cases that produce behaviors | 450 machines, and then selectively pulling in the test cases that, when tried |
434 not yet seen by the fuzzer at hand. | 451 out locally, produce behaviors not yet seen by the fuzzer at hand. |
435 | 452 |
436 This allows for extreme flexibility in fuzzer setup, including running synced | 453 This allows for extreme flexibility in fuzzer setup, including running synced |
437 instances against different parsers of a common data format, often with | 454 instances against different parsers of a common data format, often with |
438 synergistic effects. | 455 synergistic effects. |
439 | 456 |
440 For more information about this design, see parallel_fuzzing.txt. | 457 For more information about this design, see parallel_fuzzing.txt. |
441 | 458 |
442 12) Binary-only instrumentation | 459 12) Binary-only instrumentation |
443 ------------------------------- | 460 ------------------------------- |
444 | 461 |
(...skipping 61 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
506 unlikely. This is suggestive of a checksum or other "magic" integer. | 523 unlikely. This is suggestive of a checksum or other "magic" integer. |
507 | 524 |
508 - "Suspected checksummed block" - a long block of data where any change | 525 - "Suspected checksummed block" - a long block of data where any change |
509 always triggers the same new execution path. Likely caused by failing | 526 always triggers the same new execution path. Likely caused by failing |
510 a checksum or a similar integrity check before any subsequent parsing | 527 a checksum or a similar integrity check before any subsequent parsing |
511 takes place. | 528 takes place. |
512 | 529 |
513 - "Magic value section" - a generic token where changes cause the type | 530 - "Magic value section" - a generic token where changes cause the type |
514 of binary behavior outlined earlier, but that doesn't meet any of the | 531 of binary behavior outlined earlier, but that doesn't meet any of the |
515 other criteria. May be an atomically compared keyword or so. | 532 other criteria. May be an atomically compared keyword or so. |
OLD | NEW |