| OLD | NEW |
| 1 =================================== | 1 =================================== |
| 2 Technical "whitepaper" for afl-fuzz | 2 Technical "whitepaper" for afl-fuzz |
| 3 =================================== | 3 =================================== |
| 4 | 4 |
| 5 This document provides a quick overview of the guts of American Fuzzy Lop. | 5 This document provides a quick overview of the guts of American Fuzzy Lop. |
| 6 See README for the general instruction manual; and for a discussion of | 6 See README for the general instruction manual; and for a discussion of |
| 7 motivations and design goals behind AFL, see historical_notes.txt. | 7 motivations and design goals behind AFL, see historical_notes.txt. |
| 8 | 8 |
| 9 0) Design statement | 9 0) Design statement |
| 10 ------------------- | 10 ------------------- |
| (...skipping 67 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
| 78 2) Detecting new behaviors | 78 2) Detecting new behaviors |
| 79 -------------------------- | 79 -------------------------- |
| 80 | 80 |
| 81 The fuzzer maintains a global map of tuples seen in previous executions; this | 81 The fuzzer maintains a global map of tuples seen in previous executions; this |
| 82 data can be rapidly compared with individual traces and updated in just a couple | 82 data can be rapidly compared with individual traces and updated in just a couple |
| 83 of dword- or qword-wide instructions and a simple loop. | 83 of dword- or qword-wide instructions and a simple loop. |
| 84 | 84 |
| 85 When a mutated input produces an execution trace containing new tuples, the | 85 When a mutated input produces an execution trace containing new tuples, the |
| 86 corresponding input file is preserved and routed for additional processing | 86 corresponding input file is preserved and routed for additional processing |
| 87 later on (see section #3). Inputs that do not trigger new local-scale state | 87 later on (see section #3). Inputs that do not trigger new local-scale state |
| 88 transitions in the execution trace are discarded, even if their overall | 88 transitions in the execution trace (i.e., produce no new tuples) are discarded, |
| 89 instrumentation output pattern is unique. | 89 even if their overall control flow sequence is unique. |
| 90 | 90 |
| 91 This approach allows for a very fine-grained and long-term exploration of | 91 This approach allows for a very fine-grained and long-term exploration of |
| 92 program state while not having to perform any computationally intensive and | 92 program state while not having to perform any computationally intensive and |
| 93 fragile global comparisons of complex execution traces, and while avoiding the | 93 fragile global comparisons of complex execution traces, and while avoiding the |
| 94 scourge of path explosion. | 94 scourge of path explosion. |
| 95 | 95 |
| 96 To illustrate the properties of the algorithm, consider that the second trace | 96 To illustrate the properties of the algorithm, consider that the second trace |
| 97 shown below would be considered substantially new because of the presence of | 97 shown below would be considered substantially new because of the presence of |
| 98 new tuples (CA, AE): | 98 new tuples (CA, AE): |
| 99 | 99 |
| 100 #1: A -> B -> C -> D -> E | 100 #1: A -> B -> C -> D -> E |
| 101 #2: A -> B -> C -> A -> E | 101 #2: A -> B -> C -> A -> E |
| 102 | 102 |
| 103 At the same time, with #2 processed, the following pattern will not be seen | 103 At the same time, with #2 processed, the following pattern will not be seen |
| 104 as unique, despite having a markedly different execution path: | 104 as unique, despite having a markedly different overall execution path: |
| 105 | 105 |
| 106 #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E | 106 #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E |
| 107 | 107 |
| 108 In addition to detecting new tuples, the fuzzer also considers coarse tuple | 108 In addition to detecting new tuples, the fuzzer also considers coarse tuple |
| 109 hit counts. These are divided into several buckets: | 109 hit counts. These are divided into several buckets: |
| 110 | 110 |
| 111 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+ | 111 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+ |
| 112 | 112 |
| 113 To some extent, the number of buckets is an implementation artifact: it allows | 113 To some extent, the number of buckets is an implementation artifact: it allows |
| 114 an in-place mapping of an 8-bit counter generated by the instrumentation to | 114 an in-place mapping of an 8-bit counter generated by the instrumentation to |
| (...skipping 20 matching lines...) Expand all Loading... |
| 135 the same code. Empirical testing strongly suggests that more generous time | 135 the same code. Empirical testing strongly suggests that more generous time |
| 136 limits are not worth the cost. | 136 limits are not worth the cost. |
| 137 | 137 |
| 138 3) Evolving the input queue | 138 3) Evolving the input queue |
| 139 --------------------------- | 139 --------------------------- |
| 140 | 140 |
| 141 Mutated test cases that produced new state transitions within the program are | 141 Mutated test cases that produced new state transitions within the program are |
| 142 added to the input queue and used as a starting point for future rounds of | 142 added to the input queue and used as a starting point for future rounds of |
| 143 fuzzing. They supplement, but do not automatically replace, existing finds. | 143 fuzzing. They supplement, but do not automatically replace, existing finds. |
| 144 | 144 |
| 145 This approach allows the tool to progressively explore various disjoint and | 145 In contrast to more greedy genetic algorithms, this approach allows the tool |
| 146 possibly mutually incompatible features of the underlying data format, as | 146 to progressively explore various disjoint and possibly mutually incompatible |
| 147 shown in this image: | 147 features of the underlying data format, as shown in this image: |
| 148 | 148 |
| 149 http://lcamtuf.coredump.cx/afl/afl_gzip.png | 149 http://lcamtuf.coredump.cx/afl/afl_gzip.png |
| 150 | 150 |
| 151 Several practical examples of the results of this algorithm are discussed | 151 Several practical examples of the results of this algorithm are discussed |
| 152 here: | 152 here: |
| 153 | 153 |
| 154 http://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html | 154 http://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html |
| 155 http://lcamtuf.blogspot.com/2014/11/afl-fuzz-nobody-expects-cdata-sections.htm
l | 155 http://lcamtuf.blogspot.com/2014/11/afl-fuzz-nobody-expects-cdata-sections.htm
l |
| 156 | 156 |
| 157 The synthetic corpus produced by this process is essentially a compact | 157 The synthetic corpus produced by this process is essentially a compact |
| (...skipping 36 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
| 194 Queue extension | Blocks | Edges | Edge hit | Number of unique | 194 Queue extension | Blocks | Edges | Edge hit | Number of unique |
| 195 strategy used | reached | reached | cnt var | crashes found | 195 strategy used | reached | reached | cnt var | crashes found |
| 196 ------------------+---------+---------+----------+------------------ | 196 ------------------+---------+---------+----------+------------------ |
| 197 (Initial file) | 624 | 717 | 1.00 | - | 197 (Initial file) | 624 | 717 | 1.00 | - |
| 198 | | | | | 198 | | | | |
| 199 Blind fuzzing | 1,101 | 1,409 | 1.60 | 0 | 199 Blind fuzzing | 1,101 | 1,409 | 1.60 | 0 |
| 200 Block coverage | 1,255 | 1,649 | 1.48 | 0 | 200 Block coverage | 1,255 | 1,649 | 1.48 | 0 |
| 201 Edge coverage | 1,259 | 1,734 | 1.72 | 0 | 201 Edge coverage | 1,259 | 1,734 | 1.72 | 0 |
| 202 AFL model | 1,452 | 2,040 | 3.16 | 1 | 202 AFL model | 1,452 | 2,040 | 3.16 | 1 |
| 203 | 203 |
| 204 Some of the earlier work on evolutionary fuzzing suggested maintaining just a | 204 At noted earlier on, some of the prior work on genetic fuzzing relied on |
| 205 single test case and selecting for mutations that improve coverage. At least | 205 maintaining a single test case and evolving it to maximize coverage. At least |
| 206 in the tests described above, this "greedy" method appeared to offer no | 206 in the tests described above, this "greedy" approach appears to confer no |
| 207 substantial benefits over blind fuzzing. | 207 substantial benefits over blind fuzzing strategies. |
| 208 | 208 |
| 209 4) Culling the corpus | 209 4) Culling the corpus |
| 210 --------------------- | 210 --------------------- |
| 211 | 211 |
| 212 The progressive state exploration approach outlined above means that some of | 212 The progressive state exploration approach outlined above means that some of |
| 213 the test cases synthesized later on in the game may have edge coverage that | 213 the test cases synthesized later on in the game may have edge coverage that |
| 214 is a strict superset of the coverage provided by their ancestors. | 214 is a strict superset of the coverage provided by their ancestors. |
| 215 | 215 |
| 216 To optimize the fuzzing effort, AFL periodically re-evaluates the queue using a | 216 To optimize the fuzzing effort, AFL periodically re-evaluates the queue using a |
| 217 fast algorithm that selects a smaller subset of test cases that still cover | 217 fast algorithm that selects a smaller subset of test cases that still cover |
| (...skipping 38 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
| 256 external tools. | 256 external tools. |
| 257 | 257 |
| 258 5) Trimming input files | 258 5) Trimming input files |
| 259 ----------------------- | 259 ----------------------- |
| 260 | 260 |
| 261 File size has a dramatic impact on fuzzing performance, both because large | 261 File size has a dramatic impact on fuzzing performance, both because large |
| 262 files make the target binary slower, and because they reduce the likelihood | 262 files make the target binary slower, and because they reduce the likelihood |
| 263 that a mutation would touch important format control structures, rather than | 263 that a mutation would touch important format control structures, rather than |
| 264 redundant data blocks. This is discussed in more detail in perf_tips.txt. | 264 redundant data blocks. This is discussed in more detail in perf_tips.txt. |
| 265 | 265 |
| 266 The possibility of a bad starting corpus provided by the user aside, some | 266 The possibility that the user will provide a low-quality starting corpus aside, |
| 267 types of mutations can have the effect of iteratively increasing the size of | 267 some types of mutations can have the effect of iteratively increasing the size |
| 268 the generated files, so it is important to counter this trend. | 268 of the generated files, so it is important to counter this trend. |
| 269 | 269 |
| 270 Luckily, the instrumentation feedback provides a simple way to automatically | 270 Luckily, the instrumentation feedback provides a simple way to automatically |
| 271 trim down input files while ensuring that the changes made to the files have no | 271 trim down input files while ensuring that the changes made to the files have no |
| 272 impact on the execution path. | 272 impact on the execution path. |
| 273 | 273 |
| 274 The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data | 274 The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data |
| 275 with variable length and stepover; any deletion that doesn't affect the checksum | 275 with variable length and stepover; any deletion that doesn't affect the checksum |
| 276 of the trace map is committed to disk. The trimmer is not designed to be | 276 of the trace map is committed to disk. The trimmer is not designed to be |
| 277 particularly thorough; instead, it tries to strike a balance between precision | 277 particularly thorough; instead, it tries to strike a balance between precision |
| 278 and the number of execve() calls spent on the process. The average per-file | 278 and the number of execve() calls spent on the process, selecting the block size |
| 279 gains are around 5-20%. | 279 and stepover to match. The average per-file gains are around 5-20%. |
| 280 | 280 |
| 281 The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and | 281 The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and |
| 282 also attempts to perform alphabet normalization on the trimmed files. | 282 also attempts to perform alphabet normalization on the trimmed files. |
| 283 | 283 |
| 284 6) Fuzzing strategies | 284 6) Fuzzing strategies |
| 285 --------------------- | 285 --------------------- |
| 286 | 286 |
| 287 The feedback provided by the instrumentation makes it easy to understand the | 287 The feedback provided by the instrumentation makes it easy to understand the |
| 288 value of various fuzzing strategies and optimize their parameters so that they | 288 value of various fuzzing strategies and optimize their parameters so that they |
| 289 work equally well across a wide range of file types. The strategies used by | 289 work equally well across a wide range of file types. The strategies used by |
| 290 afl-fuzz are generally format-agnostic and are discussed in more detail here: | 290 afl-fuzz are generally format-agnostic and are discussed in more detail here: |
| 291 | 291 |
| 292 http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html | 292 http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html |
| 293 | 293 |
| 294 It is somewhat notable that especially early on, most of the work done by | 294 It is somewhat notable that especially early on, most of the work done by |
| 295 afl-fuzz is actually highly deterministic, and progresses to random stacked | 295 afl-fuzz is actually highly deterministic, and progresses to random stacked |
| 296 modifications and test case splicing only at a later stage. The deterministic | 296 modifications and test case splicing only at a later stage. The deterministic |
| 297 strategies include: | 297 strategies include: |
| 298 | 298 |
| 299 - Sequential bit flips with varying lengths and stepovers, | 299 - Sequential bit flips with varying lengths and stepovers, |
| 300 | 300 |
| 301 - Sequential addition and subtraction of small integers, | 301 - Sequential addition and subtraction of small integers, |
| 302 | 302 |
| 303 - Sequential insertion of known interesting integers (0, 1, INT_MAX, etc), | 303 - Sequential insertion of known interesting integers (0, 1, INT_MAX, etc), |
| 304 | 304 |
| 305 The non-deterministic steps include stacked bit flips, insertions, deletions, | 305 The purpose of opening with deterministic steps is related to their tendency to |
| 306 arithmetics, and splicing of different test cases. | 306 produce compact test cases and small diffs between the non-crashing and crashing |
| 307 inputs. |
| 307 | 308 |
| 308 Their relative yields and execve() costs have been investigated and are | 309 With deterministic fuzzing out of the way, the non-deterministic steps include |
| 309 discussed in the aforementioned blog post. | 310 stacked bit flips, insertions, deletions, arithmetics, and splicing of different |
| 311 test cases. |
| 312 |
| 313 The relative yields and execve() costs of all these strategies have been |
| 314 investigated and are discussed in the aforementioned blog post. |
| 310 | 315 |
| 311 For the reasons discussed in historical_notes.txt (chiefly, performance, | 316 For the reasons discussed in historical_notes.txt (chiefly, performance, |
| 312 simplicity, and reliability), AFL generally does not try to reason about the | 317 simplicity, and reliability), AFL generally does not try to reason about the |
| 313 relationship between specific mutations and program states; the fuzzing steps | 318 relationship between specific mutations and program states; the fuzzing steps |
| 314 are nominally blind, and are guided only by the evolutionary design of the | 319 are nominally blind, and are guided only by the evolutionary design of the |
| 315 input queue. | 320 input queue. |
| 316 | 321 |
| 317 That said, there is one (trivial) exception to this rule: when a new queue | 322 That said, there is one (trivial) exception to this rule: when a new queue |
| 318 entry goes through the initial set of deterministic fuzzing steps, and some | 323 entry goes through the initial set of deterministic fuzzing steps, and tweaks to |
| 319 regions in the file are observed to have no effect on the checksum of the | 324 some regions in the file are observed to have no effect on the checksum of the |
| 320 execution path, they may be excluded from the remaining phases of | 325 execution path, they may be excluded from the remaining phases of |
| 321 deterministic fuzzing - and proceed straight to random tweaks. Especially for | 326 deterministic fuzzing - and the fuzzer may proceed straight to random tweaks. |
| 322 verbose, human-readable data formats, this can reduce the number of execs by | 327 Especially for verbose, human-readable data formats, this can reduce the number |
| 323 10-40% or so without an appreciable drop in coverage. In extreme cases, such | 328 of execs by 10-40% or so without an appreciable drop in coverage. In extreme |
| 324 as normally block-aligned tar archives, the gains can be as high as 90%. | 329 cases, such as normally block-aligned tar archives, the gains can be as high as |
| 330 90%. |
| 325 | 331 |
| 326 Because the underlying "effector maps" are local every queue entry and remain | 332 Because the underlying "effector maps" are local every queue entry and remain |
| 327 in force only during deterministic stages that do not alter the size or the | 333 in force only during deterministic stages that do not alter the size or the |
| 328 general layout of the underlying file, this mechanism appears to work very | 334 general layout of the underlying file, this mechanism appears to work very |
| 329 reliably and proved to be simple to implement. | 335 reliably and proved to be simple to implement. |
| 330 | 336 |
| 331 7) Dictionaries | 337 7) Dictionaries |
| 332 --------------- | 338 --------------- |
| 333 | 339 |
| 334 The feedback provided by the instrumentation makes it easy to automatically | 340 The feedback provided by the instrumentation makes it easy to automatically |
| (...skipping 11 matching lines...) Expand all Loading... |
| 346 design of the queue together provide a feedback mechanism to differentiate | 352 design of the queue together provide a feedback mechanism to differentiate |
| 347 between meaningless mutations and ones that trigger new behaviors in the | 353 between meaningless mutations and ones that trigger new behaviors in the |
| 348 instrumented code - and to incrementally build more complex syntax on top of | 354 instrumented code - and to incrementally build more complex syntax on top of |
| 349 this discovery. | 355 this discovery. |
| 350 | 356 |
| 351 The dictionaries have been shown to enable the fuzzer to rapidly reconstruct | 357 The dictionaries have been shown to enable the fuzzer to rapidly reconstruct |
| 352 the grammar of highly verbose and complex languages such as JavaScript, SQL, | 358 the grammar of highly verbose and complex languages such as JavaScript, SQL, |
| 353 or XML; several examples of generated SQL statements are given in the blog | 359 or XML; several examples of generated SQL statements are given in the blog |
| 354 post mentioned above. | 360 post mentioned above. |
| 355 | 361 |
| 362 Interestingly, the AFL instrumentation also allows the fuzzer to automatically |
| 363 isolate syntax tokens already present in an input file. It can do so by looking |
| 364 for run of bytes that, when flipped, produce a consistent change to the |
| 365 program's execution path; this is suggestive of an underlying atomic comparison |
| 366 to a predefined value baked into the code. The fuzzer relies on this signal |
| 367 to build compact "auto dictionaries" that are then used in conjunction with |
| 368 other fuzzing strategies. |
| 369 |
| 356 8) De-duping crashes | 370 8) De-duping crashes |
| 357 -------------------- | 371 -------------------- |
| 358 | 372 |
| 359 De-duplication of crashes is one of the more important problems for any | 373 De-duplication of crashes is one of the more important problems for any |
| 360 competent fuzzing tool. Many of the naive approaches run into problems; in | 374 competent fuzzing tool. Many of the naive approaches run into problems; in |
| 361 particular, looking just at the faulting address may lead to completely | 375 particular, looking just at the faulting address may lead to completely |
| 362 unrelated issues being clustered together if the fault happens in a common | 376 unrelated issues being clustered together if the fault happens in a common |
| 363 library function (say, strcmp, strcpy); while checksumming call stack | 377 library function (say, strcmp, strcpy); while checksumming call stack |
| 364 backtraces can lead to extreme crash count inflation if the fault can be | 378 backtraces can lead to extreme crash count inflation if the fault can be |
| 365 reached through a number of different, possibly recursive code paths. | 379 reached through a number of different, possibly recursive code paths. |
| (...skipping 43 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
| 409 http://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html | 423 http://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html |
| 410 | 424 |
| 411 The fork server is an integral aspect of the injected instrumentation and | 425 The fork server is an integral aspect of the injected instrumentation and |
| 412 simply stops at the first instrumented function to await commands from | 426 simply stops at the first instrumented function to await commands from |
| 413 afl-fuzz. | 427 afl-fuzz. |
| 414 | 428 |
| 415 With fast targets, the fork server can offer considerable performance gains, | 429 With fast targets, the fork server can offer considerable performance gains, |
| 416 usually between 1.5x and 2x. It is also possible to: | 430 usually between 1.5x and 2x. It is also possible to: |
| 417 | 431 |
| 418 - Use the fork server in manual ("deferred") mode, skipping over larger, | 432 - Use the fork server in manual ("deferred") mode, skipping over larger, |
| 419 user-selected chunks of initialization code. With some targets, this can | 433 user-selected chunks of initialization code. It requires very modest |
| 434 code changes to the targeted program, and With some targets, can |
| 420 produce 10x+ performance gains. | 435 produce 10x+ performance gains. |
| 421 | 436 |
| 422 - Enable "persistent" mode, where a single process is used to try out | 437 - Enable "persistent" mode, where a single process is used to try out |
| 423 multiple inputs, greatly limiting the overhead of repetitive fork() | 438 multiple inputs, greatly limiting the overhead of repetitive fork() |
| 424 calls. As with the previous mode, this requires custom modifications, | 439 calls. This generally requires some code changes to the targeted program, |
| 425 but can improve the performance of fast targets by a factor of 5 or more | 440 but can improve the performance of fast targets by a factor of 5 or more |
| 426 - approximating the benefits of in-process fuzzing jobs. | 441 - approximating the benefits of in-process fuzzing jobs while still |
| 442 maintaining very robust isolation between the fuzzer process and the |
| 443 targeted binary. |
| 427 | 444 |
| 428 11) Parallelization | 445 11) Parallelization |
| 429 ------------------- | 446 ------------------- |
| 430 | 447 |
| 431 The parallelization mechanism relies on periodically examining the queues | 448 The parallelization mechanism relies on periodically examining the queues |
| 432 produced by independently-running instances on other CPU cores or on remote | 449 produced by independently-running instances on other CPU cores or on remote |
| 433 machines, and then selectively pulling in the test cases that produce behaviors | 450 machines, and then selectively pulling in the test cases that, when tried |
| 434 not yet seen by the fuzzer at hand. | 451 out locally, produce behaviors not yet seen by the fuzzer at hand. |
| 435 | 452 |
| 436 This allows for extreme flexibility in fuzzer setup, including running synced | 453 This allows for extreme flexibility in fuzzer setup, including running synced |
| 437 instances against different parsers of a common data format, often with | 454 instances against different parsers of a common data format, often with |
| 438 synergistic effects. | 455 synergistic effects. |
| 439 | 456 |
| 440 For more information about this design, see parallel_fuzzing.txt. | 457 For more information about this design, see parallel_fuzzing.txt. |
| 441 | 458 |
| 442 12) Binary-only instrumentation | 459 12) Binary-only instrumentation |
| 443 ------------------------------- | 460 ------------------------------- |
| 444 | 461 |
| (...skipping 61 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
| 506 unlikely. This is suggestive of a checksum or other "magic" integer. | 523 unlikely. This is suggestive of a checksum or other "magic" integer. |
| 507 | 524 |
| 508 - "Suspected checksummed block" - a long block of data where any change | 525 - "Suspected checksummed block" - a long block of data where any change |
| 509 always triggers the same new execution path. Likely caused by failing | 526 always triggers the same new execution path. Likely caused by failing |
| 510 a checksum or a similar integrity check before any subsequent parsing | 527 a checksum or a similar integrity check before any subsequent parsing |
| 511 takes place. | 528 takes place. |
| 512 | 529 |
| 513 - "Magic value section" - a generic token where changes cause the type | 530 - "Magic value section" - a generic token where changes cause the type |
| 514 of binary behavior outlined earlier, but that doesn't meet any of the | 531 of binary behavior outlined earlier, but that doesn't meet any of the |
| 515 other criteria. May be an atomically compared keyword or so. | 532 other criteria. May be an atomically compared keyword or so. |
| OLD | NEW |