OLD | NEW |
1 =================================== | 1 =================================== |
2 Technical "whitepaper" for afl-fuzz | 2 Technical "whitepaper" for afl-fuzz |
3 =================================== | 3 =================================== |
4 | 4 |
5 This document provides a quick overview of the guts of American Fuzzy Lop. | 5 This document provides a quick overview of the guts of American Fuzzy Lop. |
6 See README for the general instruction manual; and for a discussion of | 6 See README for the general instruction manual; and for a discussion of |
7 motivations and design goals behind AFL, see historical_notes.txt. | 7 motivations and design goals behind AFL, see historical_notes.txt. |
8 | 8 |
9 0) Design statement | 9 0) Design statement |
10 ------------------- | 10 ------------------- |
(...skipping 261 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
272 impact on the execution path. | 272 impact on the execution path. |
273 | 273 |
274 The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data | 274 The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data |
275 with variable length and stepover; any deletion that doesn't affect the checksum | 275 with variable length and stepover; any deletion that doesn't affect the checksum |
276 of the trace map is committed to disk. The trimmer is not designed to be | 276 of the trace map is committed to disk. The trimmer is not designed to be |
277 particularly thorough; instead, it tries to strike a balance between precision | 277 particularly thorough; instead, it tries to strike a balance between precision |
278 and the number of execve() calls spent on the process, selecting the block size | 278 and the number of execve() calls spent on the process, selecting the block size |
279 and stepover to match. The average per-file gains are around 5-20%. | 279 and stepover to match. The average per-file gains are around 5-20%. |
280 | 280 |
281 The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and | 281 The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and |
282 also attempts to perform alphabet normalization on the trimmed files. | 282 also attempts to perform alphabet normalization on the trimmed files. The |
| 283 operation of afl-tmin is as follows. |
| 284 |
| 285 First, the tool automatically selects the operating mode. If the initial input |
| 286 crashes the target binary, afl-tmin will run in non-instrumented mode, simply |
| 287 keeping any tweaks that produce a simpler file but still crash the target. If |
| 288 the target is non-crashing, the tool uses an instrumented mode and keeps only |
| 289 the tweaks that produce exactly the same execution path. |
| 290 |
| 291 The actual minimization algorithm is: |
| 292 |
| 293 1) Attempt to zero large blocks of data with large stepovers. Empirically, |
| 294 this is shown to reduce the number of execs by preempting finer-grained |
| 295 efforts later on. |
| 296 |
| 297 2) Perform a block deletion pass with decreasing block sizes and stepovers, |
| 298 binary-search-style. |
| 299 |
| 300 3) Perform alphabet normalization by counting unique characters and trying |
| 301 to bulk-replace each with a zero value. |
| 302 |
| 303 4) As a last result, perform byte-by-byte normalization on non-zero bytes. |
| 304 |
| 305 Instead of zeroing with a 0x00 byte, afl-tmin uses the ASCII digit '0'. This |
| 306 is done because such a modification is much less likely to interfere with |
| 307 text parsing, so it is more likely to result in successful minimization of |
| 308 text files. |
| 309 |
| 310 The algorithm used here is less involved than some other test case |
| 311 minimization approaches proposed in academic work, but requires far fewer |
| 312 executions and tends to produce comparable results in most real-world |
| 313 applications. |
283 | 314 |
284 6) Fuzzing strategies | 315 6) Fuzzing strategies |
285 --------------------- | 316 --------------------- |
286 | 317 |
287 The feedback provided by the instrumentation makes it easy to understand the | 318 The feedback provided by the instrumentation makes it easy to understand the |
288 value of various fuzzing strategies and optimize their parameters so that they | 319 value of various fuzzing strategies and optimize their parameters so that they |
289 work equally well across a wide range of file types. The strategies used by | 320 work equally well across a wide range of file types. The strategies used by |
290 afl-fuzz are generally format-agnostic and are discussed in more detail here: | 321 afl-fuzz are generally format-agnostic and are discussed in more detail here: |
291 | 322 |
292 http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html | 323 http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html |
(...skipping 230 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... |
523 unlikely. This is suggestive of a checksum or other "magic" integer. | 554 unlikely. This is suggestive of a checksum or other "magic" integer. |
524 | 555 |
525 - "Suspected checksummed block" - a long block of data where any change | 556 - "Suspected checksummed block" - a long block of data where any change |
526 always triggers the same new execution path. Likely caused by failing | 557 always triggers the same new execution path. Likely caused by failing |
527 a checksum or a similar integrity check before any subsequent parsing | 558 a checksum or a similar integrity check before any subsequent parsing |
528 takes place. | 559 takes place. |
529 | 560 |
530 - "Magic value section" - a generic token where changes cause the type | 561 - "Magic value section" - a generic token where changes cause the type |
531 of binary behavior outlined earlier, but that doesn't meet any of the | 562 of binary behavior outlined earlier, but that doesn't meet any of the |
532 other criteria. May be an atomically compared keyword or so. | 563 other criteria. May be an atomically compared keyword or so. |
OLD | NEW |