| OLD | NEW |
| 1 # Efficient Fuzzer | 1 # Efficient Fuzzer |
| 2 | 2 |
| 3 This document describes ways to determine your fuzzer efficiency and ways | 3 This document describes ways to determine your fuzzer efficiency and ways |
| 4 to improve it. | 4 to improve it. |
| 5 | 5 |
| 6 ## Overview | 6 ## Overview |
| 7 | 7 |
| 8 Being a coverage-driven fuzzer, libFuzzer considers a certain input *interesting* | 8 Being a coverage-driven fuzzer, libFuzzer considers a certain input *interesting* |
| 9 if it results in new coverage. The set of all interesting inputs is called | 9 if it results in new coverage. The set of all interesting inputs is called |
| 10 *corpus*. | 10 *corpus*. |
| 11 Items in corpus are constantly mutated in search of new interesting input. | 11 Items in corpus are constantly mutated in search of new interesting input. |
| 12 Corpus is usually maintained between multiple fuzzer runs. | 12 Corpus is usually maintained between multiple fuzzer runs. |
| 13 | 13 |
| 14 There are several metrics you should look at to determine your fuzzer effectiveness: | 14 There are several metrics you should look at to determine your fuzzer effectiveness: |
| 15 | 15 |
| 16 * fuzzer speed (exec/s) | 16 * [fuzzer speed](#Fuzzer-Speed) (exec/s) |
| 17 * corpus size | 17 * [corpus size](#Corpus-Size) |
| 18 * coverage | 18 * [coverage](#Coverage) |
| 19 | 19 |
| 20 You can collect these metrics manually or take them from [ClusterFuzz status] | 20 You can collect these metrics manually or take them from [ClusterFuzz status] |
| 21 pages. | 21 pages. |
| 22 | 22 |
| 23 ## Fuzzer Speed | 23 ## Fuzzer Speed |
| 24 | 24 |
| 25 Fuzzer speed is printed while fuzzer runs: | 25 Fuzzer speed is printed while fuzzer runs: |
| 26 | 26 |
| 27 ``` | 27 ``` |
| 28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62 | 28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62 |
| 29 ``` | 29 ``` |
| 30 | 30 |
| 31 Because libFuzzer performs randomized search, it is critical to have it as fast | 31 Because libFuzzer performs randomized search, it is critical to have it as fast |
| 32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer | 32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer |
| 33 using any standard tool to see where it spends its time. | 33 using any standard tool to see where it spends its time. |
| 34 | 34 |
| 35 |
| 35 ### Initialization/Cleanup | 36 ### Initialization/Cleanup |
| 36 | 37 |
| 37 Try to keep your fuzzing function as simple as possible. Prefer to use static | 38 Try to keep your fuzzing function as simple as possible. Prefer to use static |
| 38 initialization and shared resources rather than bringing environment up and down | 39 initialization and shared resources rather than bringing environment up and down |
| 39 every single run. | 40 every single run. |
| 40 | 41 |
| 41 Fuzzers don't have to shutdown gracefully (we either kill them or they crash | 42 Fuzzers don't have to shutdown gracefully (we either kill them or they crash |
| 42 because sanitizer has found a problem). You can skip freeing static resource. | 43 because sanitizer has found a problem). You can skip freeing static resource. |
| 43 | 44 |
| 44 Of course all resources allocated withing `LLVMFuzzerTestOneInput` function | 45 Of course all resources allocated within `LLVMFuzzerTestOneInput` function |
| 45 should be deallocated since this function is called millions of times during | 46 should be deallocated since this function is called millions of times during |
| 46 one fuzzing session. | 47 one fuzzing session. |
| 47 | 48 |
| 48 | 49 |
| 50 ### Memory Usage |
| 51 |
| 52 Avoid allocation of dynamic memory wherever possible. Instrumentation works |
| 53 faster for stack-based and static objects than for heap allocated ones. |
| 54 |
| 55 It is always a good idea to play with different versions of a fuzzer to find the |
| 56 fastest implementation. |
| 57 |
| 58 |
| 59 ### Maximum Testcase Length |
| 60 |
| 61 Experiment with different values of `-max_len` parameter. This parameter often |
| 62 significantly affects execution speed, but not always. |
| 63 |
| 64 1) Define which `-max_len` value is reasonable for your target. For example, it |
| 65 may be useless to fuzz an image decoder with too small value of testcase length. |
| 66 |
| 67 2) Increase the value defined on previous step. Check its influence on execution |
| 68 speed of fuzzer. If speed doesn't drop significantly for long inputs, it is fine |
| 69 to have some bigger value for `-max_len`. |
| 70 |
| 71 In general, bigger `-max_len` value gives better coverage. Coverage is main |
| 72 priority for fuzzing. However, low execution speed may result in waste of |
| 73 resources used for fuzzing. If large inputs make fuzzer too slow you have to |
| 74 adjust value of `-max_len` and find a trade-off between coverage and execution |
| 75 speed. |
| 76 |
| 77 |
| 49 ## Corpus Size | 78 ## Corpus Size |
| 50 | 79 |
| 51 After running for a while the fuzzer would reach a plateau and won't discover | 80 After running for a while the fuzzer would reach a plateau and won't discover |
| 52 new interesting input. Corpus for a reasonably complex functionality | 81 new interesting input. Corpus for a reasonably complex functionality |
| 53 should contain hundreds (if not thousands) of items. | 82 should contain hundreds (if not thousands) of items. |
| 54 | 83 |
| 55 Too small corpus size indicates some code barrier that | 84 Too small corpus size indicates some code barrier that |
| 56 libFuzzer is having problems penetrating. Common cases include: checksums, | 85 libFuzzer is having problems penetrating. Common cases include: checksums, |
| 57 magic numbers etc. The easiest way to diagnose this problem is to generate a | 86 magic numbers etc. The easiest way to diagnose this problem is to generate a |
| 58 [coverage report](#Coverage). To fix the issue you can: | 87 [coverage report](#Coverage). To fix the issue you can: |
| 59 | 88 |
| 60 * change the code (e.g. disable crc checks while fuzzing) | 89 * change the code (e.g. disable crc checks while fuzzing) |
| 61 * prepare [corpus seed](#Corpus-Seed). | 90 * prepare [corpus seed](#Corpus-Seed) |
| 62 * prepare [fuzzer dictionary](#Fuzzer-Dictionary) | 91 * prepare [fuzzer dictionary](#Fuzzer-Dictionary) |
| 63 | 92 |
| 64 ## Coverage | 93 ## Coverage |
| 65 | 94 |
| 66 You can easily generate source-level coverage report for a given corpus: | 95 You can easily generate source-level coverage report for a given corpus: |
| 67 | 96 |
| 68 ``` | 97 ``` |
| 69 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \ | 98 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \ |
| 70 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus | 99 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus |
| 71 ``` | 100 ``` |
| 72 | 101 |
| 73 This will produce an .html file with colored source-code. It can be used to | 102 This will produce an .html file with colored source-code. It can be used to |
| 74 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding | 103 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding |
| 75 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`). | 104 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`). |
| 76 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment | 105 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment |
| 77 variable. | 106 variable. |
| 78 | 107 |
| 79 ## Corpus Seed | 108 ### Corpus Seed |
| 80 | 109 |
| 81 You can pass a corpus directory to a fuzzer that you run manually: | 110 You can pass a corpus directory to a fuzzer that you run manually: |
| 82 | 111 |
| 83 ``` | 112 ``` |
| 84 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus | 113 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus |
| 85 ``` | 114 ``` |
| 86 | 115 |
| 87 The directory can initially be empty. The fuzzer would store all the interesting | 116 The directory can initially be empty. The fuzzer would store all the interesting |
| 88 items it finds in the directory. You can help the fuzzer by "seeding" the corpus: | 117 items it finds in the directory. You can help the fuzzer by "seeding" the corpus: |
| 89 simply copy interesting inputs for your function to the corpus directory before | 118 simply copy interesting inputs for your function to the corpus directory before |
| 90 running. This works especially well for file-parsing functionality: just | 119 running. This works especially well for strictly defined file formats or data |
| 91 use some valid files from your test suite. | 120 transmission protocols. |
| 121 * For file-parsing functionality just use some valid files from your test suite. |
| 122 * For protocol processing targets put raw streams from test suite into separate |
| 123 files. |
| 92 | 124 |
| 93 After discovering new and interesting items, [upload corpus to ClusterFuzz]. | 125 After discovering new and interesting items, [upload corpus to ClusterFuzz]. |
| 94 | 126 |
| 95 ## Fuzzer Dictionary | 127 ### Fuzzer Dictionary |
| 96 | 128 |
| 97 It is very useful to provide fuzzer a set of common words/values that you expect | 129 It is very useful to provide fuzzer a set of common words/values that you expect |
| 98 to find in the input. This greatly improves efficiency of finding new units and | 130 to find in the input. This greatly improves efficiency of finding new units and |
| 99 works especially well while fuzzing file format decoders. | 131 works especially well while fuzzing file format decoders. |
| 100 | 132 |
| 101 To add a dictionary, first create a dictionary file. | 133 To add a dictionary, first create a dictionary file. |
| 102 Dictionary syntax is similar to that used by [AFL] for its -x option: | 134 Dictionary syntax is similar to that used by [AFL] for its -x option: |
| 103 | 135 |
| 104 ``` | 136 ``` |
| 105 # Lines starting with '#' and empty lines are ignored. | 137 # Lines starting with '#' and empty lines are ignored. |
| (...skipping 25 matching lines...) |
| 131 } | 163 } |
| 132 ``` | 164 ``` |
| 133 | 165 |
| 134 Make sure to submit dictionary file to git. The dictionary will be used | 166 Make sure to submit dictionary file to git. The dictionary will be used |
| 135 automatically by ClusterFuzz once it picks up new fuzzer version (once a day). | 167 automatically by ClusterFuzz once it picks up new fuzzer version (once a day). |
| 136 | 168 |
| 137 | 169 |
| 138 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links | 170 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links |
| 139 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus | 171 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus |
| 140 [AFL]: http://lcamtuf.coredump.cx/afl/ | 172 [AFL]: http://lcamtuf.coredump.cx/afl/ |
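
The Initialization/Cleanup and Memory Usage advice added in the NEW column could be sketched roughly as follows. This is a minimal illustration, not code from the patch: `ParserEnvironment` and its `Parse` method are hypothetical stand-ins for the real code under test, and the 4096-byte cap is an arbitrary value chosen for the example. Only `LLVMFuzzerTestOneInput` is the actual libFuzzer entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the code under test. Imagine the constructor
// performs expensive setup that should not be repeated on every run.
struct ParserEnvironment {
  ParserEnvironment() {}

  // Toy parser: accepts inputs that start with the magic bytes "FUZZ".
  bool Parse(const uint8_t* data, size_t size) const {
    return size >= 4 && std::memcmp(data, "FUZZ", 4) == 0;
  }
};

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Function-local static: constructed once on the first call and shared by
  // all later runs. No explicit teardown is written for it.
  static ParserEnvironment env;

  // Fixed-size stack buffer instead of a per-run heap allocation.
  // The cap is an assumption made for this sketch.
  constexpr size_t kMaxLen = 4096;
  uint8_t buffer[kMaxLen];
  const size_t len = size < kMaxLen ? size : kMaxLen;
  assert(len <= sizeof(buffer));
  if (len > 0) {
    std::memcpy(buffer, data, len);
  }

  env.Parse(buffer, len);
  return 0;  // Nothing was allocated inside this call, so nothing to free.
}
```

Whether the stack buffer is actually faster than a heap allocation for a given target is worth measuring by comparing exec/s between variants, as the Memory Usage section suggests.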