Chromium Code Reviews

| OLD | NEW |
|---|---|
| 1 # Efficient Fuzzer | 1 # Efficient Fuzzer |
| 2 | 2 |
| 3 This document describes ways to determine your fuzzer efficiency and ways | 3 This document describes ways to determine your fuzzer efficiency and ways |
| 4 to improve it. | 4 to improve it. |
| 5 | 5 |
| 6 ## Overview | 6 ## Overview |
| 7 | 7 |
| 8 Being a coverage-driven fuzzer, libFuzzer considers a certain input *interesting* | 8 Being a coverage-driven fuzzer, libFuzzer considers a certain input *interesting* |
| 9 if it results in new coverage. The set of all interesting inputs is called | 9 if it results in new coverage. The set of all interesting inputs is called |
| 10 *corpus*. | 10 *corpus*. |
| (...skipping 14 matching lines...) | |
| 25 Fuzzer speed is printed while fuzzer runs: | 25 Fuzzer speed is printed while fuzzer runs: |
| 26 | 26 |
| 27 ``` | 27 ``` |
| 28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62 | 28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62 |
| 29 ``` | 29 ``` |
| 30 | 30 |
| 31 Because libFuzzer performs randomized search, it is critical to have it as fast | 31 Because libFuzzer performs randomized search, it is critical to have it as fast |
| 32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer | 32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer |
| 33 using any standard tool to see where it spends its time. | 33 using any standard tool to see where it spends its time. |
| 34 | 34 |
| 35 Avoid allocation of dynamic memory whereever is possible. Instrumentation works | |

Oliver Chang (2016/04/06 17:13:44):
nit: s/whereever/wherever/
nit: no need for the "i

mmoroz (2016/04/06 17:19:20):
Done.

| 36 faster for stack-based and static objects than for heap allocated ones. | |
| 37 | |
| 38 Experiment with different values of `-max_len` parameter. This parameter often | |
| 39 significantly affects execution speed, but not always. | |
| 40 | |
| 41 1) Define which `-max_len` value is reasonable for your target. For example, it | |
| 42 may be useless to fuzz an image decoder with too small value for `-max_len`. | |
| 43 2) Increase the value defined on previous step. Check its influence on execution | |
| 44 speed of fuzzer. If speed doesn't drop significantly for long inputs, it is fine | |
| 45 to have some bigger value for `-max_len`. | |
| 46 | |
| 47 In general, bigger `-max_len` value gives better coverage. Coverage is main | |
| 48 priority for fuzzing. However, low execution speed may result in waste of | |
| 49 resources used for fuzzing. If large inputs make fuzzer too slow you have to | |
| 50 adjust value of `-max_len` and find a trade-off between coverage and execution | |
| 51 speed. | |
| 52 | |
| 35 ### Initialization/Cleanup | 53 ### Initialization/Cleanup |
| 36 | 54 |
| 37 Try to keep your fuzzing function as simple as possible. Prefer to use static | 55 Try to keep your fuzzing function as simple as possible. Prefer to use static |
| 38 initialization and shared resources rather than bringing environment up and down | 56 initialization and shared resources rather than bringing environment up and down |
| 39 every single run. | 57 every single run. |
| 40 | 58 |
| 41 Fuzzers don't have to shutdown gracefully (we either kill them or they crash | 59 Fuzzers don't have to shutdown gracefully (we either kill them or they crash |
| 42 because sanitizer has found a problem). You can skip freeing static resource. | 60 because sanitizer has found a problem). You can skip freeing static resource. |
| 43 | 61 |
| 44 Of course all resources allocated withing `LLVMFuzzerTestOneInput` function | 62 Of course all resources allocated within `LLVMFuzzerTestOneInput` function |
| 45 should be deallocated since this function is called millions of times during | 63 should be deallocated since this function is called millions of times during |
| 46 one fuzzing session. | 64 one fuzzing session. |
| 47 | 65 |
| 48 | 66 |
| 49 ## Corpus Size | 67 ## Corpus Size |
| 50 | 68 |
| 51 After running for a while the fuzzer would reach a plateau and won't discover | 69 After running for a while the fuzzer would reach a plateau and won't discover |
| 52 new interesting input. Corpus for a reasonably complex functionality | 70 new interesting input. Corpus for a reasonably complex functionality |
| 53 should contain hundreds (if not thousands) of items. | 71 should contain hundreds (if not thousands) of items. |
| 54 | 72 |
| 55 Too small corpus size indicates some code barrier that | 73 Too small corpus size indicates some code barrier that |
| 56 libFuzzer is having problems penetrating. Common cases include: checksums, | 74 libFuzzer is having problems penetrating. Common cases include: checksums, |
| 57 magic numbers etc. The easiest way to diagnose this problem is to generate a | 75 magic numbers etc. The easiest way to diagnose this problem is to generate a |
| 58 [coverage report](#Coverage). To fix the issue you can: | 76 [coverage report](#Coverage). To fix the issue you can: |
| 59 | 77 |
| 60 * change the code (e.g. disable crc checks while fuzzing) | 78 * change the code (e.g. disable crc checks while fuzzing) |
| 61 * prepare [corpus seed](#Corpus-Seed). | 79 * prepare [corpus seed](#Corpus-Seed) |
| 62 * prepare [fuzzer dictionary](#Fuzzer-Dictionary) | 80 * prepare [fuzzer dictionary](#Fuzzer-Dictionary) |
| 63 | 81 |
| 64 ## Coverage | 82 ## Coverage |
| 65 | 83 |
| 66 You can easily generate source-level coverage report for a given corpus: | 84 You can easily generate source-level coverage report for a given corpus: |
| 67 | 85 |
| 68 ``` | 86 ``` |
| 69 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \ | 87 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \ |
| 70 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus | 88 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus |
| 71 ``` | 89 ``` |
| 72 | 90 |
| 73 This will produce an .html file with colored source-code. It can be used to | 91 This will produce an .html file with colored source-code. It can be used to |
| 74 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding | 92 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding |
| 75 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`). | 93 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`). |
| 76 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment | 94 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment |
| 77 variable. | 95 variable. |
| 78 | 96 |
| 79 ## Corpus Seed | 97 ## Corpus Seed |
| 80 | 98 |
| 81 You can pass a corpus directory to a fuzzer that you run manually: | 99 You can pass a corpus directory to a fuzzer that you run manually: |
| 82 | 100 |
| 83 ``` | 101 ``` |
| 84 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus | 102 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus |
| 85 ``` | 103 ``` |
| 86 | 104 |
| 87 The directory can initially be empty. The fuzzer would store all the interesting | 105 The directory can initially be empty. The fuzzer would store all the interesting |
| 88 items it finds in the directory. You can help the fuzzer by "seeding" the corpus: | 106 items it finds in the directory. You can help the fuzzer by "seeding" the corpus: |
| 89 simply copy interesting inputs for your function to the corpus directory before | 107 simply copy interesting inputs for your function to the corpus directory before |
| 90 running. This works especially well for file-parsing functionality: just | 108 running. This works especially well for strictly defined file formats or data |
| 91 use some valid files from your test suite. | 109 transmission protocols. |
| 110 * For file-parsing functionality just use some valid files from your test suite. | |
| 111 * For protocol processing targets put raw streams from test suite into separate | |
| 112 files. | |
| 92 | 113 |
| 93 After discovering new and interesting items, [upload corpus to ClusterFuzz]. | 114 After discovering new and interesting items, [upload corpus to ClusterFuzz]. |
| 94 | 115 |
| 95 ## Fuzzer Dictionary | 116 ## Fuzzer Dictionary |
| 96 | 117 |
| 97 It is very useful to provide fuzzer a set of common words/values that you expect | 118 It is very useful to provide fuzzer a set of common words/values that you expect |
| 98 to find in the input. This greatly improves efficiency of finding new units and | 119 to find in the input. This greatly improves efficiency of finding new units and |
| 99 works especially well while fuzzing file format decoders. | 120 works especially well while fuzzing file format decoders. |
| 100 | 121 |
| 101 To add a dictionary, first create a dictionary file. | 122 To add a dictionary, first create a dictionary file. |
| (...skipping 29 matching lines...) | |
| 131 } | 152 } |
| 132 ``` | 153 ``` |
| 133 | 154 |
| 134 Make sure to submit dictionary file to git. The dictionary will be used | 155 Make sure to submit dictionary file to git. The dictionary will be used |
| 135 automatically by ClusterFuzz once it picks up new fuzzer version (once a day). | 156 automatically by ClusterFuzz once it picks up new fuzzer version (once a day). |
| 136 | 157 |
| 137 | 158 |
| 138 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links | 159 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links |
| 139 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus | 160 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus |
| 140 [AFL]: http://lcamtuf.coredump.cx/afl/ | 161 [AFL]: http://lcamtuf.coredump.cx/afl/ |
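
The advice in the patched document about avoiding heap allocation and about Initialization/Cleanup can be illustrated with a minimal fuzz target sketch. Only the `LLVMFuzzerTestOneInput` entry point comes from libFuzzer; `Environment` and `ProcessInput` are hypothetical stand-ins for an expensive shared resource and for the code under test, and the 64-byte scratch size is an arbitrary assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an expensive, reusable resource (for example a
// lookup table). Not part of libFuzzer or Chromium.
struct Environment {
  Environment() {
    for (std::size_t i = 0; i < table.size(); ++i)
      table[i] = static_cast<std::uint8_t>(i ^ 0x5a);
  }
  std::array<std::uint8_t, 256> table{};
};

// Hypothetical function under test: mixes the input bytes through the table.
static std::uint32_t ProcessInput(const Environment& env,
                                  const std::uint8_t* data, std::size_t size) {
  std::uint32_t acc = 0;
  for (std::size_t i = 0; i < size; ++i)
    acc = acc * 31u + env.table[data[i]];
  return acc;
}

extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data,
                                      std::size_t size) {
  // Constructed once, on the first call, and reused by every later call.
  // No explicit teardown is written, since the fuzzer process is killed or
  // crashes rather than shutting down gracefully.
  static Environment env;

  // Per-call scratch space lives on the stack, so there is nothing to
  // deallocate before returning.
  std::array<std::uint8_t, 64> scratch{};
  const std::size_t n = size < scratch.size() ? size : scratch.size();
  for (std::size_t i = 0; i < n; ++i)
    scratch[i] = data[i];

  (void)ProcessInput(env, scratch.data(), n);
  return 0;
}
```

If a target does need heap memory inside `LLVMFuzzerTestOneInput`, an owning type such as `std::vector` or `std::unique_ptr` keeps the per-call deallocation automatic.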