Side by Side Diff: testing/libfuzzer/efficient_fuzzer.md

Issue 1855373008: [libfuzzer] update Efficient Fuzzer Guide and small fixes to documentation. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Address comments: more sections and small reordering of them. Created 4 years, 8 months ago
OLD NEW
1 # Efficient Fuzzer 1 # Efficient Fuzzer
2 2
3 This document describes ways to determine your fuzzer efficiency and ways 3 This document describes ways to determine your fuzzer efficiency and ways
4 to improve it. 4 to improve it.
5 5
6 ## Overview 6 ## Overview
7 7
8 Being a coverage-driven fuzzer, libFuzzer considers a certain input *interesting* 8 Being a coverage-driven fuzzer, libFuzzer considers a certain input *interesting*
9 if it results in new coverage. The set of all interesting inputs is called 9 if it results in new coverage. The set of all interesting inputs is called
10 *corpus*. 10 *corpus*.
11 Items in corpus are constantly mutated in search of new interesting input. 11 Items in corpus are constantly mutated in search of new interesting input.
12 Corpus is usually maintained between multiple fuzzer runs. 12 Corpus is usually maintained between multiple fuzzer runs.
13 13
14 There are several metrics you should look at to determine your fuzzer effectiveness: 14 There are several metrics you should look at to determine your fuzzer effectiveness:
15 15
16 * fuzzer speed (exec/s) 16 * [fuzzer speed](#Fuzzer-Speed) (exec/s)
17 * corpus size 17 * [corpus size](#Corpus-Size)
18 * coverage 18 * [coverage](#Coverage)
19 19
20 You can collect these metrics manually or take them from [ClusterFuzz status] 20 You can collect these metrics manually or take them from [ClusterFuzz status]
21 pages. 21 pages.
22 22
23 ## Fuzzer Speed 23 ## Fuzzer Speed
24 24
25 Fuzzer speed is printed while fuzzer runs: 25 Fuzzer speed is printed while fuzzer runs:
26 26
27 ``` 27 ```
28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62 28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62
29 ``` 29 ```
30 30
31 Because libFuzzer performs randomized search, it is critical to have it as fast 31 Because libFuzzer performs randomized search, it is critical to have it as fast
32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer 32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer
33 using any standard tool to see where it spends its time. 33 using any standard tool to see where it spends its time.
34 34
35
35 ### Initialization/Cleanup 36 ### Initialization/Cleanup
36 37
37 Try to keep your fuzzing function as simple as possible. Prefer to use static 38 Try to keep your fuzzing function as simple as possible. Prefer to use static
38 initialization and shared resources rather than bringing environment up and down 39 initialization and shared resources rather than bringing environment up and down
39 every single run. 40 every single run.
40 41
41 Fuzzers don't have to shutdown gracefully (we either kill them or they crash 42 Fuzzers don't have to shutdown gracefully (we either kill them or they crash
42 because sanitizer has found a problem). You can skip freeing static resource. 43 because sanitizer has found a problem). You can skip freeing static resource.
43 44
44 Of course all resources allocated withing `LLVMFuzzerTestOneInput` function 45 Of course all resources allocated within `LLVMFuzzerTestOneInput` function
45 should be deallocated since this function is called millions of times during 46 should be deallocated since this function is called millions of times during
46 one fuzzing session. 47 one fuzzing session.
47 48
48 49
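A minimal sketch of the Initialization/Cleanup advice above, not code from this patch. `Environment`, `g_init_count`, and `ParseForFuzzing` are hypothetical names standing in for the shared setup and the code under test; only `LLVMFuzzerTestOneInput` is the real libFuzzer entry point.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Counts how many times the shared environment was built (illustration only).
static int g_init_count = 0;

// Hypothetical shared environment: built once, reused by every run.
struct Environment {
  Environment() {
    // Expensive one-time setup would go here (assumption for illustration).
    ++g_init_count;
  }
};

// Hypothetical stand-in for the code under test.
static bool ParseForFuzzing(const Environment&, const std::string& input) {
  return !input.empty();
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Function-local static: constructed on the first call only and never
  // freed, since the fuzzer process is killed or crashes anyway.
  static Environment* env = new Environment();

  // Per-run state is owned by a local object, so it is released before the
  // function returns and nothing accumulates across millions of calls.
  std::string input(reinterpret_cast<const char*>(data), size);
  (void)ParseForFuzzing(*env, input);
  return 0;
}
```

The static pointer is intentionally leaked; the per-run `std::string` is the only allocation that has to be cleaned up, and its destructor does that on return.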
50 ### Memory Usage
51
52 Avoid allocation of dynamic memory wherever possible. Instrumentation works
53 faster for stack-based and static objects than for heap allocated ones.
54
55 It is always a good idea to play with different versions of a fuzzer to find the
56 fastest implementation.
57
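One way to read the Memory Usage advice above, as a sketch under stated assumptions: the target only needs a bounded prefix of the input, so a fixed-size stack buffer can replace a per-run heap allocation. `kMaxChunk` and `ProcessChunk` are hypothetical; whether this variant is actually faster for a given target should be measured, as the text suggests.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical upper bound on the bytes the target needs to see.
constexpr size_t kMaxChunk = 256;

// Hypothetical stand-in for the code under test: sums the bytes.
static uint32_t ProcessChunk(const uint8_t* chunk, size_t size) {
  uint32_t sum = 0;
  for (size_t i = 0; i < size; ++i) sum += chunk[i];
  return sum;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Fixed-size stack buffer instead of a per-run std::vector or new[].
  uint8_t chunk[kMaxChunk];
  const size_t n = std::min(size, kMaxChunk);
  if (n > 0) std::memcpy(chunk, data, n);
  (void)ProcessChunk(chunk, n);
  return 0;
}
```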
58
59 ### Maximum Testcase Length
60
61 Experiment with different values of `-max_len` parameter. This parameter often
62 significantly affects execution speed, but not always.
63
64 1) Define which `-max_len` value is reasonable for your target. For example, it
65 may be useless to fuzz an image decoder with too small value of testcase length.
66
67 2) Increase the value defined on previous step. Check its influence on execution
68 speed of fuzzer. If speed doesn't drop significantly for long inputs, it is fine
69 to have some bigger value for `-max_len`.
70
71 In general, bigger `-max_len` value gives better coverage. Coverage is main
72 priority for fuzzing. However, low execution speed may result in waste of
73 resources used for fuzzing. If large inputs make fuzzer too slow you have to
74 adjust value of `-max_len` and find a trade-off between coverage and execution
75 speed.
76
77
49 ## Corpus Size 78 ## Corpus Size
50 79
51 After running for a while the fuzzer would reach a plateau and won't discover 80 After running for a while the fuzzer would reach a plateau and won't discover
52 new interesting input. Corpus for a reasonably complex functionality 81 new interesting input. Corpus for a reasonably complex functionality
53 should contain hundreds (if not thousands) of items. 82 should contain hundreds (if not thousands) of items.
54 83
55 Too small corpus size indicates some code barrier that 84 Too small corpus size indicates some code barrier that
56 libFuzzer is having problems penetrating. Common cases include: checksums, 85 libFuzzer is having problems penetrating. Common cases include: checksums,
57 magic numbers etc. The easiest way to diagnose this problem is to generate a 86 magic numbers etc. The easiest way to diagnose this problem is to generate a
58 [coverage report](#Coverage). To fix the issue you can: 87 [coverage report](#Coverage). To fix the issue you can:
59 88
60 * change the code (e.g. disable crc checks while fuzzing) 89 * change the code (e.g. disable crc checks while fuzzing)
61 * prepare [corpus seed](#Corpus-Seed). 90 * prepare [corpus seed](#Corpus-Seed)
62 * prepare [fuzzer dictionary](#Fuzzer-Dictionary) 91 * prepare [fuzzer dictionary](#Fuzzer-Dictionary)
63 92
64 ## Coverage 93 ## Coverage
65 94
66 You can easily generate source-level coverage report for a given corpus: 95 You can easily generate source-level coverage report for a given corpus:
67 96
68 ``` 97 ```
69 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \ 98 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \
70 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus 99 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus
71 ``` 100 ```
72 101
73 This will produce an .html file with colored source-code. It can be used to 102 This will produce an .html file with colored source-code. It can be used to
74 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding 103 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding
75 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`). 104 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`).
76 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment 105 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment
77 variable. 106 variable.
78 107
79 ## Corpus Seed 108 ### Corpus Seed
80 109
81 You can pass a corpus directory to a fuzzer that you run manually: 110 You can pass a corpus directory to a fuzzer that you run manually:
82 111
83 ``` 112 ```
84 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus 113 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus
85 ``` 114 ```
86 115
87 The directory can initially be empty. The fuzzer would store all the interesting 116 The directory can initially be empty. The fuzzer would store all the interesting
88 items it finds in the directory. You can help the fuzzer by "seeding" the corpus: 117 items it finds in the directory. You can help the fuzzer by "seeding" the corpus:
89 simply copy interesting inputs for your function to the corpus directory before 118 simply copy interesting inputs for your function to the corpus directory before
90 running. This works especially well for file-parsing functionality: just 119 running. This works especially well for strictly defined file formats or data
91 use some valid files from your test suite. 120 transmission protocols.
121 * For file-parsing functionality just use some valid files from your test suite.
122 * For protocol processing targets put raw streams from test suite into separate
123 files.
92 124
93 After discovering new and interesting items, [upload corpus to ClusterFuzz]. 125 After discovering new and interesting items, [upload corpus to ClusterFuzz].
94 126
95 ## Fuzzer Dictionary 127 ### Fuzzer Dictionary
96 128
97 It is very useful to provide fuzzer a set of common words/values that you expect 129 It is very useful to provide fuzzer a set of common words/values that you expect
98 to find in the input. This greatly improves efficiency of finding new units and 130 to find in the input. This greatly improves efficiency of finding new units and
99 works especially well while fuzzing file format decoders. 131 works especially well while fuzzing file format decoders.
100 132
101 To add a dictionary, first create a dictionary file. 133 To add a dictionary, first create a dictionary file.
102 Dictionary syntax is similar to that used by [AFL] for its -x option: 134 Dictionary syntax is similar to that used by [AFL] for its -x option:
103 135
104 ``` 136 ```
105 # Lines starting with '#' and empty lines are ignored. 137 # Lines starting with '#' and empty lines are ignored.
(...skipping 25 matching lines...)
131 } 163 }
132 ``` 164 ```
133 165
134 Make sure to submit dictionary file to git. The dictionary will be used 166 Make sure to submit dictionary file to git. The dictionary will be used
135 automatically by ClusterFuzz once it picks up new fuzzer version (once a day). 167 automatically by ClusterFuzz once it picks up new fuzzer version (once a day).
136 168
137 169
138 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links 170 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links
139 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus 171 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus
140 [AFL]: http://lcamtuf.coredump.cx/afl/ 172 [AFL]: http://lcamtuf.coredump.cx/afl/