testing/libfuzzer/efficient_fuzzer.md - Issue 1855373008: [libfuzzer] update Efficient Fuzzer Guide and small fixes to documentation.

Side by Side Diff: testing/libfuzzer/efficient_fuzzer.md

Issue 1855373008: [libfuzzer] update Efficient Fuzzer Guide and small fixes to documentation. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Address comments: more sections and small reordering of them. Created 4 years, 8 months ago


OLD	NEW
1 # Efficient Fuzzer	1 # Efficient Fuzzer

2	2

3 This document describes ways to determine your fuzzer efficiency and ways	3 This document describes ways to determine your fuzzer efficiency and ways

4 to improve it.	4 to improve it.

5	5

6 ## Overview	6 ## Overview

7	7

8 Being a coverage-driven fuzzer, libFuzzer considers a certain input interesting	8 Being a coverage-driven fuzzer, libFuzzer considers a certain input interesting

9 if it results in new coverage. The set of all interesting inputs is called	9 if it results in new coverage. The set of all interesting inputs is called

10 corpus.	10 corpus.

11 Items in corpus are constantly mutated in search of new interesting input.	11 Items in corpus are constantly mutated in search of new interesting input.

12 Corpus is usually maintained between multiple fuzzer runs.	12 Corpus is usually maintained between multiple fuzzer runs.

13	13

14 There are several metrics you should look at to determine your fuzzer effectiveness:	14 There are several metrics you should look at to determine your fuzzer effectiveness:

15	15

16 * fuzzer speed (exec/s)	16 * [fuzzer speed](#Fuzzer-Speed) (exec/s)

17 * corpus size	17 * [corpus size](#Corpus-Size)

18 * coverage	18 * [coverage](#Coverage)

19	19

20 You can collect these metrics manually or take them from [ClusterFuzz status]	20 You can collect these metrics manually or take them from [ClusterFuzz status]

21 pages.	21 pages.

22	22

23 ## Fuzzer Speed	23 ## Fuzzer Speed

24	24

25 Fuzzer speed is printed while fuzzer runs:	25 Fuzzer speed is printed while fuzzer runs:

26	26

27 ```	27 ```

28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62	28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62

29 ```	29 ```

30	30

31 Because libFuzzer performs randomized search, it is critical to have it as fast	31 Because libFuzzer performs randomized search, it is critical to have it as fast

32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer	32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer

33 using any standard tool to see where it spends its time.	33 using any standard tool to see where it spends its time.

34	34

	35

35 ### Initialization/Cleanup	36 ### Initialization/Cleanup

36	37

37 Try to keep your fuzzing function as simple as possible. Prefer to use static	38 Try to keep your fuzzing function as simple as possible. Prefer to use static

38 initialization and shared resources rather than bringing environment up and down	39 initialization and shared resources rather than bringing environment up and down

39 every single run.	40 every single run.

40	41

41 Fuzzers don't have to shutdown gracefully (we either kill them or they crash	42 Fuzzers don't have to shutdown gracefully (we either kill them or they crash

42 because sanitizer has found a problem). You can skip freeing static resource.	43 because sanitizer has found a problem). You can skip freeing static resource.

43	44

44 Of course all resources allocated withing `LLVMFuzzerTestOneInput` function	45 Of course all resources allocated within `LLVMFuzzerTestOneInput` function

45 should be deallocated since this function is called millions of times during	46 should be deallocated since this function is called millions of times during

46 one fuzzing session.	47 one fuzzing session.

47	48

48	49

	50 ### Memory Usage

	51

	52 Avoid allocation of dynamic memory wherever possible. Instrumentation works

	53 faster for stack-based and static objects than for heap allocated ones.

	54

	55 It is always a good idea to play with different versions of a fuzzer to find the

	56 fastest implementation.

	57

	58

	59 ### Maximum Testcase Length

	60

	61 Experiment with different values of `-max_len` parameter. This parameter often

	62 significantly affects execution speed, but not always.

	63

	64 1) Define which `-max_len` value is reasonable for your target. For example, it

	65 may be useless to fuzz an image decoder with too small value of testcase length.

	66

	67 2) Increase the value defined on previous step. Check its influence on execution

	68 speed of fuzzer. If speed doesn't drop significantly for long inputs, it is fine

	69 to have some bigger value for `-max_len`.

	70

	71 In general, bigger `-max_len` value gives better coverage. Coverage is main

	72 priority for fuzzing. However, low execution speed may result in waste of

	73 resources used for fuzzing. If large inputs make fuzzer too slow you have to

	74 adjust value of `-max_len` and find a trade-off between coverage and execution

	75 speed.

	76

	77

49 ## Corpus Size	78 ## Corpus Size

50	79

51 After running for a while the fuzzer would reach a plateau and won't discover	80 After running for a while the fuzzer would reach a plateau and won't discover

52 new interesting input. Corpus for a reasonably complex functionality	81 new interesting input. Corpus for a reasonably complex functionality

53 should contain hundreds (if not thousands) of items.	82 should contain hundreds (if not thousands) of items.

54	83

55 Too small corpus size indicates some code barrier that	84 Too small corpus size indicates some code barrier that

56 libFuzzer is having problems penetrating. Common cases include: checksums,	85 libFuzzer is having problems penetrating. Common cases include: checksums,

57 magic numbers etc. The easiest way to diagnose this problem is to generate a	86 magic numbers etc. The easiest way to diagnose this problem is to generate a

58 [coverage report](#Coverage). To fix the issue you can:	87 [coverage report](#Coverage). To fix the issue you can:

59	88

60 * change the code (e.g. disable crc checks while fuzzing)	89 * change the code (e.g. disable crc checks while fuzzing)

61 * prepare [corpus seed](#Corpus-Seed).	90 * prepare [corpus seed](#Corpus-Seed)

62 * prepare [fuzzer dictionary](#Fuzzer-Dictionary)	91 * prepare [fuzzer dictionary](#Fuzzer-Dictionary)

63	92

64 ## Coverage	93 ## Coverage

65	94

66 You can easily generate source-level coverage report for a given corpus:	95 You can easily generate source-level coverage report for a given corpus:

67	96

68 ```	97 ```

69 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \	98 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \

70 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus	99 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus

71 ```	100 ```

72	101

73 This will produce an .html file with colored source-code. It can be used to	102 This will produce an .html file with colored source-code. It can be used to

74 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding	103 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding

75 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`).	104 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`).

76 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment	105 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment

77 variable.	106 variable.

78	107

79 ## Corpus Seed	108 ### Corpus Seed

80	109

81 You can pass a corpus directory to a fuzzer that you run manually:	110 You can pass a corpus directory to a fuzzer that you run manually:

82	111

83 ```	112 ```

84 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus	113 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus

85 ```	114 ```

86	115

87 The directory can initially be empty. The fuzzer would store all the interesting	116 The directory can initially be empty. The fuzzer would store all the interesting

88 items it finds in the directory. You can help the fuzzer by "seeding" the corpus:	117 items it finds in the directory. You can help the fuzzer by "seeding" the corpus:

89 simply copy interesting inputs for your function to the corpus directory before	118 simply copy interesting inputs for your function to the corpus directory before

90 running. This works especially well for file-parsing functionality: just	119 running. This works especially well for strictly defined file formats or data

91 use some valid files from your test suite.	120 transmission protocols.

	121 * For file-parsing functionality just use some valid files from your test suite.

	122 * For protocol processing targets put raw streams from test suite into separate

	123 files.

92	124

93 After discovering new and interesting items, [upload corpus to ClusterFuzz].	125 After discovering new and interesting items, [upload corpus to ClusterFuzz].

94	126

95 ## Fuzzer Dictionary	127 ### Fuzzer Dictionary

96	128

97 It is very useful to provide fuzzer a set of common words/values that you expect	129 It is very useful to provide fuzzer a set of common words/values that you expect

98 to find in the input. This greatly improves efficiency of finding new units and	130 to find in the input. This greatly improves efficiency of finding new units and

99 works especially well while fuzzing file format decoders.	131 works especially well while fuzzing file format decoders.

100	132

101 To add a dictionary, first create a dictionary file.	133 To add a dictionary, first create a dictionary file.

102 Dictionary syntax is similar to that used by [AFL] for its -x option:	134 Dictionary syntax is similar to that used by [AFL] for its -x option:

103	135

104 ```	136 ```

105 # Lines starting with '#' and empty lines are ignored.	137 # Lines starting with '#' and empty lines are ignored.

(...skipping 25 matching lines...)
131 }	163 }

132 ```	164 ```

133	165

134 Make sure to submit dictionary file to git. The dictionary will be used	166 Make sure to submit dictionary file to git. The dictionary will be used

135 automatically by ClusterFuzz once it picks up new fuzzer version (once a day).	167 automatically by ClusterFuzz once it picks up new fuzzer version (once a day).

136	168

137	169

138 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links	170 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links

139 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus	171 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus

140 [AFL]: http://lcamtuf.coredump.cx/afl/	172 [AFL]: http://lcamtuf.coredump.cx/afl/

OLD	NEW
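
The Initialization/Cleanup and Memory Usage guidance in the NEW column could be illustrated roughly as follows. This is only a sketch under assumptions: `Environment` and `ProcessInput` are hypothetical stand-ins and do not appear in the patch or in Chromium, and the 4096-byte scratch buffer is an arbitrary size. Only the `LLVMFuzzerTestOneInput` entry point is taken from the document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an expensive shared environment (lookup tables,
// global state, and so on). Not taken from the patch.
struct Environment {
  Environment() {
    for (int i = 0; i < 256; ++i)
      table[i] = static_cast<uint8_t>(i ^ 0x5a);
  }
  uint8_t table[256];
};

// Hypothetical function under test: folds the input through the table.
static uint8_t ProcessInput(const Environment& env,
                            const uint8_t* data,
                            size_t size) {
  assert(data != nullptr || size == 0);
  uint8_t acc = 0;
  for (size_t i = 0; i < size; ++i)
    acc ^= env.table[data[i]];
  return acc;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Built once on the first call and intentionally never freed: the fuzzer
  // process is killed or crashes, so graceful shutdown is not needed.
  static Environment* env = new Environment();

  // Per-call scratch space lives on the stack instead of the heap, so there
  // is nothing to deallocate at the end of each run.
  uint8_t buffer[4096];
  const size_t n = size < sizeof(buffer) ? size : sizeof(buffer);
  if (n > 0)
    std::memcpy(buffer, data, n);

  (void)ProcessInput(*env, buffer, n);
  return 0;
}
```

Anything that does have to be heap-allocated inside `LLVMFuzzerTestOneInput` would still need to be released before returning, since the function is called millions of times per session.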
