testing/libfuzzer/efficient_fuzzer.md - Issue 1855373008: [libfuzzer] update Efficient Fuzzer Guide and small fixes to documentation.

Side by Side Diff: testing/libfuzzer/efficient_fuzzer.md

(Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Created 4 years, 8 months ago

OLD	NEW
1 # Efficient Fuzzer	1 # Efficient Fuzzer

2	2

3 This document describes ways to determine your fuzzer efficiency and ways	3 This document describes ways to determine your fuzzer efficiency and ways

4 to improve it.	4 to improve it.

5	5

6 ## Overview	6 ## Overview

7	7

8 Being a coverage-driven fuzzer, libFuzzer considers a certain input interesting	8 Being a coverage-driven fuzzer, libFuzzer considers a certain input interesting

9 if it results in new coverage. The set of all interesting inputs is called	9 if it results in new coverage. The set of all interesting inputs is called

10 corpus.	10 corpus.

(...skipping 14 matching lines...)
25 Fuzzer speed is printed while fuzzer runs:	25 Fuzzer speed is printed while fuzzer runs:

26	26

27 ```	27 ```

28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62	28 #19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62

29 ```	29 ```

30	30

31 Because libFuzzer performs randomized search, it is critical to have it as fast	31 Because libFuzzer performs randomized search, it is critical to have it as fast

32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer	32 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer

33 using any standard tool to see where it spends its time.	33 using any standard tool to see where it spends its time.

34	34

	35 Avoid allocation of dynamic memory whereever is possible. Instrumentation works
	Oliver Chang 2016/04/06 17:13:44: nit: s/whereever/wherever/; nit: no need for the "is"
	mmoroz 2016/04/06 17:19:20: Done.
	36 faster for stack-based and static objects than for heap allocated ones.
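
A minimal sketch of the allocation advice in new lines 35-36, assuming the target only needs a bounded prefix of the input. `kMaxHeaderSize` and `CopyHeader` are hypothetical names used for illustration and are not part of the patch; the scratch buffer lives on the stack, so nothing is heap-allocated per run.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical upper bound on the bytes the target inspects.
constexpr size_t kMaxHeaderSize = 64;

// Copies at most kMaxHeaderSize bytes into a caller-provided fixed buffer
// and returns the number of bytes copied.
size_t CopyHeader(const uint8_t* data, size_t size,
                  std::array<uint8_t, kMaxHeaderSize>& out) {
  const size_t n = std::min(size, out.size());
  if (n > 0) std::memcpy(out.data(), data, n);
  return n;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Stack-based scratch buffer instead of a heap-allocated std::vector.
  std::array<uint8_t, kMaxHeaderSize> header{};
  CopyHeader(data, size, header);
  // The real code under test would consume `header` here.
  return 0;
}
```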

	37

	38 Experiment with different values of `-max_len` parameter. This parameter often

	39 significantly affects execution speed, but not always.

	40

	41 1) Define which `-max_len` value is reasonable for your target. For example, it

	42 may be useless to fuzz an image decoder with too small value for `-max_len`.

	43 2) Increase the value defined on previous step. Check its influence on execution

	44 speed of fuzzer. If speed doesn't drop significantly for long inputs, it is fine

	45 to have some bigger value for `-max_len`.

	46

	47 In general, bigger `-max_len` value gives better coverage. Coverage is main

	48 priority for fuzzing. However, low execution speed may result in waste of

	49 resources used for fuzzing. If large inputs make fuzzer too slow you have to

	50 adjust value of `-max_len` and find a trade-off between coverage and execution

	51 speed.

	52

35 ### Initialization/Cleanup	53 ### Initialization/Cleanup

36	54

37 Try to keep your fuzzing function as simple as possible. Prefer to use static	55 Try to keep your fuzzing function as simple as possible. Prefer to use static

38 initialization and shared resources rather than bringing environment up and down	56 initialization and shared resources rather than bringing environment up and down

39 every single run.	57 every single run.

40	58

41 Fuzzers don't have to shutdown gracefully (we either kill them or they crash	59 Fuzzers don't have to shutdown gracefully (we either kill them or they crash

42 because sanitizer has found a problem). You can skip freeing static resource.	60 because sanitizer has found a problem). You can skip freeing static resource.

43	61

44 Of course all resources allocated withing `LLVMFuzzerTestOneInput` function	62 Of course all resources allocated within `LLVMFuzzerTestOneInput` function

45 should be deallocated since this function is called millions of times during	63 should be deallocated since this function is called millions of times during

46 one fuzzing session.	64 one fuzzing session.
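
The Initialization/Cleanup guidance can be sketched as follows, assuming a hypothetical `Environment` type that stands in for whatever expensive setup the target needs. The environment is created once and deliberately never freed, while the per-run buffer is released at the end of every call.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an expensive shared environment.
struct Environment {
  Environment() { ++construction_count; }
  // Placeholder for the real code under test.
  bool Process(const std::vector<uint8_t>& input) const {
    return !input.empty() && input[0] == 'F';
  }
  static int construction_count;
};
int Environment::construction_count = 0;

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Initialized on the first call only and intentionally leaked:
  // the fuzzer process is killed or crashes, so no teardown is needed.
  static Environment* env = new Environment();

  // Per-run resource: freed automatically when the function returns.
  std::vector<uint8_t> input(data, data + size);
  env->Process(input);
  return 0;
}
```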

47	65

48	66

49 ## Corpus Size	67 ## Corpus Size

50	68

51 After running for a while the fuzzer would reach a plateau and won't discover	69 After running for a while the fuzzer would reach a plateau and won't discover

52 new interesting input. Corpus for a reasonably complex functionality	70 new interesting input. Corpus for a reasonably complex functionality

53 should contain hundreds (if not thousands) of items.	71 should contain hundreds (if not thousands) of items.

54	72

55 Too small corpus size indicates some code barrier that	73 Too small corpus size indicates some code barrier that

56 libFuzzer is having problems penetrating. Common cases include: checksums,	74 libFuzzer is having problems penetrating. Common cases include: checksums,

57 magic numbers etc. The easiest way to diagnose this problem is to generate a	75 magic numbers etc. The easiest way to diagnose this problem is to generate a

58 [coverage report](#Coverage). To fix the issue you can:	76 [coverage report](#Coverage). To fix the issue you can:

59	77

60 * change the code (e.g. disable crc checks while fuzzing)	78 * change the code (e.g. disable crc checks while fuzzing)

61 * prepare [corpus seed](#Corpus-Seed).	79 * prepare [corpus seed](#Corpus-Seed)

62 * prepare [fuzzer dictionary](#Fuzzer-Dictionary)	80 * prepare [fuzzer dictionary](#Fuzzer-Dictionary)
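
One way to "disable crc checks while fuzzing" is a compile-time guard. The sketch below assumes the fuzzing build defines `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` (the macro name the libFuzzer documentation suggests for this purpose); `ToyChecksum` and `VerifyChecksum` are hypothetical helpers, not code from the patch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical toy checksum standing in for a real CRC.
uint32_t ToyChecksum(const uint8_t* data, size_t size) {
  uint32_t sum = 0;
  for (size_t i = 0; i < size; ++i) sum = sum * 31 + data[i];
  return sum;
}

// Returns true if the payload matches the expected checksum.
// In fuzzing builds the check is skipped so mutated inputs get past it.
bool VerifyChecksum(const uint8_t* payload, size_t size, uint32_t expected) {
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
  (void)payload;
  (void)size;
  (void)expected;
  return true;
#else
  return ToyChecksum(payload, size) == expected;
#endif
}
```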

63	81

64 ## Coverage	82 ## Coverage

65	83

66 You can easily generate source-level coverage report for a given corpus:	84 You can easily generate source-level coverage report for a given corpus:

67	85

68 ```	86 ```

69 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \	87 ASAN_OPTIONS=html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \

70 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus	88 ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus

71 ```	89 ```

72	90

73 This will produce an .html file with colored source-code. It can be used to	91 This will produce an .html file with colored source-code. It can be used to

74 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding	92 determine where your fuzzer is "stuck". Replace `ASAN_OPTIONS` by corresponding

75 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`).	93 option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`).

76 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment	94 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment

77 variable.	95 variable.

78	96

79 ## Corpus Seed	97 ## Corpus Seed

80	98

81 You can pass a corpus directory to a fuzzer that you run manually:	99 You can pass a corpus directory to a fuzzer that you run manually:

82	100

83 ```	101 ```

84 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus	102 ./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus

85 ```	103 ```

86	104

87 The directory can initially be empty. The fuzzer would store all the interesting	105 The directory can initially be empty. The fuzzer would store all the interesting

88 items it finds in the directory. You can help the fuzzer by "seeding" the corpus:	106 items it finds in the directory. You can help the fuzzer by "seeding" the corpus:

89 simply copy interesting inputs for your function to the corpus directory before	107 simply copy interesting inputs for your function to the corpus directory before

90 running. This works especially well for file-parsing functionality: just	108 running. This works especially well for strictly defined file formats or data

91 use some valid files from your test suite.	109 transmission protocols.

	110 * For file-parsing functionality just use some valid files from your test suite.

	111 * For protocol processing targets put raw streams from test suite into separate

	112 files.

92	113

93 After discovering new and interesting items, [upload corpus to ClusterFuzz].	114 After discovering new and interesting items, [upload corpus to ClusterFuzz].

94	115

95 ## Fuzzer Dictionary	116 ## Fuzzer Dictionary

96	117

97 It is very useful to provide fuzzer a set of common words/values that you expect	118 It is very useful to provide fuzzer a set of common words/values that you expect

98 to find in the input. This greatly improves efficiency of finding new units and	119 to find in the input. This greatly improves efficiency of finding new units and

99 works especially well while fuzzing file format decoders.	120 works especially well while fuzzing file format decoders.

100	121

101 To add a dictionary, first create a dictionary file.	122 To add a dictionary, first create a dictionary file.

(...skipping 29 matching lines...)
131 }	152 }

132 ```	153 ```

133	154

134 Make sure to submit dictionary file to git. The dictionary will be used	155 Make sure to submit dictionary file to git. The dictionary will be used

135 automatically by ClusterFuzz once it picks up new fuzzer version (once a day).	156 automatically by ClusterFuzz once it picks up new fuzzer version (once a day).

136	157

137	158

138 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links	159 [ClusterFuzz status]: ./clusterfuzz.md#Status-Links

139 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus	160 [upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus

140 [AFL]: http://lcamtuf.coredump.cx/afl/	161 [AFL]: http://lcamtuf.coredump.cx/afl/
