Chromium Code Reviews

Side by Side Diff: third_party/afl/src/llvm_mode/README.llvm

Issue 2075883002: Add American Fuzzy Lop (afl) to third_party/afl/ (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Fix nits Created 4 years, 6 months ago
1 ============================================
2 Fast LLVM-based instrumentation for afl-fuzz
3 ============================================
4
5 (See ../docs/README for the general instruction manual.)
6
7 1) Introduction
8 ---------------
9
10 The code in this directory allows you to instrument programs for AFL using
11 true compiler-level instrumentation, instead of the cruder
12 assembly-level rewriting approach taken by afl-gcc and afl-clang. This has
13 several interesting properties:
14
15 - The compiler can make many optimizations that are hard to pull off when
16 manually inserting assembly. As a result, some slow, CPU-bound programs will
17 run up to around 2x faster.
18
19 The gains are less pronounced for fast binaries, where the speed is limited
20 chiefly by the cost of creating new processes. In such cases, the gain will
21 probably stay within 10%.
22
23 - The instrumentation is CPU-independent. At least in principle, you should
24 be able to rely on it to fuzz programs on non-x86 architectures (after
25 building afl-fuzz with AFL_NO_X86=1).
26
27 - The instrumentation can cope a bit better with multi-threaded targets.
28
29 - Because the feature relies on the internals of LLVM, it is clang-specific
30 and will *not* work with GCC.
31
32 Once this implementation is shown to be sufficiently robust and portable, it
33 will probably replace afl-clang. For now, it can be built separately and
34 co-exists with the original code.
35
36 The idea and much of the implementation come from Laszlo Szekeres.
37
38 2) How to use
39 -------------
40
41 In order to leverage this mechanism, you need to have clang installed on your
42 system. You should also make sure that the llvm-config tool is in your path
43 (or pointed to via LLVM_CONFIG in the environment).
44
45 Unfortunately, some systems that do have clang come without llvm-config or the
46 LLVM development headers; one example of this is FreeBSD. FreeBSD users will
47 also run into problems with clang being built statically and not being able to
48 load modules (you'll see "Service unavailable" when loading afl-llvm-pass.so).
49
50 To work around both issues, you can grab pre-built binaries for your OS from:
51
52 http://llvm.org/releases/download.html
53
54 ...and then put the bin/ directory from the tarball at the beginning of your
55 $PATH when compiling the feature and building packages later on. You don't need
56 to be root for that.
57
58 To build the instrumentation itself, type 'make'. This will generate binaries
59 called afl-clang-fast and afl-clang-fast++ in the parent directory. Once this
60 is done, you can instrument third-party code in a way similar to the standard
61 operating mode of AFL, e.g.:
62
63 CC=/path/to/afl/afl-clang-fast ./configure [...options...]
64 make
65
66 Be sure to also include CXX set to afl-clang-fast++ for C++ code.
67
68 The tool honors roughly the same environment variables as afl-gcc (see
69 ../docs/env_variables.txt). This includes AFL_INST_RATIO, AFL_USE_ASAN,
70 AFL_HARDEN, and AFL_DONT_OPTIMIZE.
71
72 Note: if you want the LLVM helper to be installed on your system for all
73 users, you need to build it before issuing 'make install' in the parent
74 directory.
75
76 3) Gotchas, feedback, bugs
77 --------------------------
78
79 This is an early-stage mechanism, so field reports are welcome. You can send bug
80 reports to <afl-users@googlegroups.com>.
81
82 4) Bonus feature #1: deferred instrumentation
83 ---------------------------------------------
84
85 AFL tries to optimize performance by executing the targeted binary just once,
86 stopping it just before main(), and then cloning this "master" process to get
87 a steady supply of targets to fuzz.
88
89 Although this approach eliminates much of the OS-, linker- and libc-level
90 cost of executing the program, it does not always help with binaries that
91 perform other time-consuming initialization steps - say, parsing a large config
92 file before getting to the fuzzed data.
93
94 In such cases, it's beneficial to initialize the forkserver a bit later, once
95 most of the initialization work is already done, but before the binary attempts
96 to read the fuzzed input and parse it; in some cases, this can offer a 10x+
97 performance gain. You can implement delayed initialization in LLVM mode in a
98 fairly simple way.
99
100 First, find a suitable location in the code where the delayed cloning can
101 take place. This needs to be done with *extreme* care to avoid breaking the
102 binary. In particular, the program will probably malfunction if you select
103 a location after:
104
105 - The creation of any vital threads or child processes - since the forkserver
106 can't clone them easily.
107
108 - The initialization of timers via setitimer() or equivalent calls.
109
110 - The creation of temporary files, network sockets, offset-sensitive file
111 descriptors, and similar shared-state resources - but only provided that
112 their state meaningfully influences the behavior of the program later on.
113
114 - Any access to the fuzzed input, including reading the metadata about its
115 size.
116
117 With the location selected, add this code in the appropriate spot:
118
119 #ifdef __AFL_HAVE_MANUAL_CONTROL
120 __AFL_INIT();
121 #endif
122
123 You don't need the #ifdef guards, but including them ensures that the program
124 will keep working normally when compiled with a tool other than afl-clang-fast.
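
As a rough illustration of where the call might go, the sketch below runs a
slow setup step first and only then starts the forkserver, before any input is
read. The names load_config(), checksum() and run_target() are hypothetical
stand-ins, not part of AFL; only the guarded __AFL_INIT() call comes from the
text above.

```c
#include <stddef.h>
#include <stdio.h>

/* Result of the slow setup step; hypothetical program state. */
static unsigned config_seed;

/* Hypothetical stand-in for slow start-up work (e.g. parsing a big config). */
static unsigned load_config(void) {
  unsigned seed = 0;
  for (unsigned i = 0; i < 1000000u; i++) seed = seed * 31u + i;
  return seed;
}

/* Hypothetical code under test: sums the bytes of the input. */
static unsigned checksum(const unsigned char *buf, size_t len) {
  unsigned sum = 0;
  for (size_t i = 0; i < len; i++) sum += buf[i];
  return sum;
}

/* Runs setup, starts the forkserver, then reads and processes the input. */
unsigned run_target(FILE *in) {
  unsigned char buf[4096];

  config_seed = load_config();    /* done once, before the forkserver */

#ifdef __AFL_HAVE_MANUAL_CONTROL
  __AFL_INIT();                   /* every fuzzed child starts from here */
#endif

  /* The input is touched only after the forkserver is up. */
  size_t len = fread(buf, 1, sizeof(buf), in);
  return checksum(buf, len);
}
```

A real target would call run_target(stdin) from main(). Built with an ordinary
compiler, the guard drops out and the function behaves like any other.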
125
126 Finally, recompile the program with afl-clang-fast (afl-gcc or afl-clang will
127 *not* generate a deferred-initialization binary) - and you should be all set!
128
129 5) Bonus feature #2: persistent mode
130 ------------------------------------
131
132 Some libraries provide APIs that are stateless, or whose state can be reset in
133 between processing different input files. When such a reset is performed, a
134 single long-lived process can be reused to try out multiple test cases,
135 eliminating the need for repeated fork() calls and the associated OS overhead.
136
137 The basic structure of the program that does this would be:
138
139 while (__AFL_LOOP(1000)) {
140
141   /* Read input data. */
142   /* Call library code to be fuzzed. */
143   /* Reset state. */
144
145 }
146
147 /* Exit normally */
148
149 The numerical value specified within the loop controls the maximum number
150 of iterations before AFL will restart the process from scratch. This minimizes
151 the impact of memory leaks and similar glitches; 1000 is a good starting point,
152 and going much higher increases the likelihood of hiccups without giving you
153 any real performance benefits.
154
155 A more detailed template is shown in ../experimental/persistent_demo/.
156 Similarly to the previous mode, the feature works only with afl-clang-fast;
157 #ifdef guards can be used to suppress it when using other compilers.
158
159 Note that as with the previous mode, the feature is easy to misuse; if you
160 do not fully reset the critical state, you may end up with false positives or
161 waste a whole lot of CPU power doing nothing useful at all. Be particularly
162 wary of memory leaks and of the state of file descriptors.
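
A minimal sketch of such a loop, including the state reset, might look like
the code below. FUZZ_LOOP, bytes_seen, feed_library() and fuzz_loop() are
hypothetical names chosen for illustration; the fallback branch is an
assumption about how one might keep the code building with other compilers,
where the loop body then runs exactly once.

```c
#include <stddef.h>
#include <unistd.h>

/* FUZZ_LOOP is a hypothetical wrapper: under afl-clang-fast it maps to
   __AFL_LOOP; elsewhere the loop body runs exactly once. */
#ifdef __AFL_HAVE_MANUAL_CONTROL
#  define FUZZ_LOOP(n) __AFL_LOOP(n)
#else
static int fuzz_loop_used;
#  define FUZZ_LOOP(n) (!fuzz_loop_used++)
#endif

/* Hypothetical library state that must not leak between test cases. */
static size_t bytes_seen;

/* Hypothetical library call: counts the non-space bytes it is given. */
static void feed_library(const char *buf, size_t len) {
  for (size_t i = 0; i < len; i++)
    if (buf[i] != ' ') bytes_seen++;
}

/* Processes one test case per iteration; returns the last case's count. */
size_t fuzz_loop(int fd) {
  char buf[4096];
  size_t last = 0;

  while (FUZZ_LOOP(1000)) {
    ssize_t len = read(fd, buf, sizeof(buf));     /* Read input data. */
    if (len > 0) feed_library(buf, (size_t)len);  /* Call library code. */
    last = bytes_seen;
    bytes_seen = 0;                               /* Reset state. */
  }

  return last;
}
```

A real harness would typically call fuzz_loop(0) from main() so that each
iteration reads the next test case from standard input. If the reset line were
omitted, counts from earlier inputs would carry over into later ones, which is
exactly the kind of leaked state the warning above describes.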
163
164 When running in this mode, the execution paths will inherently vary a bit
165 depending on whether the input loop is being entered for the first time or
166 executed again. To avoid spurious warnings, the feature implies
167 AFL_NO_VAR_CHECK and hides the "variable path" warnings in the UI.
168
169 PS. Because there are task switches still involved, the mode isn't as fast as
170 "pure" in-process fuzzing offered, say, by LLVM's libFuzzer; but it is a lot
171 faster than the normal fork() model, and compared to in-process fuzzing,
172 should be a lot more robust.
173
174 6) Bonus feature #3: new 'trace-pc' mode
175 ----------------------------------------
176
177 Recent versions of LLVM are shipping with a built-in execution tracing feature
178 that is fairly usable for AFL, without the need to post-process the assembly
179 or install any compiler plugins. See:
180
181 http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs
182
183 As of this writing, the feature is only available on SVN trunk, and is yet to
184 make it to an official release of LLVM. Nevertheless, if you have a
185 sufficiently recent compiler and want to give it a try, build afl-clang-fast
186 this way:
187
188 AFL_TRACE_PC=1 make clean all
189
190 Since a form of 'trace-pc' is also supported in GCC, this mode may become a
191 longer-term solution to all our needs.
192
193 Note that this mode supports AFL_INST_RATIO at run time, not at compilation
194 time. This is somewhat similar to the behavior of the QEMU mode. Because of
195 the need to support it at run time, the mode is also a tad slower than the
196 plugin-based approach.
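
The run-time cost can be pictured with the sketch below. It is not AFL's
runtime code: MAP_SIZE, coverage_map, set_inst_ratio() and record_location()
are hypothetical names, and the mapping from a code address to a map slot is
an assumption made for illustration. The point is only that a ratio applied at
run time needs a comparison on every traced location, whereas a ratio applied
at compilation time simply leaves some locations uninstrumented.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAP_SIZE 65536u                 /* hypothetical map size */

static uint8_t  coverage_map[MAP_SIZE]; /* hypothetical coverage map */
static uint32_t inst_threshold = MAP_SIZE;

/* Keeps roughly pct percent of the map slots active; ignores bad values. */
void set_inst_ratio(int pct) {
  if (pct < 1 || pct > 100) return;
  inst_threshold = MAP_SIZE * (uint32_t)pct / 100u;
}

/* Reads the ratio from the environment when the program starts. */
void set_inst_ratio_from_env(void) {
  const char *s = getenv("AFL_INST_RATIO");
  if (s) set_inst_ratio(atoi(s));
}

/* Called for every traced location; pc is the address of that location. */
void record_location(uintptr_t pc) {
  uint32_t idx = (uint32_t)(pc >> 1) & (MAP_SIZE - 1u);

  if (idx >= inst_threshold) return;    /* the extra run-time check */
  coverage_map[idx]++;
}
```

With the ratio set to 50, only locations that map into the lower half of the
table are recorded; the rest are skipped at the cost of one comparison each.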