Index: third_party/afl/src/llvm_mode/README.llvm |
diff --git a/third_party/afl/src/llvm_mode/README.llvm b/third_party/afl/src/llvm_mode/README.llvm |
new file mode 100644 |
index 0000000000000000000000000000000000000000..f3789e2bb9ed5edae056200f6d2d66cea749b911 |
--- /dev/null |
+++ b/third_party/afl/src/llvm_mode/README.llvm |
@@ -0,0 +1,196 @@ |
+============================================ |
+Fast LLVM-based instrumentation for afl-fuzz |
+============================================ |
+ |
+ (See ../docs/README for the general instruction manual.) |
+ |
+1) Introduction |
+--------------- |
+ |
+The code in this directory allows you to instrument programs for AFL using |
+true compiler-level instrumentation, instead of the more crude |
+assembly-level rewriting approach taken by afl-gcc and afl-clang. This has |
+several interesting properties: |
+ |
+ - The compiler can make many optimizations that are hard to pull off when |
+ manually inserting assembly. As a result, some slow, CPU-bound programs will |
+ run up to around 2x faster. |
+ |
+ The gains are less pronounced for fast binaries, where the speed is limited |
+ chiefly by the cost of creating new processes. In such cases, the gain will |
+ probably stay within 10%. |
+ |
+ - The instrumentation is CPU-independent. At least in principle, you should |
+ be able to rely on it to fuzz programs on non-x86 architectures (after |
+ building afl-fuzz with AFL_NO_X86=1). |
+ |
+ - The instrumentation can cope a bit better with multi-threaded targets. |
+ |
+ - Because the feature relies on the internals of LLVM, it is clang-specific |
+ and will *not* work with GCC. |
+ |
+Once this implementation is shown to be sufficiently robust and portable, it |
+will probably replace afl-clang. For now, it can be built separately and |
+co-exists with the original code. |
+ |
+The idea and much of the implementation comes from Laszlo Szekeres. |
+ |
+2) How to use |
+------------- |
+ |
+In order to leverage this mechanism, you need to have clang installed on your |
+system. You should also make sure that the llvm-config tool is in your path |
+(or pointed to via LLVM_CONFIG in the environment). |
+ |
+Unfortunately, some systems that do have clang come without llvm-config or the |
+LLVM development headers; one example of this is FreeBSD. FreeBSD users will |
+also run into problems with clang being built statically and not being able to |
+load modules (you'll see "Service unavailable" when loading afl-llvm-pass.so). |
+ |
+To solve all your problems, you can grab pre-built binaries for your OS from: |
+ |
+ http://llvm.org/releases/download.html |
+ |
+...and then put the bin/ directory from the tarball at the beginning of your |
+$PATH when compiling the feature and building packages later on. You don't need |
+to be root for that. |
+ |
+To build the instrumentation itself, type 'make'. This will generate binaries |
+called afl-clang-fast and afl-clang-fast++ in the parent directory. Once this |
+is done, you can instrument third-party code in a way similar to the standard |
+operating mode of AFL, e.g.: |
+ |
+ CC=/path/to/afl/afl-clang-fast ./configure [...options...] |
+ make |
+ |
+Be sure to also include CXX set to afl-clang-fast++ for C++ code. |
+ |
+The tool honors roughly the same environmental variables as afl-gcc (see |
+../docs/env_variables.txt). This includes AFL_INST_RATIO, AFL_USE_ASAN, |
+AFL_HARDEN, and AFL_DONT_OPTIMIZE. |
+ |
+Note: if you want the LLVM helper to be installed on your system for all |
+users, you need to build it before issuing 'make install' in the parent |
+directory. |
+ |
+3) Gotchas, feedback, bugs |
+-------------------------- |
+ |
+This is an early-stage mechanism, so field reports are welcome. You can send bug |
+reports to <afl-users@googlegroups.com>. |
+ |
+4) Bonus feature #1: deferred instrumentation |
+--------------------------------------------- |
+ |
+AFL tries to optimize performance by executing the targeted binary just once, |
+stopping it just before main(), and then cloning this "master" process to get |
+a steady supply of targets to fuzz. |
+ |
+Although this approach eliminates much of the OS-, linker- and libc-level |
+costs of executing the program, it does not always help with binaries that |
+perform other time-consuming initialization steps - say, parsing a large config |
+file before getting to the fuzzed data. |
+ |
+In such cases, it's beneficial to initialize the forkserver a bit later, once |
+most of the initialization work is already done, but before the binary attempts |
+to read the fuzzed input and parse it; in some cases, this can offer a 10x+ |
+performance gain. You can implement delayed initialization in LLVM mode in a |
+fairly simple way. |
+ |
+First, find a suitable location in the code where the delayed cloning can |
+take place. This needs to be done with *extreme* care to avoid breaking the |
+binary. In particular, the program will probably malfunction if you select |
+a location after: |
+ |
+ - The creation of any vital threads or child processes - since the forkserver |
+ can't clone them easily. |
+ |
+ - The initialization of timers via setitimer() or equivalent calls. |
+ |
+ - The creation of temporary files, network sockets, offset-sensitive file |
+ descriptors, and similar shared-state resources - but only provided that |
+ their state meaningfully influences the behavior of the program later on. |
+ |
+ - Any access to the fuzzed input, including reading the metadata about its |
+ size. |
+ |
+With the location selected, add this code in the appropriate spot: |
+ |
+#ifdef __AFL_HAVE_MANUAL_CONTROL |
+ __AFL_INIT(); |
+#endif |
+ |
+You don't need the #ifdef guards, but including them ensures that the program |
+will keep working normally when compiled with a tool other than afl-clang-fast. |
+ |
+Finally, recompile the program with afl-clang-fast (afl-gcc or afl-clang will |
+*not* generate a deferred-initialization binary) - and you should be all set! |
+ |
+5) Bonus feature #2: persistent mode |
+------------------------------------ |
+ |
+Some libraries provide APIs that are stateless, or whose state can be reset in |
+between processing different input files. When such a reset is performed, a |
+single long-lived process can be reused to try out multiple test cases, |
+eliminating the need for repeated fork() calls and the associated OS overhead. |
+ |
+The basic structure of the program that does this would be: |
+ |
+ while (__AFL_LOOP(1000)) { |
+ |
+ /* Read input data. */ |
+ /* Call library code to be fuzzed. */ |
+ /* Reset state. */ |
+ |
+ } |
+ |
+ /* Exit normally */ |
+ |
+The numerical value specified within the loop controls the maximum number |
+of iterations before AFL will restart the process from scratch. This minimizes |
+the impact of memory leaks and similar glitches; 1000 is a good starting point, |
+and going much higher increases the likelihood of hiccups without giving you |
+any real performance benefits. |
+ |
+A more detailed template is shown in ../experimental/persistent_demo/. |
+Similarly to the previous mode, the feature works only with afl-clang-fast; |
+#ifdef guards can be used to suppress it when using other compilers. |
+ |
+Note that as with the previous mode, the feature is easy to misuse; if you |
+do not fully reset the critical state, you may end up with false positives or |
+waste a whole lot of CPU power doing nothing useful at all. Be particularly |
+wary of memory leaks and of the state of file descriptors. |
+ |
+When running in this mode, the execution paths will inherently vary a bit |
+depending on whether the input loop is being entered for the first time or |
+executed again. To avoid spurious warnings, the feature implies |
+AFL_NO_VAR_CHECK and hides the "variable path" warnings in the UI. |
+ |
+PS. Because there are task switches still involved, the mode isn't as fast as |
+"pure" in-process fuzzing offered, say, by LLVM's LibFuzzer; but it is a lot |
+faster than the normal fork() model, and compared to in-process fuzzing, |
+should be a lot more robust. |
+ |
+6) Bonus feature #3: new 'trace-pc' mode |
+---------------------------------------- |
+ |
+Recent versions of LLVM are shipping with a built-in execution tracing feature |
+that is fairly usable for AFL, without the need to post-process the assembly |
+or install any compiler plugins. See: |
+ |
+ http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs |
+ |
+As of this writing, the feature is only available on SVN trunk, and is yet to |
+make it to an official release of LLVM. Nevertheless, if you have a |
+sufficiently recent compiler and want to give it a try, build afl-clang-fast |
+this way: |
+ |
+ AFL_TRACE_PC=1 make clean all |
+ |
+Since a form of 'trace-pc' is also supported in GCC, this mode may become a |
+longer-term solution to all our needs. |
+ |
+Note that this mode supports AFL_INST_RATIO at run time, not at compilation |
+time. This is somewhat similar to the behavior of the QEMU mode. Because of |
+the need to support it at run time, the mode is also a tad slower than the |
+plugin-based approach. |