| Index: third_party/afl/src/llvm_mode/README.llvm | 
| diff --git a/third_party/afl/src/llvm_mode/README.llvm b/third_party/afl/src/llvm_mode/README.llvm | 
| new file mode 100644 | 
| index 0000000000000000000000000000000000000000..f3789e2bb9ed5edae056200f6d2d66cea749b911 | 
| --- /dev/null | 
| +++ b/third_party/afl/src/llvm_mode/README.llvm | 
| @@ -0,0 +1,196 @@ | 
| +============================================ | 
| +Fast LLVM-based instrumentation for afl-fuzz | 
| +============================================ | 
| + | 
| +  (See ../docs/README for the general instruction manual.) | 
| + | 
| +1) Introduction | 
| +--------------- | 
| + | 
| +The code in this directory allows you to instrument programs for AFL using | 
| +true compiler-level instrumentation, instead of the cruder | 
| +assembly-level rewriting approach taken by afl-gcc and afl-clang. This has | 
| +several interesting properties: | 
| + | 
| +  - The compiler can make many optimizations that are hard to pull off when | 
| +    manually inserting assembly. As a result, some slow, CPU-bound programs will | 
| +    run up to around 2x faster. | 
| + | 
| +    The gains are less pronounced for fast binaries, where the speed is limited | 
| +    chiefly by the cost of creating new processes. In such cases, the gain will | 
| +    probably stay within 10%. | 
| + | 
| +  - The instrumentation is CPU-independent. At least in principle, you should | 
| +    be able to rely on it to fuzz programs on non-x86 architectures (after | 
| +    building afl-fuzz with AFL_NO_X86=1). | 
| + | 
| +  - The instrumentation can cope a bit better with multi-threaded targets. | 
| + | 
| +  - Because the feature relies on the internals of LLVM, it is clang-specific | 
| +    and will *not* work with GCC. | 
| + | 
| +Once this implementation is shown to be sufficiently robust and portable, it | 
| +will probably replace afl-clang. For now, it can be built separately and | 
| +co-exists with the original code. | 
| + | 
| +The idea and much of the implementation come from Laszlo Szekeres. | 
| + | 
| +2) How to use | 
| +------------- | 
| + | 
| +In order to leverage this mechanism, you need to have clang installed on your | 
| +system. You should also make sure that the llvm-config tool is in your path | 
| +(or pointed to via LLVM_CONFIG in the environment). | 
| + | 
| +Unfortunately, some systems that do have clang come without llvm-config or the | 
| +LLVM development headers; one example of this is FreeBSD. FreeBSD users will | 
| +also run into problems with clang being built statically and not being able to | 
| +load modules (you'll see "Service unavailable" when loading afl-llvm-pass.so). | 
| + | 
| +To work around both problems, you can grab pre-built binaries for your OS from: | 
| + | 
| +  http://llvm.org/releases/download.html | 
| + | 
| +...and then put the bin/ directory from the tarball at the beginning of your | 
| +$PATH when compiling the feature and building packages later on. You don't need | 
| +to be root for that. | 
| + | 
| +To build the instrumentation itself, type 'make'. This will generate binaries | 
| +called afl-clang-fast and afl-clang-fast++ in the parent directory. Once this | 
| +is done, you can instrument third-party code in a way similar to the standard | 
| +operating mode of AFL, e.g.: | 
| + | 
| +  CC=/path/to/afl/afl-clang-fast ./configure [...options...] | 
| +  make | 
| + | 
| +For C++ code, be sure to also set CXX to afl-clang-fast++. | 
| + | 
| +The tool honors roughly the same environment variables as afl-gcc (see | 
| +../docs/env_variables.txt). This includes AFL_INST_RATIO, AFL_USE_ASAN, | 
| +AFL_HARDEN, and AFL_DONT_OPTIMIZE. | 
| + | 
| +Note: if you want the LLVM helper to be installed on your system for all | 
| +users, you need to build it before issuing 'make install' in the parent | 
| +directory. | 
| + | 
| +3) Gotchas, feedback, bugs | 
| +-------------------------- | 
| + | 
| +This is an early-stage mechanism, so field reports are welcome. You can send bug | 
| +reports to <afl-users@googlegroups.com>. | 
| + | 
| +4) Bonus feature #1: deferred instrumentation | 
| +--------------------------------------------- | 
| + | 
| +AFL tries to optimize performance by executing the targeted binary just once, | 
| +stopping it just before main(), and then cloning this "master" process to get | 
| +a steady supply of targets to fuzz. | 
| + | 
| +Although this approach eliminates much of the OS-, linker- and libc-level | 
| +cost of executing the program, it does not always help with binaries that | 
| +perform other time-consuming initialization steps - say, parsing a large config | 
| +file before getting to the fuzzed data. | 
| + | 
| +In such cases, it's beneficial to initialize the forkserver a bit later, once | 
| +most of the initialization work is already done, but before the binary attempts | 
| +to read the fuzzed input and parse it; in some cases, this can offer a 10x+ | 
| +performance gain. You can implement delayed initialization in LLVM mode in a | 
| +fairly simple way. | 
| + | 
| +First, find a suitable location in the code where the delayed cloning can | 
| +take place. This needs to be done with *extreme* care to avoid breaking the | 
| +binary. In particular, the program will probably malfunction if you select | 
| +a location after: | 
| + | 
| +  - The creation of any vital threads or child processes - since the forkserver | 
| +    can't clone them easily. | 
| + | 
| +  - The initialization of timers via setitimer() or equivalent calls. | 
| + | 
| +  - The creation of temporary files, network sockets, offset-sensitive file | 
| +    descriptors, and similar shared-state resources - but only provided that | 
| +    their state meaningfully influences the behavior of the program later on. | 
| + | 
| +  - Any access to the fuzzed input, including reading the metadata about its | 
| +    size. | 
| + | 
| +With the location selected, add this code in the appropriate spot: | 
| + | 
| +#ifdef __AFL_HAVE_MANUAL_CONTROL | 
| +  __AFL_INIT(); | 
| +#endif | 
| + | 
| +You don't need the #ifdef guards, but including them ensures that the program | 
| +will keep working normally when compiled with a tool other than afl-clang-fast. | 
| + | 
| +Finally, recompile the program with afl-clang-fast (afl-gcc or afl-clang will | 
| +*not* generate a deferred-initialization binary) - and you should be all set! | 
| + | 
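As a rough illustration of the placement rules in section 4, here is a minimal sketch of a target that does its slow start-up work first and only then hands control to the forkserver. The names expensive_setup(), count_lines() and run_target() are hypothetical and exist only for this example; only __AFL_HAVE_MANUAL_CONTROL and __AFL_INIT() come from the text above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for slow start-up work, e.g. parsing a large config. */
static int expensive_setup(char *table, size_t size) {
  memset(table, 0x41, size);
  return 0;
}

/* Hypothetical code under test: counts newline characters in the input. */
static size_t count_lines(const char *buf, size_t len) {
  size_t i, n = 0;
  for (i = 0; i < len; i++)
    if (buf[i] == '\n') n++;
  return n;
}

/* Hypothetical entry point; a real program would do this work in main(). */
int run_target(FILE *in) {
  static char table[4096];
  char buf[1024];
  size_t len;

  /* Runs once, before the forkserver starts. No threads, timers, or
     accesses to the fuzzed input may happen before this point. */
  if (expensive_setup(table, sizeof(table))) return 1;

#ifdef __AFL_HAVE_MANUAL_CONTROL
  __AFL_INIT();
#endif

  /* From here on, the code runs in each cloned child, so this is the
     first place where the fuzzed input is touched. */
  len = fread(buf, 1, sizeof(buf), in);
  printf("%zu\n", count_lines(buf, len));
  return 0;
}
```

Calling run_target(stdin) from main() should behave the same under any compiler, because the guarded block simply disappears when __AFL_HAVE_MANUAL_CONTROL is not defined.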
| +5) Bonus feature #2: persistent mode | 
| +------------------------------------ | 
| + | 
| +Some libraries provide APIs that are stateless, or whose state can be reset in | 
| +between processing different input files. When such a reset is performed, a | 
| +single long-lived process can be reused to try out multiple test cases, | 
| +eliminating the need for repeated fork() calls and the associated OS overhead. | 
| + | 
| +The basic structure of the program that does this would be: | 
| + | 
| +  while (__AFL_LOOP(1000)) { | 
| + | 
| +    /* Read input data. */ | 
| +    /* Call library code to be fuzzed. */ | 
| +    /* Reset state. */ | 
| + | 
| +  } | 
| + | 
| +  /* Exit normally */ | 
| + | 
| +The numerical value specified within the loop controls the maximum number | 
| +of iterations before AFL will restart the process from scratch. This minimizes | 
| +the impact of memory leaks and similar glitches; 1000 is a good starting point, | 
| +and going much higher increases the likelihood of hiccups without giving you | 
| +any real performance benefits. | 
| + | 
| +A more detailed template is shown in ../experimental/persistent_demo/. | 
| +Similarly to the previous mode, the feature works only with afl-clang-fast; | 
| +#ifdef guards can be used to suppress it when using other compilers. | 
| + | 
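The loop structure above could be fleshed out roughly as follows. This is a sketch under stated assumptions, not the template from persistent_demo: library_check(), fuzz_loop() and the FUZZ_LOOP wrapper are hypothetical names, and the sketch assumes that __AFL_HAVE_MANUAL_CONTROL is defined whenever __AFL_LOOP is available.

```c
#include <string.h>
#include <unistd.h>

#ifdef __AFL_HAVE_MANUAL_CONTROL
#  define FUZZ_LOOP(n) __AFL_LOOP(n)
#else
/* Fallback for other compilers: run the loop body exactly once. */
static int fuzz_loop_done;
#  define FUZZ_LOOP(n) (!fuzz_loop_done++)
#endif

/* Hypothetical library entry point: returns 1 if the input starts with "hi". */
static int library_check(const char *buf, size_t len) {
  return len >= 2 && buf[0] == 'h' && buf[1] == 'i';
}

/* Hypothetical driver: reads one test case per iteration from fd. */
int fuzz_loop(int fd) {
  char buf[128];
  int hits = 0;

  while (FUZZ_LOOP(1000)) {
    ssize_t len;

    /* Reset state left over from the previous iteration. */
    memset(buf, 0, sizeof(buf));

    /* Read input data. */
    len = read(fd, buf, sizeof(buf) - 1);
    if (len < 0) continue;

    /* Call library code to be fuzzed. */
    hits += library_check(buf, (size_t)len);
  }

  /* Exit normally. */
  return hits;
}
```

A main() that calls fuzz_loop(0) would read test cases from standard input. The sketch uses read() rather than buffered stdio so that no stream buffer carries data over from one iteration to the next.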
| +Note that as with the previous mode, the feature is easy to misuse; if you | 
| +do not fully reset the critical state, you may end up with false positives or | 
| +waste a whole lot of CPU power doing nothing useful at all. Be particularly | 
| +wary of memory leaks and of the state of file descriptors. | 
| + | 
| +When running in this mode, the execution paths will inherently vary a bit | 
| +depending on whether the input loop is being entered for the first time or | 
| +executed again. To avoid spurious warnings, the feature implies | 
| +AFL_NO_VAR_CHECK and hides the "variable path" warnings in the UI. | 
| + | 
| +PS. Because there are task switches still involved, the mode isn't as fast as | 
| +"pure" in-process fuzzing offered, say, by LLVM's LibFuzzer; but it is a lot | 
| +faster than the normal fork() model, and compared to in-process fuzzing, | 
| +should be a lot more robust. | 
| + | 
| +6) Bonus feature #3: new 'trace-pc' mode | 
| +---------------------------------------- | 
| + | 
| +Recent versions of LLVM are shipping with a built-in execution tracing feature | 
| +that is fairly usable for AFL, without the need to post-process the assembly | 
| +or install any compiler plugins. See: | 
| + | 
| +  http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs | 
| + | 
| +As of this writing, the feature is only available on SVN trunk, and is yet to | 
| +make it to an official release of LLVM. Nevertheless, if you have a | 
| +sufficiently recent compiler and want to give it a try, build afl-clang-fast | 
| +this way: | 
| + | 
| +  AFL_TRACE_PC=1 make clean all | 
| + | 
| +Since a form of 'trace-pc' is also supported in GCC, this mode may become a | 
| +longer-term solution to all our needs. | 
| + | 
| +Note that this mode supports AFL_INST_RATIO at run time, not at compilation | 
| +time. This is somewhat similar to the behavior of the QEMU mode. Because of | 
| +the need to support it at run time, the mode is also a tad slower than the | 
| +plugin-based approach. | 
|  |