OLD | NEW |
(Empty) | |
| 1 ========================================= |
| 2 strcmp() / memcmp() token capture library |
| 3 ========================================= |
| 4 |
| 5 (See ../docs/README for the general instruction manual.) |
| 6 |
| 7 This Linux-only companion library allows you to instrument strcmp(), memcmp(), |
| 8 and related functions to automatically extract syntax tokens passed to any of |
| 9 these libcalls. The resulting list of tokens may then be given as a starting |
| 10 dictionary to afl-fuzz (the -x option) to improve coverage on subsequent |
| 11 fuzzing runs. |
| 12 |
| 13 This may help improve coverage in some targets, and do precisely nothing in |
| 14 others. In some cases, it may even make things worse: if libtokencap picks up |
| 15 syntax tokens that are not used to process the input data, but that are a part |
| 16 of - say - parsing a config file... well, you're going to end up wasting a lot |
| 17 of CPU time on trying them out in the input stream. In other words, use this |
| 18 feature with care. Manually screening the resulting dictionary is almost |
| 19 always a necessity. |
| 20 |
| 21 As for the actual operation: the library stores tokens, without any deduping, |
| 22 by appending them to a file specified via AFL_TOKEN_FILE. If the variable is not |
| 23 set, the tool uses stderr (which is probably not what you want). |
| 24 |
| 25 Similarly to afl-tmin, the library is not "proprietary" and can be used with |
| 26 other fuzzers or testing tools without the need for any code tweaks. It does not |
| 27 require AFL-instrumented binaries to work. |
| 28 |
| 29 To use the library, you *need* to make sure that your fuzzing target is compiled |
| 30 with -fno-builtin and is linked dynamically. If you wish to automate the first |
| 31 part without mucking with CFLAGS in Makefiles, you can set AFL_NO_BUILTIN=1 |
| 32 when using afl-gcc. This setting specifically adds the following flags: |
| 33 |
| 34 -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strcasecmp |
| 35 -fno-builtin-strncasecmp -fno-builtin-memcmp |
| 36 |
| 37 The next step is simply loading this library via LD_PRELOAD. The optimal usage |
| 38 pattern is to allow afl-fuzz to fuzz normally for a while and build up a corpus, |
| 39 and then fire off the target binary, with libtokencap.so loaded, on every file |
| 40 found by AFL in that earlier run. This demonstrates the basic principle: |
| 41 |
| 42 export AFL_TOKEN_FILE=$PWD/temp_output.txt |
| 43 |
| 44 for i in <out_dir>/queue/id*; do |
| 45 LD_PRELOAD=/path/to/libtokencap.so \ |
| 46 /path/to/target/program [...params, including $i...] |
| 47 done |
| 48 |
| 49 sort -u temp_output.txt >afl_dictionary.txt |
| 50 |
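Since the raw token list usually needs screening before it is handed to
afl-fuzz via -x, a first mechanical pass can weed out the obvious noise. The
helper below is only a sketch: the name screen_tokens and the length thresholds
are hypothetical, and it assumes one token per line, as in the output of the
sort -u step. It is not a substitute for reading the list by hand.

```shell
# Hypothetical helper: keep only lines whose length falls within
# [min, max] characters, then sort and deduplicate them.
# Usage: screen_tokens MIN MAX <raw_tokens.txt >screened_tokens.txt
screen_tokens() {
  min=${1:-2}    # arbitrary default, tune per target
  max=${2:-32}   # arbitrary default, tune per target
  awk -v min="$min" -v max="$max" \
    'length($0) >= min && length($0) <= max' |
    LC_ALL=C sort -u   # byte-wise ordering, independent of the locale
}
```

For example, "screen_tokens 3 16 <temp_output.txt >afl_dictionary.txt" drops
very short and very long entries in one go; anything that plainly belongs to
config parsing rather than input parsing still has to be removed manually.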
| 51 If you don't get any results, the target library is probably not using strcmp() |
| 52 and memcmp() to parse input; or you haven't compiled it with -fno-builtin; or |
| 53 the whole thing isn't dynamically linked, and LD_PRELOAD is having no effect. |
| 54 |
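One rough way to narrow down which of these causes applies is to look at the
dynamic symbols the target imports. The sketch below assumes GNU binutils is
installed; list_cmp_imports is a hypothetical helper name, and it simply
filters the text that "nm -D --undefined-only" prints.

```shell
# Hypothetical helper: read `nm -D --undefined-only` output on stdin and
# print the comparison functions that appear as undefined (imported) symbols.
list_cmp_imports() {
  awk '$1 == "U" {
    name = $2
    sub(/@.*/, "", name)   # strip a symbol version suffix such as @GLIBC_2.2.5
    if (name ~ /^(strcmp|strncmp|strcasecmp|strncasecmp|memcmp)$/) print name
  }'
}
```

Typical use would be:
"nm -D --undefined-only /path/to/target/program | list_cmp_imports".
Empty output suggests that the comparisons were inlined as builtins, that the
binary is statically linked, or that the parsing code lives in a shared library
that should be inspected instead.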
| 55 PS. The library is Linux-only because there is probably no particularly portable |
| 56 and non-invasive way to distinguish between read-only and read-write memory |
| 57 mappings. The __tokencap_load_mappings() function is the only thing that would |
| 58 need to be changed for other OSes. Porting to platforms with /proc/<pid>/maps |
| 59 (e.g., FreeBSD) should be trivial. |
| 60 |
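To illustrate the read-only versus read-write distinction mentioned above, the
following sketch filters text in the Linux /proc/<pid>/maps format. It is not
how the library itself does it (that logic lives in C, in
__tokencap_load_mappings()); list_ro_mappings is a hypothetical name used only
for this example.

```shell
# Hypothetical helper: given /proc/<pid>/maps-style text on stdin, print the
# address ranges of mappings that are readable but not writable.
# The second field holds the permissions, e.g. "r-xp" or "rw-p".
list_ro_mappings() {
  awk '$2 ~ /^r-/ { print $1 }'
}
```

Running "list_ro_mappings </proc/self/maps" on Linux lists the ranges for the
current process.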