| Index: testing/libfuzzer/efficient_fuzzer.md
|
| diff --git a/testing/libfuzzer/efficient_fuzzer.md b/testing/libfuzzer/efficient_fuzzer.md
|
| index cd8cf74b11f6fa72fab0895422c6612583596ed9..5eb0b6ed2bd203fdae933653ef015cada526dcdb 100644
|
| --- a/testing/libfuzzer/efficient_fuzzer.md
|
| +++ b/testing/libfuzzer/efficient_fuzzer.md
|
| @@ -13,9 +13,9 @@ Corpus is usually maintained between multiple fuzzer runs.
|
|
|
| There are several metrics you should look at to determine your fuzzer's effectiveness:
|
|
|
| -* fuzzer speed (exec/s)
|
| -* corpus size
|
| -* coverage
|
| +* [fuzzer speed](#Fuzzer-Speed) (exec/s)
|
| +* [corpus size](#Corpus-Size)
|
| +* [coverage](#Coverage)
|
|
|
| You can collect these metrics manually or take them from [ClusterFuzz status]
|
| pages.
|
| @@ -32,6 +32,7 @@ Because libFuzzer performs randomized search, it is critical to have it as fast
|
| as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer
|
| using any standard tool to see where it spends its time.
|
|
|
| +
|
| ### Initialization/Cleanup
|
|
|
| Try to keep your fuzzing function as simple as possible. Prefer to use static
|
| @@ -41,11 +42,39 @@ every single run.
|
| Fuzzers don't have to shut down gracefully (we either kill them or they crash
|
| because a sanitizer has found a problem). You can skip freeing static resources.
|
|
|
| -Of course all resources allocated withing `LLVMFuzzerTestOneInput` function
|
| +Of course, all resources allocated within the `LLVMFuzzerTestOneInput` function
|
| should be deallocated since this function is called millions of times during
|
| one fuzzing session.
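
The initialization and cleanup advice above can be sketched as follows. This is a minimal illustration, not code from the patch: `Environment` is a hypothetical stand-in for any expensive, reusable resource, and the per-call `std::vector` only represents some allocation a target might need.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical expensive resource (e.g. a parser context). The counter exists
// only so the sketch can show that construction happens once.
struct Environment {
  Environment() { ++construct_count; }
  static int construct_count;
};
int Environment::construct_count = 0;

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Function-local static: built on the first call and deliberately never
  // freed, because the fuzzer process is killed rather than shut down.
  static Environment* env = new Environment();
  (void)env;

  // Per-call allocation: owned by a smart pointer so it is released on every
  // return path, no matter how many times the function is invoked.
  auto copy = std::make_unique<std::vector<uint8_t>>(data, data + size);
  if (copy->empty()) {
    return 0;
  }
  return 0;
}
```

The static pointer stays reachable for the life of the process, while everything allocated inside the function is gone by the time it returns.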
|
|
|
|
|
| +### Memory Usage
|
| +
|
| +Avoid dynamic memory allocation wherever possible. Instrumentation works
|
| +faster for stack-based and static objects than for heap-allocated ones.
|
| +
|
| +It is always a good idea to try different versions of a fuzzer and measure
|
| +which implementation is fastest.
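
One way to apply the memory usage advice is to copy the input into a fixed-size stack buffer instead of a heap-backed `std::string` or `std::vector`. The sketch below assumes a hypothetical target, `CountDigits`, that needs a NUL-terminated string, and a hypothetical bound `kMaxInput`; neither name comes from the patch.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical upper bound on the bytes the target looks at (assumption).
constexpr size_t kMaxInput = 256;

// Hypothetical function under test: counts ASCII digits in a C string.
size_t CountDigits(const char* text) {
  size_t digits = 0;
  for (; *text != '\0'; ++text) {
    if (*text >= '0' && *text <= '9') {
      ++digits;
    }
  }
  return digits;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Stack buffer: no heap allocation on any run. Inputs longer than
  // kMaxInput are truncated, which is a trade-off of this approach.
  std::array<char, kMaxInput + 1> buffer;
  const size_t length = std::min(size, kMaxInput);
  if (length > 0) {
    std::memcpy(buffer.data(), data, length);
  }
  buffer[length] = '\0';
  CountDigits(buffer.data());
  return 0;
}
```

Whether this is actually faster than a heap-based variant depends on the target, so it is worth measuring exec/s for both versions.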
|
| +
|
| +
|
| +### Maximum Testcase Length
|
| +
|
| +Experiment with different values of the `-max_len` parameter. This parameter
|
| +often, but not always, has a significant effect on execution speed.
|
| +
|
| +1) Decide which `-max_len` value is reasonable for your target. For example, it
|
| +may be useless to fuzz an image decoder with too small a testcase length.
|
| +
|
| +2) Increase the value chosen in the previous step and check how it affects the
|
| +fuzzer's execution speed. If speed doesn't drop significantly for long inputs,
|
| +it is fine to use a bigger value for `-max_len`.
|
| +
|
| +In general, a bigger `-max_len` value gives better coverage, and coverage is
|
| +the main priority for fuzzing. However, low execution speed may waste the
|
| +resources used for fuzzing. If large inputs make the fuzzer too slow, you have
|
| +to adjust the value of `-max_len` and find a trade-off between coverage and
|
| +execution speed.
|
| +
|
| +
|
| ## Corpus Size
|
|
|
| After running for a while, the fuzzer will reach a plateau and won't discover
|
| @@ -58,7 +87,7 @@ magic numbers etc. The easiest way to diagnose this problem is to generate a
|
| [coverage report](#Coverage). To fix the issue you can:
|
|
|
| * change the code (e.g. disable crc checks while fuzzing)
|
| -* prepare [corpus seed](#Corpus-Seed).
|
| +* prepare [corpus seed](#Corpus-Seed)
|
| * prepare [fuzzer dictionary](#Fuzzer-Dictionary)
|
|
|
| ## Coverage
|
| @@ -76,7 +105,7 @@ option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`).
|
| `sancov_path` can be omitted by adding the LLVM bin directory to the `PATH` environment
|
| variable.
|
|
|
| -## Corpus Seed
|
| +### Corpus Seed
|
|
|
| You can pass a corpus directory to a fuzzer that you run manually:
|
|
|
| @@ -87,12 +116,15 @@ You can pass a corpus directory to a fuzzer that you run manually:
|
| The directory can initially be empty. The fuzzer will store all the interesting
|
| items it finds in the directory. You can help the fuzzer by "seeding" the corpus:
|
| simply copy interesting inputs for your function to the corpus directory before
|
| -running. This works especially well for file-parsing functionality: just
|
| -use some valid files from your test suite.
|
| +running. This works especially well for strictly defined file formats or data
|
| +transmission protocols.
|
| +* For file-parsing functionality, just use some valid files from your test suite.
|
| +* For protocol-processing targets, put raw streams from your test suite into separate
|
| +files.
|
|
|
| After discovering new and interesting items, [upload corpus to ClusterFuzz].
|
|
|
| -## Fuzzer Dictionary
|
| +### Fuzzer Dictionary
|
|
|
| It is very useful to provide the fuzzer with a set of common words/values that you expect
|
| to find in the input. This greatly improves the efficiency of finding new units and
|
|
|