|
|
Created:
4 years, 4 months ago by Charlie Harrison Modified:
4 years, 3 months ago CC:
chromium-reviews, Yoav Weiss, blink-reviews-html_chromium.org, loading-reviews+parser_chromium.org, dglazkov+blink, blink-reviews, kinuko+watch Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionAdd fuzzer for HTMLPreloadScanner
BUG=
Committed: https://crrev.com/8292c59774cac439c34c0692d22e321e2f6e8b18
Cr-Commit-Position: refs/heads/master@{#416318}
Patch Set 1 #
Total comments: 4
Patch Set 2 : kouhei first look #Patch Set 3 : rebase #
Total comments: 2
Patch Set 4 : add include #
Messages
Total messages: 35 (18 generated)
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
csharrison@chromium.org changed reviewers: + kouhei@chromium.org
Non-urgent request for second pair of eyes. Here's a fuzzer for the HTMLPreloadScanner (which will fuzz the tokenizer, css scanner, etc). I'm getting a crash after running it for a while which I can't reproduce in chrome, so I'm thinking that I may be breaking a code contract somewhere. Here is the crash stack trace: ==11614==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000028 (pc 0x7f7e09dfe2be bp 0x7ffcee0ab1a0 sp 0x7ffcee0ab180 T0) ==11614==The signal is caused by a READ memory access. ==11614==Hint: address points to the zero page. #0 0x7f7e09dfe2bd in ?? ./out/libfuzzer/../../third_party/WebKit/Source/wtf/RefPtr.h:59:43 #1 0x7f7e09dfdffd in ?? ./out/libfuzzer/../../third_party/WebKit/Source/wtf/text/WTFString.h:109:46 #2 0x7f7e09dfe47d in ?? ./out/libfuzzer/../../third_party/WebKit/Source/wtf/text/AtomicString.h:67:48 #3 0x7f7e0d00382e in ?? ./out/libfuzzer/../../third_party/WebKit/Source/core/html/parser/HTMLParserIdioms.cpp:438:64 #4 0x7f7e0d0853ee in ?? ./out/libfuzzer/../../third_party/WebKit/Source/core/html/parser/HTMLTokenizer.cpp:1549:9 #5 0x7f7e0d01ba68 in scanAndPreload ./out/libfuzzer/../../third_party/WebKit/Source/core/html/parser/HTMLPreloadScanner.cpp:799:26 #6 0x50cc43 in LLVMFuzzerTestOneInput ./out/libfuzzer/../../third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp:93:14 #7 0x50cfad in ?? ./out/libfuzzer/../../third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp:101:12 #8 0x556ce4 in ExecuteCallback ./out/libfuzzer/../../third_party/libFuzzer/src/FuzzerLoop.cpp:512:13 #9 0x552cae in RunOne ./out/libfuzzer/../../third_party/libFuzzer/src/FuzzerLoop.cpp:468:3 #10 0x55721a in ?? ./out/libfuzzer/../../third_party/libFuzzer/src/FuzzerLoop.cpp:489:7 #11 0x55e539 in MutateAndTestOne ./out/libfuzzer/../../third_party/libFuzzer/src/FuzzerLoop.cpp:702:5 #12 0x560eae in Loop ./out/libfuzzer/../../third_party/libFuzzer/src/FuzzerLoop.cpp:790:5 #13 0x51c159 in FuzzerDriver ./out/libfuzzer/../../third_party/libFuzzer/src/FuzzerDriver.cpp:417:7 #14 0x598ebe in ?? ./out/libfuzzer/../../third_party/libFuzzer/src/FuzzerMain.cpp:21:10 #15 0x7f7de7752f44 in __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:287:0 With input: 0x3c,0x6b,0x65,0xb6,0xb6,0xb6,0xb6,0x3c,0x10,0xb5,0x3c,0x21,0x3b,0x3b,0x3b,0x3b,0x3b,0x3b,0x3b,0x3b,0x3e,0x27,0x3c,0x65,0x28,0x29,0x2a,0x3a,0x1,0x21,0xc9,0x22,0x21,0x21,0x3c,0x3c,0x28,0xe8, <ke\xb6\xb6\xb6\xb6<\x10\xb5<!;;;;;;;;>'<e()*:\x01!\xc9\"!!<<(\xe8 Base64: PGtltra2tjwQtTwhOzs7Ozs7Ozs+JzxlKCkqOgEhySIhITw8KOg=
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
Wow. Thanks! https://codereview.chromium.org/2261873002/diff/1/third_party/WebKit/Source/c... File third_party/WebKit/Source/core/html/parser/HTMLPreloadScanner.h (right): https://codereview.chromium.org/2261873002/diff/1/third_party/WebKit/Source/c... third_party/WebKit/Source/core/html/parser/HTMLPreloadScanner.h:75: CachedDocumentParameters() {} public: static std::unique_ptr<CachedDocumentParameters> create() { return wrapUnique(new CachedDocumentParameters); } private: CachedDocumentParameters() = default; ? https://codereview.chromium.org/2261873002/diff/1/third_party/WebKit/Source/c... File third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp (right): https://codereview.chromium.org/2261873002/diff/1/third_party/WebKit/Source/c... third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp:47: { maybe simply unique_ptr<CachedDocumentParameters> createCachedDocumentParametersForFuzzing() { ... }? Also, please initialize the other fields.
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
Let's hold off on this review. I think there are subtle issues with how the unit tests are set up which are causing some fuzzers to crash. I have a really simple CSS parser fuzzer that is crashing on trivial input. I think the solution is to bring up more infrastructure so libfuzzers have the same environment as unit tests. https://codereview.chromium.org/2261873002/diff/1/third_party/WebKit/Source/c... File third_party/WebKit/Source/core/html/parser/HTMLPreloadScanner.h (right): https://codereview.chromium.org/2261873002/diff/1/third_party/WebKit/Source/c... third_party/WebKit/Source/core/html/parser/HTMLPreloadScanner.h:75: CachedDocumentParameters() {} On 2016/08/22 05:48:28, kouhei wrote: > public: > static std::unique_ptr<CachedDocumentParameters> create() { return > wrapUnique(new CachedDocumentParameters); } > > private: > CachedDocumentParameters() = default; > ? I'm inclined to keep a protected constructor here because fuzzers are optimized for stack allocations over heap allocations. Given your other comment, though, it may be better for style to just do this.
https://codereview.chromium.org/2261873002/diff/1/third_party/WebKit/Source/c... File third_party/WebKit/Source/core/html/parser/HTMLPreloadScanner.h (right): https://codereview.chromium.org/2261873002/diff/1/third_party/WebKit/Source/c... third_party/WebKit/Source/core/html/parser/HTMLPreloadScanner.h:75: CachedDocumentParameters() {} On 2016/08/23 01:32:28, Charlie Harrison wrote: > On 2016/08/22 05:48:28, kouhei wrote: > > public: > > static std::unique_ptr<CachedDocumentParameters> create() { return > > wrapUnique(new CachedDocumentParameters); } > > > > private: > > CachedDocumentParameters() = default; > > ? > > I'm inclined to keep a protected constructor here because fuzzers are optimized > for stack allocations over heap allocations. > > Given your other comment, though, it may be better for style to just do this. I'm fine either way. Not a blocker. However we seem to be heap alloc-ing this anyway in Fuzzer.cpp: "CachedDocumentParametersForFuzzing* documentParameters = new CachedDocumentParametersForFuzzing(fuzzedData);"
FYI I have rebased this on a dependent CL adding more fuzzing infra to blink. Feel free to review. I would also be fine simplifying this and just using a String::fromUTF8WithLatin1Fallback(<raw_data>) here instead of fuzzing the TextResourceDecoder.
lgtm
csharrison@chromium.org changed reviewers: + mmoroz@chromium.org
mmoroz@ would you review this fuzzer too? It also relies on test infra from the dependent patch (which hopefully will land soon).
LGTM https://codereview.chromium.org/2261873002/diff/40001/third_party/WebKit/Sour... File third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp (right): https://codereview.chromium.org/2261873002/diff/40001/third_party/WebKit/Sour... third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp:66: CString bytes = fuzzedData.ConsumeRemainingBytes(); Can we estimate how many bytes will be read from the |data| until that line? What I'm concerned about: we have a plenty of html-documents as a seed corpus, that's great. But a couple of first bytes of these documents will be used to initialize document's and parser's options. Thus, |ConsumeRemainingBytes()| will always return a sort of invalid HTML for testcases from seed corpus. Is it be a problem for |scanAndPreload()| call or not? My concern is not a blocker at all, just a thing which may worth to observe. I think we should land it and see how it goes, then we may upgrade the corpus and/or add a dictionary as an improvement (if it will make sense, of course).
https://codereview.chromium.org/2261873002/diff/40001/third_party/WebKit/Sour... File third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp (right): https://codereview.chromium.org/2261873002/diff/40001/third_party/WebKit/Sour... third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp:66: CString bytes = fuzzedData.ConsumeRemainingBytes(); On 2016/08/30 19:01:53, mmoroz wrote: > Can we estimate how many bytes will be read from the |data| until that line? > > What I'm concerned about: we have a plenty of html-documents as a seed corpus, > that's great. But a couple of first bytes of these documents will be used to > initialize document's and parser's options. Thus, |ConsumeRemainingBytes()| will > always return a sort of invalid HTML for testcases from seed corpus. Is it be a > problem for |scanAndPreload()| call or not? > > My concern is not a blocker at all, just a thing which may worth to observe. I > think we should land it and see how it goes, then we may upgrade the corpus > and/or add a dictionary as an improvement (if it will make sense, of course). Anywhere from 14-78 bytes will be consumed. It's possible that fuzzing the TextResourceDecoder is overkill for this fuzzer, as I mentioned in a previous comment. I'm happy to re-evaluate at a later point though. Note that the ConsumeBool and ConsumeUint32InRange calls take data from the end of the fuzzed data, not the beginning. Your concern is very valid. How big a deal is it for seed corpuses to be slightly invalid? I assumed this is not a big deal because tiny deltas like this are easy for a fuzzer to fix up eventually.
On 2016/08/30 19:22:35, Charlie Harrison wrote: > https://codereview.chromium.org/2261873002/diff/40001/third_party/WebKit/Sour... > File third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp > (right): > > https://codereview.chromium.org/2261873002/diff/40001/third_party/WebKit/Sour... > third_party/WebKit/Source/core/html/parser/HTMLPreloadScannerFuzzer.cpp:66: > CString bytes = fuzzedData.ConsumeRemainingBytes(); > On 2016/08/30 19:01:53, mmoroz wrote: > > Can we estimate how many bytes will be read from the |data| until that line? > > > > What I'm concerned about: we have a plenty of html-documents as a seed corpus, > > that's great. But a couple of first bytes of these documents will be used to > > initialize document's and parser's options. Thus, |ConsumeRemainingBytes()| > will > > always return a sort of invalid HTML for testcases from seed corpus. Is it be > a > > problem for |scanAndPreload()| call or not? > > > > My concern is not a blocker at all, just a thing which may worth to observe. I > > think we should land it and see how it goes, then we may upgrade the corpus > > and/or add a dictionary as an improvement (if it will make sense, of course). > > Anywhere from 14-78 bytes will be consumed. It's possible that fuzzing the > TextResourceDecoder is overkill for this fuzzer, as I mentioned in a previous > comment. I'm happy to re-evaluate at a later point though. > > Note that the ConsumeBool and ConsumeUint32InRange calls take data from the end > of the fuzzed data, not the beginning. > > Your concern is very valid. How big a deal is it for seed corpuses to be > slightly invalid? I assumed this is not a big deal because tiny deltas like this > are easy for a fuzzer to fix up eventually. Yes, I also believe that it shouldn't be a serious issue. LibFuzzer is pretty smart to deal with it, especially when running at a big scale :) Thanks for noting that ConsumeBool and ConsumeUint32InRange calls take data from the end. That's great, I didn't know it!
After thinking about it I'd like to move this patchset away from using the TextResourceDecoderForFuzzing, and just using String::FromUtf8WithLatin1Fallback for simplicity and so the input strings closer to html. I don't think composing these two fuzzers really gives us much.
On 2016/08/31 13:51:07, Charlie Harrison wrote: > After thinking about it I'd like to move this patchset away from using the > TextResourceDecoderForFuzzing, and just using String::FromUtf8WithLatin1Fallback > for simplicity and so the input strings closer to html. > > I don't think composing these two fuzzers really gives us much. I changed my mind on this. Let's just land and see how things go.
The CQ bit was checked by csharrison@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by csharrison@chromium.org
The CQ bit was checked by csharrison@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from mmoroz@chromium.org, kouhei@chromium.org Link to the patchset: https://codereview.chromium.org/2261873002/#ps60001 (title: "add include")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Committed patchset #4 (id:60001)
Message was sent while issue was closed.
Description was changed from ========== Add fuzzer for HTMLPreloadScanner BUG= ========== to ========== Add fuzzer for HTMLPreloadScanner BUG= Committed: https://crrev.com/8292c59774cac439c34c0692d22e321e2f6e8b18 Cr-Commit-Position: refs/heads/master@{#416318} ==========
Message was sent while issue was closed.
Patchset 4 (id:??) landed as https://crrev.com/8292c59774cac439c34c0692d22e321e2f6e8b18 Cr-Commit-Position: refs/heads/master@{#416318} |