|
|
Created:
5 years, 4 months ago by alexmos Modified:
5 years, 4 months ago CC:
chromium-reviews, darin-cc_chromium.org, jam; site-isolation-reviews_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionFix flakiness in postMessage-related SitePerProcessBrowserTests.
See https://crbug.com/518729 for the complete description.
This CL fixes two underlying issues for flakiness in
SitePerProcessBrowserTest.PostMessageWithSubframeOnOpenerChain:
1. ExecuteScript* wasn't properly dealing with
domAutomationController messages that might have arrived right after
the expected response to ExecuteScript*, but before its message loop
shut down.
2. DOMMessageQueue was instantiated after the ExecuteScript* in the
test, causing it to never see the message it was waiting for if it
happened to arrive on ExecuteScript's message loop above.
BUG=518729
Committed: https://crrev.com/17e5753ea2e213295dc6a959be2ff286082faacd
Cr-Commit-Position: refs/heads/master@{#343548}
Patch Set 1 #
Total comments: 11
Patch Set 2 : Charlie's comments (and rebase) #
Messages
Total messages: 12 (3 generated)
alexmos@chromium.org changed reviewers: + creis@chromium.org
Charlie, can you please take a look? Comment #3 on https://crbug.com/518729 has the background for this. Not sure whether to re-enable PostMessageWithSubframeOnOpenerChain for Windows now or after landing this and confirming that the flakiness goes away on the flakiness dashboard.
I think this makes sense, but see my questions below. On 2015/08/14 22:15:07, alexmos wrote: > Charlie, can you please take a look? Comment #3 on https://crbug.com/518729 has > the background for this. > > Not sure whether to re-enable PostMessageWithSubframeOnOpenerChain for Windows > now or after landing this and confirming that the flakiness goes away on the > flakiness dashboard. Let's go ahead and re-enable it here. That gives us the best chance of noticing whether it goes flaky again or not. https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... File content/browser/site_per_process_browsertest.cc (right): https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... content/browser/site_per_process_browsertest.cc:52: // Subtle: msg_queue needs to be declared before the ExecuteScript below. ..., or else it might miss the message of interest. https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... content/browser/site_per_process_browsertest.cc:53: // See https://crbug.com/518729. Looks like WebUIMojoTest.ConnectToApplication might have the same bug? https://codereview.chromium.org/1297713004/diff/1/content/public/test/browser... File content/public/test/browser_test_utils.cc (right): https://codereview.chromium.org/1297713004/diff/1/content/public/test/browser... content/public/test/browser_test_utils.cc:78: if (!did_respond_) { I'm a bit concerned about this one, but partly because I don't fully understand the message loop behavior. I'm guessing that after the Quit call below, the message loop continues to process any messages that have already queued up but stops queueing new messages. Is that correct? Won't this change mean that any queued up messages after the first DOM_OPERATION_RESPONSE will get consumed here and thus get ignored / dropped on the floor? I think that will make your msg_queue.WaitForMessage line time out as well. Or do such messages go to both observer and the DOMMessageQueue you create in the test? In that case, this one ignores it and the DOMMessageQueue processes it, and the test would pass. Is that what happens?
nasko@chromium.org changed reviewers: + nasko@chromium.org
A small drive-by nit. https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... File content/browser/site_per_process_browsertest.cc (right): https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... content/browser/site_per_process_browsertest.cc:54: content::DOMMessageQueue msg_queue; nit: no need for content:: prefix, as it is already in content namespace.
> > Not sure whether to re-enable PostMessageWithSubframeOnOpenerChain for Windows > > now or after landing this and confirming that the flakiness goes away on the > > flakiness dashboard. > > Let's go ahead and re-enable it here. That gives us the best chance of noticing > whether it goes flaky again or not. Done. https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... File content/browser/site_per_process_browsertest.cc (right): https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... content/browser/site_per_process_browsertest.cc:52: // Subtle: msg_queue needs to be declared before the ExecuteScript below. On 2015/08/14 22:54:13, Charlie Reis wrote: > ..., or else it might miss the message of interest. Done. https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... content/browser/site_per_process_browsertest.cc:53: // See https://crbug.com/518729. On 2015/08/14 22:54:13, Charlie Reis wrote: > Looks like WebUIMojoTest.ConnectToApplication might have the same bug? That one actually seems fine to me - it's using DOMMessageQueue after NavigateToURL, which doesn't mess with domAutomationController messages, right? I did check other uses of it, and I actually didn't find anything that's broken in the same way, at least nothing obvious. The tests I checked are fine because they declare the queue before ExecuteScript*, if using the latter. https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... content/browser/site_per_process_browsertest.cc:54: content::DOMMessageQueue msg_queue; On 2015/08/14 23:21:13, nasko wrote: > nit: no need for content:: prefix, as it is already in content namespace. Done. Thanks for noticing! https://codereview.chromium.org/1297713004/diff/1/content/public/test/browser... File content/public/test/browser_test_utils.cc (right): https://codereview.chromium.org/1297713004/diff/1/content/public/test/browser... content/public/test/browser_test_utils.cc:78: if (!did_respond_) { On 2015/08/14 22:54:13, Charlie Reis wrote: > I'm a bit concerned about this one, but partly because I don't fully understand > the message loop behavior. > > I'm guessing that after the Quit call below, the message loop continues to > process any messages that have already queued up but stops queueing new > messages. Is that correct? Yes, it definitely continues to process any messages that have already queued up, since I saw that while debugging. Looking at ExecuteScriptHelper, which uses this, DOMOperationObserver registers for these notifications in its constructor and removes them when it is destroyed, which would be after returning from WaitAndGetResponse (after the message loop has quit) and from ExecuteScriptHelper. > > Won't this change mean that any queued up messages after the first > DOM_OPERATION_RESPONSE will get consumed here and thus get ignored / dropped on > the floor? I think that will make your msg_queue.WaitForMessage line time out > as well. > > Or do such messages go to both observer and the DOMMessageQueue you create in > the test? In that case, this one ignores it and the DOMMessageQueue processes > it, and the test would pass. Is that what happens? It is the latter: the messages will go to both the observer and the DOMMessageQueue I create in the test, since DOMMessageQueue registers for these same notifications in its constructor. In the race condition case, this one will ignore it, and the queue will pick it up. Declaring the queue earlier does mean that it will now always pick up the return-value message from the preceding ExecuteScript that it doesn't care about, but that's already handled properly in the WaitForMessage while loop.
Thanks, LGTM. https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... File content/browser/site_per_process_browsertest.cc (right): https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... content/browser/site_per_process_browsertest.cc:53: // See https://crbug.com/518729. On 2015/08/14 23:51:32, alexmos wrote: > On 2015/08/14 22:54:13, Charlie Reis wrote: > > Looks like WebUIMojoTest.ConnectToApplication might have the same bug? > > That one actually seems fine to me - it's using DOMMessageQueue after > NavigateToURL, which doesn't mess with domAutomationController messages, right? I'm not 100% sure, but that test doesn't do anything between creating the queue and waiting for something. The message it waits for comes from the page that it navigates to. Couldn't that queue up before the DOMMessageQueue is created, the same way we see here? At any rate, that doesn't need to be fixed in this CL, if it is indeed a problem. > > I did check other uses of it, and I actually didn't find anything that's broken > in the same way, at least nothing obvious. The tests I checked are fine because > they declare the queue before ExecuteScript*, if using the latter. https://codereview.chromium.org/1297713004/diff/1/content/public/test/browser... File content/public/test/browser_test_utils.cc (right): https://codereview.chromium.org/1297713004/diff/1/content/public/test/browser... content/public/test/browser_test_utils.cc:78: if (!did_respond_) { On 2015/08/14 23:51:32, alexmos wrote: > On 2015/08/14 22:54:13, Charlie Reis wrote: > > I'm a bit concerned about this one, but partly because I don't fully > understand > > the message loop behavior. > > > > I'm guessing that after the Quit call below, the message loop continues to > > process any messages that have already queued up but stops queueing new > > messages. Is that correct? > > Yes, it definitely continues to process any messages that have already queued > up, since I saw that while debugging. Looking at ExecuteScriptHelper, which > uses this, DOMOperationObserver registers for these notifications in its > constructor and removes them when it is destroyed, which would be after > returning from WaitAndGetResponse (after the message loop has quit) and from > ExecuteScriptHelper. > > > > > Won't this change mean that any queued up messages after the first > > DOM_OPERATION_RESPONSE will get consumed here and thus get ignored / dropped > on > > the floor? I think that will make your msg_queue.WaitForMessage line time out > > as well. > > > > Or do such messages go to both observer and the DOMMessageQueue you create in > > the test? In that case, this one ignores it and the DOMMessageQueue processes > > it, and the test would pass. Is that what happens? > > It is the latter: the messages will go to both the observer and the > DOMMessageQueue I create in the test, since DOMMessageQueue registers for these > same notifications in its constructor. In the race condition case, this one > will ignore it, and the queue will pick it up. > > Declaring the queue earlier does mean that it will now always pick up the > return-value message from the preceding ExecuteScript that it doesn't care > about, but that's already handled properly in the WaitForMessage while loop. Ok, sounds good.
Thanks! https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... File content/browser/site_per_process_browsertest.cc (right): https://codereview.chromium.org/1297713004/diff/1/content/browser/site_per_pr... content/browser/site_per_process_browsertest.cc:53: // See https://crbug.com/518729. On 2015/08/15 00:09:50, Charlie Reis wrote: > On 2015/08/14 23:51:32, alexmos wrote: > > On 2015/08/14 22:54:13, Charlie Reis wrote: > > > Looks like WebUIMojoTest.ConnectToApplication might have the same bug? > > > > That one actually seems fine to me - it's using DOMMessageQueue after > > NavigateToURL, which doesn't mess with domAutomationController messages, > right? > > I'm not 100% sure, but that test doesn't do anything between creating the queue > and waiting for something. > > The message it waits for comes from the page that it navigates to. Couldn't > that queue up before the DOMMessageQueue is created, the same way we see here? > > At any rate, that doesn't need to be fixed in this CL, if it is indeed a > problem. Ah, yes, it may be possible that the message could be sent from the page before NavigateToURL returns, in which case it wouldn't get into the queue, since the queue isn't created yet. I didn't try to dig deeper into how the test is structured (i.e., what other conditions must be true for the message to be sent) to tell whether that's actually possible, as it gets quite involved with mojo stuff. I couldn't find it on the flakiness dashboard, so at least that's a good sign.
The CQ bit was checked by alexmos@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1297713004/20001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1297713004/20001
Message was sent while issue was closed.
Committed patchset #2 (id:20001)
Message was sent while issue was closed.
Patchset 2 (id:??) landed as https://crrev.com/17e5753ea2e213295dc6a959be2ff286082faacd Cr-Commit-Position: refs/heads/master@{#343548} |