|
|
Created:
4 years, 5 months ago by dcheng Modified:
4 years, 4 months ago CC:
chromium-reviews, blink-reviews, loading-reviews_chromium.org, tyoshino+watch_chromium.org, Nate Chapin, gavinp+loader_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionMove FrameLoader completion check timer to loading task runner.
WebFrameScheduler is also no longer lazily created: this avoids
problems with the frame scheduler being unexpectedly recreated
during frame detach, which often leads to use-after-frees.
BUG=624694
Committed: https://crrev.com/587adda57c8c304adcd080b5f1d6bad49ce588fd
Cr-Commit-Position: refs/heads/master@{#407767}
Patch Set 1 #Patch Set 2 : Don't lazy create the frame scheduler #
Messages
Total messages: 31 (18 generated)
The CQ bit was checked by dcheng@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
dcheng@chromium.org changed reviewers: + alexclarke@chromium.org
dcheng@chromium.org changed reviewers: + haraken@chromium.org
+haraken for platform changes.
On 2016/07/22 07:04:40, dcheng wrote: > +haraken for platform changes. LGTM on my side.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
Hmm, it appears this makes some tests unhappy. Investigating...
Let me know when the tests are happy and I'll take a look.
I poked at this a bit more today, focusing on the unit tests first. The problem is a UaF during test shutdown. I'm still working on figuring out why my CL creates this issue though. When the VirtualTimeTest.AllowVirtualTimeToAdvance finishes running, the first thing that happens is the SimTest framework is shutdown. This destroys the WebViewImpl, its WebViewScheduler, and its AutoAdvancingVirtualTimeDomain. Later on, we shutdown Blink, which calls RendererSchedulerImpl::Shutdown(). This ends up destroying the TaskQueueManager. However, the TaskQueueImpl holds a raw TimeDomain* pointer to the already deleted AutoAdvancingVirtualTimeDomain, and everything explodes when we try to call into the already-deleted object. I'm not yet entirely sure how to fix this, and I notice //components/scheduler doesn't have a README. Are there some docs I can read to understand how this architecture all works together? Also, as an aside, why is this in //components instead of //content? Do we expect to reuse these components for iOS? They seem quite Blink-specific.
Description was changed from ========== Move FrameLoader completion check timer to loading task runner. BUG=624694 ========== to ========== Move FrameLoader completion check timer to loading task runner. BUG=624694 ==========
alexclarke@chromium.org changed reviewers: + skyostil@chromium.org
On 2016/07/25 15:51:06, dcheng wrote: > I poked at this a bit more today, focusing on the unit tests first. The problem > is a UaF during test shutdown. I'm still working on figuring out why my CL > creates this issue though. > > When the VirtualTimeTest.AllowVirtualTimeToAdvance finishes running, the first > thing that happens is the SimTest framework is shutdown. This destroys the > WebViewImpl, its WebViewScheduler, and its AutoAdvancingVirtualTimeDomain. > > Later on, we shutdown Blink, which calls RendererSchedulerImpl::Shutdown(). This > ends up destroying the TaskQueueManager. However, the TaskQueueImpl holds a raw > TimeDomain* pointer to the already deleted AutoAdvancingVirtualTimeDomain, and > everything explodes when we try to call into the already-deleted object. Oh I see. Sorry about that, I'll try and fix that tomorrow. > I'm not yet entirely sure how to fix this, and I notice //components/scheduler > doesn't have a README. Are there some docs I can read to understand how this > architecture all works together? There's some discussion of VirtualTime here https://docs.google.com/document/d/1y9KDT_ZEzT7pBeY6uzVt1dgKlwc1OB_vY4NZO1zBQ... I don't think there's a current accurate overview of the Scheduler architecture. We should probably write one. > > Also, as an aside, why is this in //components instead of //content? Do we > expect to reuse these components for iOS? They seem quite Blink-specific. That's a long story, as it happens the scheduler was in //content at one time. Anyway we have plans to move it back into blink where it was originally.
Alright, (at least one of the) other issues is due to a use-after-free of the frame's BlameContext. It's a bit hard to repro: I wasn't able to trigger it when running only one test, but when I use an asan build, the following command: out_asan/Release/content_browsertests --gtest_filter=SitePerProcess* --test-launcher-bot-mode --test-launcher-retry-limit=0 |& tools/valgrind/asan/asan_symbolize.py The stack shows that the freed pointer being derefed is a BlameContext, which is an unowned pointer that (optionally?) is associated with a TaskQueueImpl. When we detach a frame, we call into the embedder (content::RenderFrameImpl::frameDetached) to do some cleanup. This destroys the FrameBlameContext. The corresponding Timer in FrameLoader is also cancelled, and the queued Task is abandoned. However, queued Tasks are still in the TaskQueueImpl: when it tries to process the pending tasks, it calls NotifyWillProcessTask, which references the already deleted FrameBlameContext. This is a bit surprising, because it looks like we go to a lot of effort to try to clean up the WebFrameSchedulerImpl when we detach: in fact, we reset the pointer twice. In theory, this should delete the associated TaskQueues and unregister it from the TaskQueueManager so new tasks don't get queued. 1) I'm not sure why we lazily-create the frame scheduler, shouldn't we just eagerly initialize it when we create the frame? 2) Similarly, I think we should just reset the frame scheduler later in LocalFrame::detach to avoid the need to double-reset.
The CQ bit was checked by dcheng@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Well, it looks like not lazily creating the scheduler fixes all the problems (including the unit tests). PTAL!
Description was changed from ========== Move FrameLoader completion check timer to loading task runner. BUG=624694 ========== to ========== Move FrameLoader completion check timer to loading task runner. WebFrameScheduler also no longer lazily created: this avoids problems with the frame scheduler being unexpectedly recreated during frame detach, which often leads to use-after-frees. BUG=624694 ==========
Description was changed from ========== Move FrameLoader completion check timer to loading task runner. WebFrameScheduler also no longer lazily created: this avoids problems with the frame scheduler being unexpectedly recreated during frame detach, which often leads to use-after-frees. BUG=624694 ========== to ========== Move FrameLoader completion check timer to loading task runner. WebFrameScheduler is also no longer lazily created: this avoids problems with the frame scheduler being unexpectedly recreated during frame detach, which often leads to use-after-frees. BUG=624694 ==========
On 2016/07/26 11:05:31, dcheng wrote: > Well, it looks like not lazily creating the scheduler fixes all the problems > (including the unit tests). PTAL! Oh interesting, that matches more closely the non-test behavior. LGTM
The CQ bit was checked by dcheng@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from haraken@chromium.org Link to the patchset: https://codereview.chromium.org/2172153002/#ps20001 (title: "Don't lazy create the frame scheduler")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Move FrameLoader completion check timer to loading task runner. WebFrameScheduler is also no longer lazily created: this avoids problems with the frame scheduler being unexpectedly recreated during frame detach, which often leads to use-after-frees. BUG=624694 ========== to ========== Move FrameLoader completion check timer to loading task runner. WebFrameScheduler is also no longer lazily created: this avoids problems with the frame scheduler being unexpectedly recreated during frame detach, which often leads to use-after-frees. BUG=624694 ==========
Message was sent while issue was closed.
Committed patchset #2 (id:20001)
Message was sent while issue was closed.
Description was changed from ========== Move FrameLoader completion check timer to loading task runner. WebFrameScheduler is also no longer lazily created: this avoids problems with the frame scheduler being unexpectedly recreated during frame detach, which often leads to use-after-frees. BUG=624694 ========== to ========== Move FrameLoader completion check timer to loading task runner. WebFrameScheduler is also no longer lazily created: this avoids problems with the frame scheduler being unexpectedly recreated during frame detach, which often leads to use-after-frees. BUG=624694 Committed: https://crrev.com/587adda57c8c304adcd080b5f1d6bad49ce588fd Cr-Commit-Position: refs/heads/master@{#407767} ==========
Message was sent while issue was closed.
Patchset 2 (id:??) landed as https://crrev.com/587adda57c8c304adcd080b5f1d6bad49ce588fd Cr-Commit-Position: refs/heads/master@{#407767} |