|
|
Chromium Code Reviews|
Created:
5 years ago by falken Modified:
5 years ago CC:
chromium-reviews, michaeln, jsbell+serviceworker_chromium.org, tyoshino+watch_chromium.org, serviceworker-reviews, creis+watch_chromium.org, tzik, nasko+codewatch_chromium.org, jam, nhiroki, loading-reviews+fetch_chromium.org, darin-cc_chromium.org, gavinp+loader_chromium.org, horo+watch_chromium.org, blink-reviews, kinuko+serviceworker, kinuko+watch, Nate Chapin, blink-worker-reviews_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
Description[service worker] Clear controller when set to invalid handle id
We try to escape an unresponsive SW for main resource
requests by sending a controller change IPC to the renderer with
a kInvalidServiceWorkerHandleId.
Before this patch, since we set the controller to an
"invalid handle" rather than null, some code still thinks
a controller exists, notably DocumentThreadableLoader thinks
it has a controller, which eventually causes assert failures.
This patch fixes things by setting the controller to null.
BUG=561988
Committed: https://crrev.com/f326225e72d172ec369cb7d551f8c8b4b963e4d7
Cr-Commit-Position: refs/heads/master@{#362624}
Patch Set 1 #
Total comments: 2
Patch Set 2 : fix controller #
Total comments: 6
Patch Set 3 : return null in Create/Adopt #
Messages
Total messages: 28 (12 generated)
Description was changed from ========== [service worker] Fix detach controller and fallback to network mechanism Before this patch, we tried to bail out of broken SW by sending a controller change IPC to the renderer while retrying the request without the SW on the browser side. This is flawed because: - It's racy: the renderer still thinks it's controlled by the service worker until the IPC arrives - It's buggy: even after the IPC arrives, because we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists. This patch takes the approach of asking the renderer to retry the request without the SW, much like the "fallback for CORS" mechanism. BUG=561988 ========== to ========== [service worker] Fix detach controller and fallback to network mechanism Before this patch, we tried to escape an unresponsive SW for main resource requests by sending a controller change IPC to the renderer while retrying the request without the SW on the browser side. This is flawed because: - It's racy: the renderer still thinks it's controlled by the service worker until the IPC arrives - It's buggy: even after the IPC arrives, because we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists. This patch takes the approach of asking the renderer to retry the request without the SW, much like the "fallback for CORS" mechanism. BUG=561988 ==========
Description was changed from ========== [service worker] Fix detach controller and fallback to network mechanism Before this patch, we tried to escape an unresponsive SW for main resource requests by sending a controller change IPC to the renderer while retrying the request without the SW on the browser side. This is flawed because: - It's racy: the renderer still thinks it's controlled by the service worker until the IPC arrives - It's buggy: even after the IPC arrives, because we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists. This patch takes the approach of asking the renderer to retry the request without the SW, much like the "fallback for CORS" mechanism. BUG=561988 ========== to ========== [service worker] Fix detach controller and fallback to network mechanism Before this patch, we tried to escape an unresponsive SW for main resource requests by sending a controller change IPC to the renderer while retrying the request on the browser side without the SW. This is flawed because: - It's racy: the renderer still thinks it's controlled by the service worker until the IPC arrives - It's buggy: even after the IPC arrives, because we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists. This patch takes the approach of asking the renderer to retry the request without the SW, much like the "fallback for CORS" mechanism. BUG=561988 ==========
falken@chromium.org changed reviewers: + clamy@chromium.org, horo@chromium.org
clamy, horo: I'd like to get early feedback on this WIP patch before going too far down this path. I'm especially not sure about the ASSERTs it's touching. Any ideas about an alternative approach would also be appreciated. It seems ideally the browser process should just say "stop this resource load, start completely over with a new one without the SW" if possible. https://codereview.chromium.org/1478103002/diff/1/content/browser/frame_host/... File content/browser/frame_host/navigation_handle_impl.cc (right): https://codereview.chromium.org/1478103002/diff/1/content/browser/frame_host/... content/browser/frame_host/navigation_handle_impl.cc:309: // DCHECK(!render_frame_host_); clamy: How bad is disabling this DCHECK? https://codereview.chromium.org/1478103002/diff/1/third_party/WebKit/Source/c... File third_party/WebKit/Source/core/fetch/ResourceLoader.cpp (right): https://codereview.chromium.org/1478103002/diff/1/third_party/WebKit/Source/c... third_party/WebKit/Source/core/fetch/ResourceLoader.cpp:352: // ASSERT(!m_request.skipServiceWorker()); This ASSERT is probably meaning to say "we must not have already tried to fall back". However, skipServiceWorker is set for any redirect in WebURLLoaderImpl::PopulateURLRequestForRedirect. This is actually misleading because it looks like the skip flag doesn't have an affect for main resource redirects: it's only looked at in ResourceHostMsg_RequestResource (which subresource redirects use) but not used in ResourceHostMsg_FollowRedirect (which main resource redirects use).
It turns out this is more complex than needed. I just have to fix the setController bug. Since the SetController IPC comes before the main resource response, the context will set the controller the null before creating the DocumentThreadableLoader, so it knows there is no service worker controller by the time it begins subresource loads.
Description was changed from ========== [service worker] Fix detach controller and fallback to network mechanism Before this patch, we tried to escape an unresponsive SW for main resource requests by sending a controller change IPC to the renderer while retrying the request on the browser side without the SW. This is flawed because: - It's racy: the renderer still thinks it's controlled by the service worker until the IPC arrives - It's buggy: even after the IPC arrives, because we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists. This patch takes the approach of asking the renderer to retry the request without the SW, much like the "fallback for CORS" mechanism. BUG=561988 ========== to ========== [service worker] Clear controller when set to invalid handle id We try to escape an unresponsive SW for main resource requests by sending a controller change IPC to the renderer with a kInvalidServiceWorkerHandleId. Before this patch, since we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists, notably DocumentThreadableLoader thinks it has a controller, which eventually causes assert failures. This patch fixes things by setting the controller to null. BUG=561988 ==========
falken@chromium.org changed reviewers: + nhiroki@chromium.org - clamy@chromium.org
updated patch and CL desc. nhiroki@ can you take a look?
The CQ bit was checked by falken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1478103002/20001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1478103002/20001
Thank you for working on this! https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... File content/child/service_worker/service_worker_dispatcher.cc (right): https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... content/child/service_worker/service_worker_dispatcher.cc:693: provider->second->OnSetControllerServiceWorker(nullptr); How about throwing away invalid reference handle concept like this? 1. ServiceWorkerRegistrationHandleReference::Adopt() returns nullptr if |info.handle_id| is invalid, 2. GetOrCreateServiceWorker() returns nullptr if the given handle reference is nullptr. 3. GetOrCreateServiceWorker() (d)checks the given handle's id is valid.
https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... File content/child/service_worker/service_worker_dispatcher.cc (right): https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... content/child/service_worker/service_worker_dispatcher.cc:692: if (info.handle_id == kInvalidServiceWorkerVersionId) Please add comment about when this happens. I think this happens only when failed to start SW while loading the main resource.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Reply only. https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... File content/child/service_worker/service_worker_dispatcher.cc (right): https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... content/child/service_worker/service_worker_dispatcher.cc:692: if (info.handle_id == kInvalidServiceWorkerVersionId) On 2015/11/27 10:03:26, horo wrote: > Please add comment about when this happens. > I think this happens only when failed to start SW while loading the main > resource. We also call NotifyControllerLost when we fail to read the SW from disk and it had controllees (in SWReadFromDiskJob). This could happen if a SW started up for the main resource load, then the thread got terminated (as it can due to idle timeout), then later we try to wake it up (subresource request, push notification) and fail to read from disk. That detach is also probably buggy. I think we'll have to work on that in a followup patch. Maybe the only safe thing to do in that case is kill the renderer. https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... content/child/service_worker/service_worker_dispatcher.cc:693: provider->second->OnSetControllerServiceWorker(nullptr); On 2015/11/27 09:46:01, nhiroki (ooo back on Dec. 4) wrote: > How about throwing away invalid reference handle concept like this? > > 1. ServiceWorkerRegistrationHandleReference::Adopt() returns nullptr if > |info.handle_id| is invalid, > 2. GetOrCreateServiceWorker() returns nullptr if the given handle reference is > nullptr. > 3. GetOrCreateServiceWorker() (d)checks the given handle's id is valid. Seems like a good idea, working on a patch to do this.
https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... File content/child/service_worker/service_worker_dispatcher.cc (right): https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... content/child/service_worker/service_worker_dispatcher.cc:692: if (info.handle_id == kInvalidServiceWorkerVersionId) On 2015/12/01 03:46:06, falken wrote: > On 2015/11/27 10:03:26, horo wrote: > > Please add comment about when this happens. > > I think this happens only when failed to start SW while loading the main > > resource. > > We also call NotifyControllerLost when we fail to read the SW from disk and it > had controllees (in SWReadFromDiskJob). This could happen if a SW started up for > the main resource load, then the thread got terminated (as it can due to idle > timeout), then later we try to wake it up (subresource request, push > notification) and fail to read from disk. > > That detach is also probably buggy. I think we'll have to work on that in a > followup patch. Maybe the only safe thing to do in that case is kill the > renderer. Actually it happens in several other cases, looking at when ServiceWorkerMsg_SetControllerServiceWorker is sent. Some notable places are: (1: Registration failed ServiceWorkerProviderHost::OnRegistrationFailed ServiceWorkerProviderHost::DisassociateRegistration ServiceWorkerProviderHost::SetControllerVersionAttribute Send ServiceWorkerMsg_SetControllerServiceWorker (2: claim) ServiceWorkerProviderHost::ClaimedByRegistration ServiceWorkerProviderHost::DisassociateRegistration ditto (3: after following redirect?) ServiceWorkerControlleeRequestHandler::PrepareForMainResource ServiceWorkerProviderHost::DisassociateRegistration ditto I think I'll need to follow up on whether these are safe. At a glance, (1) shouldn't be necessary since we don't associate a registration until after registration succeeded (although we call OnRegistrationFailed in the SWReadFromDiskJob case above), (2) should be mostly OK since we call AssociateRegistration immediately after that (but there's a window of time for badness between the IPCs arriving), (3) should occur before the main resource load.
Updated patch as per comments, PTAL.
The CQ bit was checked by falken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1478103002/40001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1478103002/40001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... File content/child/service_worker/service_worker_dispatcher.cc (right): https://codereview.chromium.org/1478103002/diff/20001/content/child/service_w... content/child/service_worker/service_worker_dispatcher.cc:692: if (info.handle_id == kInvalidServiceWorkerVersionId) On 2015/12/01 05:08:00, falken wrote: > On 2015/12/01 03:46:06, falken wrote: > > On 2015/11/27 10:03:26, horo wrote: > > > Please add comment about when this happens. > > > I think this happens only when failed to start SW while loading the main > > > resource. > > > > We also call NotifyControllerLost when we fail to read the SW from disk and it > > had controllees (in SWReadFromDiskJob). This could happen if a SW started up > for > > the main resource load, then the thread got terminated (as it can due to idle > > timeout), then later we try to wake it up (subresource request, push > > notification) and fail to read from disk. > > > > That detach is also probably buggy. I think we'll have to work on that in a > > followup patch. Maybe the only safe thing to do in that case is kill the > > renderer. > > Actually it happens in several other cases, looking at when > ServiceWorkerMsg_SetControllerServiceWorker is sent. Some notable places are: > > (1: Registration failed > ServiceWorkerProviderHost::OnRegistrationFailed > ServiceWorkerProviderHost::DisassociateRegistration > ServiceWorkerProviderHost::SetControllerVersionAttribute > Send ServiceWorkerMsg_SetControllerServiceWorker > > (2: claim) > ServiceWorkerProviderHost::ClaimedByRegistration > ServiceWorkerProviderHost::DisassociateRegistration > ditto > > (3: after following redirect?) > ServiceWorkerControlleeRequestHandler::PrepareForMainResource > ServiceWorkerProviderHost::DisassociateRegistration > ditto > > I think I'll need to follow up on whether these are safe. At a glance, (1) > shouldn't be necessary since we don't associate a registration until after > registration succeeded (although we call OnRegistrationFailed in the > SWReadFromDiskJob case above), (2) should be mostly OK since we call > AssociateRegistration immediately after that (but there's a window of time for > badness between the IPCs arriving), (3) should occur before the main resource > load. Thank you for the detailed explanation.
The CQ bit was checked by falken@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1478103002/40001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1478103002/40001
Message was sent while issue was closed.
Description was changed from ========== [service worker] Clear controller when set to invalid handle id We try to escape an unresponsive SW for main resource requests by sending a controller change IPC to the renderer with a kInvalidServiceWorkerHandleId. Before this patch, since we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists, notably DocumentThreadableLoader thinks it has a controller, which eventually causes assert failures. This patch fixes things by setting the controller to null. BUG=561988 ========== to ========== [service worker] Clear controller when set to invalid handle id We try to escape an unresponsive SW for main resource requests by sending a controller change IPC to the renderer with a kInvalidServiceWorkerHandleId. Before this patch, since we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists, notably DocumentThreadableLoader thinks it has a controller, which eventually causes assert failures. This patch fixes things by setting the controller to null. BUG=561988 ==========
Message was sent while issue was closed.
Committed patchset #3 (id:40001)
Message was sent while issue was closed.
Description was changed from ========== [service worker] Clear controller when set to invalid handle id We try to escape an unresponsive SW for main resource requests by sending a controller change IPC to the renderer with a kInvalidServiceWorkerHandleId. Before this patch, since we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists, notably DocumentThreadableLoader thinks it has a controller, which eventually causes assert failures. This patch fixes things by setting the controller to null. BUG=561988 ========== to ========== [service worker] Clear controller when set to invalid handle id We try to escape an unresponsive SW for main resource requests by sending a controller change IPC to the renderer with a kInvalidServiceWorkerHandleId. Before this patch, since we set the controller to an "invalid handle" rather than null, some code still thinks a controller exists, notably DocumentThreadableLoader thinks it has a controller, which eventually causes assert failures. This patch fixes things by setting the controller to null. BUG=561988 Committed: https://crrev.com/f326225e72d172ec369cb7d551f8c8b4b963e4d7 Cr-Commit-Position: refs/heads/master@{#362624} ==========
Message was sent while issue was closed.
Patchset 3 (id:??) landed as https://crrev.com/f326225e72d172ec369cb7d551f8c8b4b963e4d7 Cr-Commit-Position: refs/heads/master@{#362624} |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
