|
|
Created:
4 years, 2 months ago by alexmos Modified:
4 years, 2 months ago CC:
chromium-reviews, mlamouri+watch-content_chromium.org, creis+watch_chromium.org, mlamouri+watch-blink_chromium.org, nasko+codewatch_chromium.org, jam, Randy Smith (Not in Mondays), dcheng, blink-reviews, darin-cc_chromium.org, loading-reviews_chromium.org, kinuko+watch, mmenke, Charlie Reis, site-isolation-reviews_chromium.org Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionFix RenderView reuse issues when canceling a pending RenderFrameHost.
This CL fixes three issues that happen when a pending RenderFrameHost
is canceled and discarded, but its SiteInstance has other active
frames. In that case, the pending RenderFrameHost is replaced with a
proxy, so that other frames in the SiteInstance can still communicate
with it.
1) When a main frame pending RFH is canceled, we never update its
RVH's main_frame_routing_id() or is_active() status. If the RVH is
later reused by another main frame navigation, these parameters would
be stale, leading to a crash in CreateRenderView trying to resolve
the main frame via the stale routing ID. (Issue 627400)
2) The discarded RFH was deleted on the browser side, but the
corresponding RenderFrame wasn't deleted on the renderer side. That
led to a renderer-side crash of trying to reuse the leaked
RenderFrame's widget if we ever re-navigated to the same SiteInstance
in the same frame. The fix is to send a FrameMsg_Delete in the cases
where the discarded main frame isn't going to be cleaned up by its RVH
going away (in the case where the proxy is going to keep it
alive). (Issue 651980)
3) The replacement proxy's RenderFrameProxyHost was created, but the
RenderFrameProxy wasn't actually initialized on the renderer side, so
other frames couldn't actually communicate with this frame. (Issue
653746)
These changes are also likely to help with crashes we've been chasing
in issues 626802 and 575245.
BUG=627400, 651980, 653746, 626802, 575245
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation
Committed: https://crrev.com/78c9c0d7d8e45837bc93efababf7fe517b2b36ad
Cr-Commit-Position: refs/heads/master@{#425409}
Patch Set 1 #Patch Set 2 : Cleanup #Patch Set 3 : Rebase #Patch Set 4 : Fix compile #Patch Set 5 : Fix is_swapped_out() state on renderer side #Patch Set 6 : Nits #
Total comments: 18
Patch Set 7 : Address Nasko's and Charlie's feedback #Patch Set 8 : Rebase #
Messages
Total messages: 44 (37 generated)
Description was changed from ========== Fix RenderView reuse issues when canceling a pending RenderFrameHost. BUG=627400,651980,653746 ========== to ========== Fix RenderView reuse issues when canceling a pending RenderFrameHost. BUG=627400,651980,653746 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation ==========
The CQ bit was checked by alexmos@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...)
The CQ bit was checked by alexmos@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by alexmos@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by alexmos@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Description was changed from ========== Fix RenderView reuse issues when canceling a pending RenderFrameHost. BUG=627400,651980,653746 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation ========== to ========== Fix RenderView reuse issues when canceling a pending RenderFrameHost. This CL fixes three issues that happen when a pending RenderFrameHost is canceled and discarded, but its SiteInstance has other active frames. In that case, the pending RenderFrameHost is replaced with a proxy, so that other frames in the SiteInstance can still communicate with it. 1) When a main frame pending RFH is canceled, we never update its RVH's main_frame_routing_id() or is_active() status. If the RVH is later reused by another main frame navigation, these parameters would be stale, leading to a crash in CreateRenderView trying to resolve the main frame via the stale routing ID. (Issue 627400) 2) The discarded RFH was deleted on the browser side, but the corresponding RenderFrame wasn't deleted on the renderer side. That led to a renderer-side crash of trying to reuse the leaked RenderFrame's widget if we ever re-navigated to the same SiteInstance in the same frame. The fix is to send a FrameMsg_Delete in the cases where the discarded main frame isn't going to be cleaned up by its RVH going away (in the case where the proxy is going to keep it alive). (Issue 651980) 3) The replacement proxy's RenderFrameProxyHost was created, but the RenderFrameProxy wasn't actually initialized on the renderer side, so other frames couldn't actually communicate with this frame. (Issue 653746) BUG=627400,651980,653746 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation ==========
The CQ bit was checked by alexmos@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by alexmos@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== Fix RenderView reuse issues when canceling a pending RenderFrameHost. This CL fixes three issues that happen when a pending RenderFrameHost is canceled and discarded, but its SiteInstance has other active frames. In that case, the pending RenderFrameHost is replaced with a proxy, so that other frames in the SiteInstance can still communicate with it. 1) When a main frame pending RFH is canceled, we never update its RVH's main_frame_routing_id() or is_active() status. If the RVH is later reused by another main frame navigation, these parameters would be stale, leading to a crash in CreateRenderView trying to resolve the main frame via the stale routing ID. (Issue 627400) 2) The discarded RFH was deleted on the browser side, but the corresponding RenderFrame wasn't deleted on the renderer side. That led to a renderer-side crash of trying to reuse the leaked RenderFrame's widget if we ever re-navigated to the same SiteInstance in the same frame. The fix is to send a FrameMsg_Delete in the cases where the discarded main frame isn't going to be cleaned up by its RVH going away (in the case where the proxy is going to keep it alive). (Issue 651980) 3) The replacement proxy's RenderFrameProxyHost was created, but the RenderFrameProxy wasn't actually initialized on the renderer side, so other frames couldn't actually communicate with this frame. (Issue 653746) BUG=627400,651980,653746 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation ========== to ========== Fix RenderView reuse issues when canceling a pending RenderFrameHost. This CL fixes three issues that happen when a pending RenderFrameHost is canceled and discarded, but its SiteInstance has other active frames. In that case, the pending RenderFrameHost is replaced with a proxy, so that other frames in the SiteInstance can still communicate with it. 1) When a main frame pending RFH is canceled, we never update its RVH's main_frame_routing_id() or is_active() status. If the RVH is later reused by another main frame navigation, these parameters would be stale, leading to a crash in CreateRenderView trying to resolve the main frame via the stale routing ID. (Issue 627400) 2) The discarded RFH was deleted on the browser side, but the corresponding RenderFrame wasn't deleted on the renderer side. That led to a renderer-side crash of trying to reuse the leaked RenderFrame's widget if we ever re-navigated to the same SiteInstance in the same frame. The fix is to send a FrameMsg_Delete in the cases where the discarded main frame isn't going to be cleaned up by its RVH going away (in the case where the proxy is going to keep it alive). (Issue 651980) 3) The replacement proxy's RenderFrameProxyHost was created, but the RenderFrameProxy wasn't actually initialized on the renderer side, so other frames couldn't actually communicate with this frame. (Issue 653746) These changes are also likely to help with crashes we've been chasing in issues 626802 and 575245. BUG=627400,651980,653746,626802,575245 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation ==========
alexmos@chromium.org changed reviewers: + nasko@chromium.org
Nasko, can you please take a look? This has all the fixes related to RVH reuse after discarding a pending RFH that has other active frames in its SiteInstance. Charlie: I'm also CC'ing you; I know you'll be busy for the next couple of days, but feel free to also take a look if you get a chance, since you've reviewed similar fixes in this area in the past. https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... File content/browser/frame_host/render_frame_host_impl.cc (right): https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_impl.cc:350: frame_tree_node_->IsMainFrame() && render_view_host_->ref_count() == 1; Previously, the FrameMsg_Delete wasn't getting sent when deleting a main frame and relying on the RVH to clean it up. But in the case that pending RFH has other active frames in its SiteInstance, DiscardUnusedFrame is creating a RFPH in its place that reused that RVH, so the RVH wasn't going away, and we leaked the main frame RF. I'm capturing this here by checking if this frame is the last one holding on to this RVH. (Checking active_frame_count() == 0 would also work, I think.) https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... File content/browser/frame_host/render_frame_host_manager.cc (right): https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_manager.cc:679: rvh->set_is_active(false); Doing this is important in the case where the replacement proxy is created above; the RVH should be considered swapped out after this point. When there's no proxy created, this doesn't really matter, as this RVH will be destroyed shortly (since render_frame_host is its last active frame). https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_manager.cc:689: proxy->InitRenderFrameProxy(); I think it makes more sense to initialize the proxy after sending FrameMsg_Delete as part of destroying |render_frame_host| above. Doing it beforehand would make Blink think that we're creating a proxy and putting it into the tree when there's already a RenderFrame in place there, and without going through OnSwapOut. This seems weird, though in practice, the only thing that broke when I tried that was a check in DevTools when I had it open (DeviceOrientationInspectorAgent::controller assuming the page's mainFrame() is local) when subsequently destroying the RF.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
LGTM with a few nits. https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... File content/browser/frame_host/render_frame_host_impl.cc (right): https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_impl.cc:350: frame_tree_node_->IsMainFrame() && render_view_host_->ref_count() == 1; On 2016/10/12 20:29:46, alexmos wrote: > Previously, the FrameMsg_Delete wasn't getting sent when deleting a main frame > and relying on the RVH to clean it up. But in the case that pending RFH has > other active frames in its SiteInstance, DiscardUnusedFrame is creating a RFPH > in its place that reused that RVH, so the RVH wasn't going away, and we leaked > the main frame RF. I'm capturing this here by checking if this frame is the > last one holding on to this RVH. (Checking active_frame_count() == 0 would also > work, I think.) I like the more direct test for it - the ref count on the RVH. It is clear exactly what it means. https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_impl.cc:352: render_frame_created_) { nit: I'd put render_frame_created as the second bool to test, as it makes it a bit easier to follow. https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... File content/browser/frame_host/render_frame_host_manager.cc (right): https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_manager.cc:679: rvh->set_is_active(false); On 2016/10/12 20:29:46, alexmos wrote: > Doing this is important in the case where the replacement proxy is created > above; the RVH should be considered swapped out after this point. When there's > no proxy created, this doesn't really matter, as this RVH will be destroyed > shortly (since render_frame_host is its last active frame). Acknowledged. https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_manager.cc:689: proxy->InitRenderFrameProxy(); On 2016/10/12 20:29:46, alexmos wrote: > I think it makes more sense to initialize the proxy after sending > FrameMsg_Delete as part of destroying |render_frame_host| above. Doing it > beforehand would make Blink think that we're creating a proxy and putting it > into the tree when there's already a RenderFrame in place there, and without > going through OnSwapOut. This seems weird, though in practice, the only thing > that broke when I tried that was a check in DevTools when I had it open > (DeviceOrientationInspectorAgent::controller assuming the page's mainFrame() is > local) when subsequently destroying the RF. Acknowledged. https://codereview.chromium.org/2410153005/diff/100001/content/browser/site_p... File content/browser/site_per_process_browsertest.cc (right): https://codereview.chromium.org/2410153005/diff/100001/content/browser/site_p... content/browser/site_per_process_browsertest.cc:8390: // reuses the canceled RFH's RenderView), navigating to a main frame in the nit: RenderViewHost, since you mention RFH's. https://codereview.chromium.org/2410153005/diff/100001/content/browser/site_p... content/browser/site_per_process_browsertest.cc:8392: // here are similar to the test above, but don't involve crashing the renderer. nit: replace "test above" with the explicit name of the test, since code can be moved around or another test be inserted between the two. https://codereview.chromium.org/2410153005/diff/100001/content/renderer/rende... File content/renderer/render_widget.h (right): https://codereview.chromium.org/2410153005/diff/100001/content/renderer/rende... content/renderer/render_widget.h:188: void SetSwappedOut(bool is_swapped_out); You should reorder the .cc file to match the header file ordering. https://codereview.chromium.org/2410153005/diff/100001/content/renderer/rende... content/renderer/render_widget.h:190: bool is_swapped_out() { return is_swapped_out_; } rant: Ah! I can't wait to kill this boolean. It doesn't even belong on the widget to begin with.
creis@chromium.org changed reviewers: + creis@chromium.org
LGTM! Thanks so much for tracking these down. https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... File content/browser/frame_host/render_frame_host_manager.cc (right): https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_manager.cc:679: rvh->set_is_active(false); On 2016/10/13 21:26:28, nasko (slow) wrote: > On 2016/10/12 20:29:46, alexmos wrote: > > Doing this is important in the case where the replacement proxy is created > > above; the RVH should be considered swapped out after this point. When > there's > > no proxy created, this doesn't really matter, as this RVH will be destroyed > > shortly (since render_frame_host is its last active frame). > > Acknowledged. Could you add this as a comment in the code?
The CQ bit was checked by alexmos@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Thanks! All comments addressed below. https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... File content/browser/frame_host/render_frame_host_impl.cc (right): https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_impl.cc:352: render_frame_created_) { On 2016/10/13 21:26:28, nasko (slow) wrote: > nit: I'd put render_frame_created as the second bool to test, as it makes it a > bit easier to follow. Done. https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... File content/browser/frame_host/render_frame_host_manager.cc (right): https://codereview.chromium.org/2410153005/diff/100001/content/browser/frame_... content/browser/frame_host/render_frame_host_manager.cc:679: rvh->set_is_active(false); On 2016/10/14 16:25:10, Charlie Reis (Away till 10-14) wrote: > On 2016/10/13 21:26:28, nasko (slow) wrote: > > On 2016/10/12 20:29:46, alexmos wrote: > > > Doing this is important in the case where the replacement proxy is created > > > above; the RVH should be considered swapped out after this point. When > > there's > > > no proxy created, this doesn't really matter, as this RVH will be destroyed > > > shortly (since render_frame_host is its last active frame). > > > > Acknowledged. > > Could you add this as a comment in the code? Done. I also updated the (stale) comment on top of the if statement above to not talk about swapping out the pending RFH. https://codereview.chromium.org/2410153005/diff/100001/content/browser/site_p... File content/browser/site_per_process_browsertest.cc (right): https://codereview.chromium.org/2410153005/diff/100001/content/browser/site_p... content/browser/site_per_process_browsertest.cc:8390: // reuses the canceled RFH's RenderView), navigating to a main frame in the On 2016/10/13 21:26:28, nasko (slow) wrote: > nit: RenderViewHost, since you mention RFH's. Done. https://codereview.chromium.org/2410153005/diff/100001/content/browser/site_p... content/browser/site_per_process_browsertest.cc:8392: // here are similar to the test above, but don't involve crashing the renderer. On 2016/10/13 21:26:28, nasko (slow) wrote: > nit: replace "test above" with the explicit name of the test, since code can be > moved around or another test be inserted between the two. Done. https://codereview.chromium.org/2410153005/diff/100001/content/renderer/rende... File content/renderer/render_widget.h (right): https://codereview.chromium.org/2410153005/diff/100001/content/renderer/rende... content/renderer/render_widget.h:188: void SetSwappedOut(bool is_swapped_out); On 2016/10/13 21:26:28, nasko (slow) wrote: > You should reorder the .cc file to match the header file ordering. Done. (Though it's not perfect, as many other nearby functions are out of order.) https://codereview.chromium.org/2410153005/diff/100001/content/renderer/rende... content/renderer/render_widget.h:190: bool is_swapped_out() { return is_swapped_out_; } On 2016/10/13 21:26:28, nasko (slow) wrote: > rant: Ah! I can't wait to kill this boolean. It doesn't even belong on the > widget to begin with. Acknowledged.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds...) ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by alexmos@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by alexmos@chromium.org
The CQ bit was checked by alexmos@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from creis@chromium.org, nasko@chromium.org Link to the patchset: https://codereview.chromium.org/2410153005/#ps140001 (title: "Rebase")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Committed patchset #8 (id:140001)
Message was sent while issue was closed.
Description was changed from ========== Fix RenderView reuse issues when canceling a pending RenderFrameHost. This CL fixes three issues that happen when a pending RenderFrameHost is canceled and discarded, but its SiteInstance has other active frames. In that case, the pending RenderFrameHost is replaced with a proxy, so that other frames in the SiteInstance can still communicate with it. 1) When a main frame pending RFH is canceled, we never update its RVH's main_frame_routing_id() or is_active() status. If the RVH is later reused by another main frame navigation, these parameters would be stale, leading to a crash in CreateRenderView trying to resolve the main frame via the stale routing ID. (Issue 627400) 2) The discarded RFH was deleted on the browser side, but the corresponding RenderFrame wasn't deleted on the renderer side. That led to a renderer-side crash of trying to reuse the leaked RenderFrame's widget if we ever re-navigated to the same SiteInstance in the same frame. The fix is to send a FrameMsg_Delete in the cases where the discarded main frame isn't going to be cleaned up by its RVH going away (in the case where the proxy is going to keep it alive). (Issue 651980) 3) The replacement proxy's RenderFrameProxyHost was created, but the RenderFrameProxy wasn't actually initialized on the renderer side, so other frames couldn't actually communicate with this frame. (Issue 653746) These changes are also likely to help with crashes we've been chasing in issues 626802 and 575245. BUG=627400,651980,653746,626802,575245 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation ========== to ========== Fix RenderView reuse issues when canceling a pending RenderFrameHost. This CL fixes three issues that happen when a pending RenderFrameHost is canceled and discarded, but its SiteInstance has other active frames. In that case, the pending RenderFrameHost is replaced with a proxy, so that other frames in the SiteInstance can still communicate with it. 1) When a main frame pending RFH is canceled, we never update its RVH's main_frame_routing_id() or is_active() status. If the RVH is later reused by another main frame navigation, these parameters would be stale, leading to a crash in CreateRenderView trying to resolve the main frame via the stale routing ID. (Issue 627400) 2) The discarded RFH was deleted on the browser side, but the corresponding RenderFrame wasn't deleted on the renderer side. That led to a renderer-side crash of trying to reuse the leaked RenderFrame's widget if we ever re-navigated to the same SiteInstance in the same frame. The fix is to send a FrameMsg_Delete in the cases where the discarded main frame isn't going to be cleaned up by its RVH going away (in the case where the proxy is going to keep it alive). (Issue 651980) 3) The replacement proxy's RenderFrameProxyHost was created, but the RenderFrameProxy wasn't actually initialized on the renderer side, so other frames couldn't actually communicate with this frame. (Issue 653746) These changes are also likely to help with crashes we've been chasing in issues 626802 and 575245. BUG=627400,651980,653746,626802,575245 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation Committed: https://crrev.com/78c9c0d7d8e45837bc93efababf7fe517b2b36ad Cr-Commit-Position: refs/heads/master@{#425409} ==========
Message was sent while issue was closed.
Patchset 8 (id:??) landed as https://crrev.com/78c9c0d7d8e45837bc93efababf7fe517b2b36ad Cr-Commit-Position: refs/heads/master@{#425409} |