|
|
Created:
7 years, 7 months ago by Philippe Modified:
7 years, 7 months ago CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, jam, klobag.chromium, pasko-google - do not use Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionResume WebKit shared timers only in previously suspended processes.
WebKitPlatformSupportImpl (living in the renderer process) maintains a counter
of calls to ResumeSharedTimer() and SuspendSharedTimer(). ResumeSharedTimer()
naturally completes only if SuspendSharedTimer() was called previously. This
implies that the IPC triggering ResumeSharedTimer() must be sent only to the
renderer processes that were previously suspended as opposed to all the current
renderer processes.
BUG=b/8639231,b/8656786,225243
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=196825
Patch Set 1 #
Total comments: 6
Patch Set 2 : #Patch Set 3 : Address Jonathan's comments #Messages
Total messages: 13 (0 generated)
small suggestions... but lgtm https://codereview.chromium.org/14297022/diff/1/content/browser/android/conte... File content/browser/android/content_view_statics.cc (right): https://codereview.chromium.org/14297022/diff/1/content/browser/android/conte... content/browser/android/content_view_statics.cc:26: // ContentViewStatistics. and perhaps add "and not be build on top of WebKit::Platform::SuspendSharedTimer" - ISTM that this whole approach is misusing an interface that is intended for webkit to control the platform, not for the embedder to modify webkit policy, hence the fragility of it. (Writing a test maybe much easier if the two concepts are separated too) https://codereview.chromium.org/14297022/diff/1/content/browser/android/conte... content/browser/android/content_view_statics.cc:31: // suspended as opposed to all the current renderer processes. nit: append why it's important: "because the suspend calls are refcounted within WebKitPlatformSupport and it expects a perfectly matched number of resume calls" https://codereview.chromium.org/14297022/diff/1/content/browser/android/conte... content/browser/android/content_view_statics.cc:71: std::vector<int>* suspended_processes = &g_suspended_processes.Get(); foo = g_suspended_processes.Pointer()
Thanks Jonathan! https://chromiumcodereview.appspot.com/14297022/diff/1/content/browser/androi... File content/browser/android/content_view_statics.cc (right): https://chromiumcodereview.appspot.com/14297022/diff/1/content/browser/androi... content/browser/android/content_view_statics.cc:26: // ContentViewStatistics. On 2013/04/26 17:57:10, joth wrote: > and perhaps add "and not be build on top of > WebKit::Platform::SuspendSharedTimer" > - ISTM that this whole approach is misusing an interface that is intended for > webkit to control the platform, not for the embedder to modify webkit policy, > hence the fragility of it. > (Writing a test maybe much easier if the two concepts are separated too) Very good point. https://chromiumcodereview.appspot.com/14297022/diff/1/content/browser/androi... content/browser/android/content_view_statics.cc:31: // suspended as opposed to all the current renderer processes. On 2013/04/26 17:57:10, joth wrote: > nit: append why it's important: "because the suspend calls are refcounted within > WebKitPlatformSupport and it expects a perfectly matched number of resume calls" Done. https://chromiumcodereview.appspot.com/14297022/diff/1/content/browser/androi... content/browser/android/content_view_statics.cc:71: std::vector<int>* suspended_processes = &g_suspended_processes.Get(); On 2013/04/26 17:57:10, joth wrote: > foo = g_suspended_processes.Pointer() Ah, I always wondered why LazyInstance::Get() was returning a forbidden non-const reference but I see now :)
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/pliard@chromium.org/14297022/8001
Step "update" is always a major failure. Look at the try server FAQ for more details. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=ios_dbg_si...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/pliard@chromium.org/14297022/8001
On 2013/04/26 18:10:53, I haz the power (commit-bot) wrote: > CQ is trying da patch. Follow status at > https://chromium-status.appspot.com/cq/pliard%40chromium.org/14297022/8001 First, thank you so much for looking into this and upload the patch! I think this patch is a bit rushed, and think we could have lived with the problem for a few hours more. I am not sure if all the corner cases are looked into (what about sending 5 pauses and 1 resume? I also think it would be better to handle this in RenderThreadImpl::OnSetWebKitSharedTimersSuspended in content/renderer/render_thread_impl.cc
Sent from my Android On 26 Apr 2013 19:27, <kristianm@chromium.org> wrote: > > On 2013/04/26 18:10:53, I haz the power (commit-bot) wrote: >> >> CQ is trying da patch. Follow status at >> https://chromium-status.appspot.com/cq/pliard%40chromium.org/14297022/8001 > > > First, thank you so much for looking into this and upload the patch! > > I think this patch is a bit rushed, and think we could have lived with the > problem for a few hours more. I am not sure if all the corner cases are looked > into (what about sending 5 pauses and 1 resume? > I think this patch restores the behaviour to what it was prior to the refactoring, so no new bugs will be introduced and we've used that code since m18 I believe. But pretty late here in London/Paris, of you want to test/improve I'm sure patches are welcome :-) > I also think it would be better to handle this in > RenderThreadImpl::OnSetWebKitSharedTimersSuspended in > content/renderer/render_thread_impl.cc > > https://chromiumcodereview.appspot.com/14297022/
On 2013/04/26 18:27:08, Kristian Monsen wrote: > On 2013/04/26 18:10:53, I haz the power (commit-bot) wrote: > > CQ is trying da patch. Follow status at > > https://chromium-status.appspot.com/cq/pliard%2540chromium.org/14297022/8001 > > First, thank you so much for looking into this and upload the patch! > > I think this patch is a bit rushed, and think we could have lived with the > problem for a few hours more. I am not sure if all the corner cases are looked > into (what about sending 5 pauses and 1 resume? > > I also think it would be better to handle this in > RenderThreadImpl::OnSetWebKitSharedTimersSuspended in > content/renderer/render_thread_impl.cc I agree that handling this in OnSetWebKitSharedTimersSuspended() would be a better approach. We mainly wanted to revert the file (that we own by the way) to a well known working state. That being said, I don't think that with this fix there is any risk of calling pause more than resume (or the other way around) unless the client (which we are) is doing something wrong. I agree that this still sounds dangerous though. I would be happy to implement your approach which sounds better but this would not be before Monday since I'm in Europe :) I believe we need this fix pretty soon.
Message was sent while issue was closed.
Committed patchset #3 manually as r196825 (presubmit successful).
Message was sent while issue was closed.
lgtm (but I'm not at all familiar here..) if I'm understanding you all correctly, then perhaps instead of a "vector" this should be a "set" in order to cope with unbalanced suspend + resume calls? on suspend, if the process is not in the set, append there and suspend. on resume, traverse it all and then clear... again, I have no idea if this is actually possible, or if it's better to do a deeper cleanup as suggested above...
Message was sent while issue was closed.
On 2013/04/29 13:26:15, bulach wrote: > lgtm (but I'm not at all familiar here..) > > if I'm understanding you all correctly, then perhaps instead of a "vector" this > should be a "set" in order to cope with unbalanced suspend + resume calls? > > on suspend, if the process is not in the set, append there and suspend. > on resume, traverse it all and then clear... > > again, I have no idea if this is actually possible, or if it's better to do a > deeper cleanup as suggested above... Thanks Marcus. I will indeed do a deeper cleanup. Note that the vector here should be "safe" (I believe) since it is empty on suspend and two processes cannot have the same PID at the same time. So I believe there is no risk of suspending the same process twice. There might be another risk though (which will be addressed by the deeper cleanup): if a suspended renderer process gets killed (e.g. while Chrome is in background) and if a new renderer process gets created after that on resume with the same PID (which would be unfortunate :) then we would end up calling Resume() on this newly created renderer process which would cause exactly the same bug.
yeah I was wondering about set<> too, but then realized if anything did result in an erroneous double suspend then vector<> has the benefit of ensuring we'd send a double resume too (meaning the ref-counting will all balance out). (and, vector is what it used prior to the bug being introduced) But yeah... this is ripe for some cleanup to make it a bit more understandable! On 29 April 2013 06:35, <pliard@chromium.org> wrote: > On 2013/04/29 13:26:15, bulach wrote: > >> lgtm (but I'm not at all familiar here..) >> > > if I'm understanding you all correctly, then perhaps instead of a "vector" >> > this > >> should be a "set" in order to cope with unbalanced suspend + resume calls? >> > > on suspend, if the process is not in the set, append there and suspend. >> on resume, traverse it all and then clear... >> > > again, I have no idea if this is actually possible, or if it's better to >> do a >> deeper cleanup as suggested above... >> > > Thanks Marcus. > > I will indeed do a deeper cleanup. Note that the vector here should be > "safe" (I > believe) since it is empty on suspend and two processes cannot have the > same PID > at the same time. So I believe there is no risk of suspending the same > process > twice. There might be another risk though (which will be addressed by the > deeper > cleanup): if a suspended renderer process gets killed (e.g. while Chrome > is in > background) and if a new renderer process gets created after that on > resume > with the same PID (which would be unfortunate :) then we would end up > calling > Resume() on this newly created renderer process which would cause exactly > the > same bug. > > https://codereview.chromium.**org/14297022/<https://codereview.chromium.org/1... > |