|
|
Created:
5 years, 8 months ago by no sievers Modified:
5 years, 8 months ago CC:
chromium-reviews, darin-cc_chromium.org, jam Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionFix deadlock in ChildProcessService.onDestroy()
onCreate() spawns the CrMain thread. onDestroy() can happen quickly
after onCreate() (for example if the browser crashes),
so we have to be careful about blocking on any semaphores
wrt the CrMain thread.
The existing cmdline check did not cover all of that.
In addition we have to avoid deadlocks between
nativeShutdownMainThread()and us actually not having ever
entered native main().
(ChildThreadImpl::Shutdown() on Android waits on a
condition variable.)
BUG=448549, 457406, 457914
NOTRY=True
Committed: https://crrev.com/b87d764ee43fb0f2f555f6519ced1e4fa8009518
Cr-Commit-Position: refs/heads/master@{#323567}
Patch Set 1 #
Total comments: 9
Patch Set 2 : don't throw runtime exception #Patch Set 3 : newline #Patch Set 4 : suppress findbugs #Messages
Total messages: 29 (13 generated)
sievers@chromium.org changed reviewers: + jdduke@chromium.org
ptal
Semaphore changes lgtm, though I may not be the best reviewer for this. +cc michaelbai. https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... File content/public/android/java/src/org/chromium/content/app/ChildProcessService.java (right): https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... content/public/android/java/src/org/chromium/content/app/ChildProcessService.java:129: throw new RuntimeException("Illegal child process reuse."); Under what circumstances does this happen? Is this only for bad actors, i.e., will throwing an exception allow a well-behaved browser process to find a "good" child process? https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... content/public/android/java/src/org/chromium/content/app/ChildProcessService.java:241: System.exit(0); Pardon my ignorance, is the explicit System.exit to force kill the process? https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... content/public/android/java/src/org/chromium/content/app/ChildProcessService.java:257: nativeShutdownMainThread(); So this should only get called if we've previously called ContentMain.start()?
michaelbai@chromium.org changed reviewers: + michaelbai@chromium.org
https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... File content/public/android/java/src/org/chromium/content/app/ChildProcessService.java (right): https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... content/public/android/java/src/org/chromium/content/app/ChildProcessService.java:129: throw new RuntimeException("Illegal child process reuse."); Did you suspect new service is created in the same Java VM? Does it only happen to GPU process? Will you mind to add some comments?
https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... File content/public/android/java/src/org/chromium/content/app/ChildProcessService.java (right): https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... content/public/android/java/src/org/chromium/content/app/ChildProcessService.java:129: throw new RuntimeException("Illegal child process reuse."); On 2015/04/02 17:29:47, michaelbai wrote: > Did you suspect new service is created in the same Java VM? Does it only happen > to GPU process? Will you mind to add some comments? I don't think there is any clear API or SDK contract on when a service gets relaunched (vs. reused). So I think it can happen for either, but looks like the Android internals are such that it happens more likely for non-sandboxed services. Throwing an exception (vs. logging an error) will give us more visibility for how frequent this is. https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... content/public/android/java/src/org/chromium/content/app/ChildProcessService.java:129: throw new RuntimeException("Illegal child process reuse."); On 2015/04/02 15:13:41, jdduke wrote: > Under what circumstances does this happen? Is this only for bad actors, i.e., > will throwing an exception allow a well-behaved browser process to find a "good" > child process? A crash would cause a sad tab for renderers, automatic recovery for GPU, and not sure for utility (it probably depends on the creator of the task, i.e. sandboxed image decode). I did see a bug though in ChildProcessLauncher, where if it crashes too early it would not notify the native side about the (failed) completion of the process launch task. So will have to deal with that regardless. https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... content/public/android/java/src/org/chromium/content/app/ChildProcessService.java:241: System.exit(0); On 2015/04/02 15:13:41, jdduke wrote: > Pardon my ignorance, is the explicit System.exit to force kill the process? Yes, I'm only adding it here for now because it's specifically likely that we reuse the GPU process if we hit this corner case. The sequence is: - GPU process crashes repeatedly (happens on pre-K devices that have an Android bug and cannot reattach to a Surface from a new process, which causes it to crash again) - the browser hits the successive crash limit - the browser compositor does not request a new context/process (because of the limit), but it's possible for a renderer to react to 'lost context' and quickly go to the browser causing it to spin up a new GPU process - this last process never gets fully established before the browser crashes and we come here Before this patch this leads to the deadlock (explained below). So this is the main issue fixed here. Crashing the browser seems to however always leave the stale process around, so we might reuse it when we relaunch the browser. https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... content/public/android/java/src/org/chromium/content/app/ChildProcessService.java:257: nativeShutdownMainThread(); On 2015/04/02 15:13:41, jdduke wrote: > So this should only get called if we've previously called ContentMain.start()? Yes, there are multiple deadlocks possible. Besides the ones that can happen if we wait on something to finish in onDestroy (like mLibraryInitialized) which might not happen if the browser went away, not going into the main() routine here deals with an Android-specific hack in child_thread_impl.cc: ChildThreadImpl::ShutdownThread() always waits for the child thread to spin up. So this semaphore also guarantees that we only call nativeShutdownMainThread if we actually entered native main.
https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... File content/public/android/java/src/org/chromium/content/app/ChildProcessService.java (right): https://codereview.chromium.org/1052133002/diff/1/content/public/android/java... content/public/android/java/src/org/chromium/content/app/ChildProcessService.java:129: throw new RuntimeException("Illegal child process reuse."); On 2015/04/02 17:50:47, sievers wrote: > On 2015/04/02 17:29:47, michaelbai wrote: > > Did you suspect new service is created in the same Java VM? Does it only > happen > > to GPU process? Will you mind to add some comments? > > I don't think there is any clear API or SDK contract on when a service gets > relaunched (vs. reused). So I think it can happen for either, but looks like the > Android internals are such that it happens more likely for non-sandboxed > services. > > Throwing an exception (vs. logging an error) will give us more visibility for > how frequent this is. Before your change, it seems render process will work fine even if the JVM is reused, will your change make unnecessary renderer crash?
> ht> Before your change, it seems render process will work fine even if the JVM is > reused, will your change make unnecessary renderer crash? I don't know how you can say that with confidence :) We should never have silently allowed that for Android. It breaks the basic assumptions of the security model. And as for as functionality goes: I mean... maybe it works, maybe not. It's not something that any other platform (as in Chrome-port) does. IMHO everything is built around the assumption that a ChildProcess/Thread gets a fresh process. For example, think about leaking statics. So if you wanted to support that behavior we should have made that official, and come up with how to embed that in the semantics and testing.
On 2015/04/02 18:31:44, sievers wrote: > > ht> Before your change, it seems render process will work fine even if the JVM > is > > reused, will your change make unnecessary renderer crash? > > I don't know how you can say that with confidence :) > > We should never have silently allowed that for Android. It breaks the basic > assumptions of the security model. > > And as for as functionality goes: > I mean... maybe it works, maybe not. It's not something that any other platform > (as in Chrome-port) does. > IMHO everything is built around the assumption that a ChildProcess/Thread gets a > fresh process. > For example, think about leaking statics. So if you wanted to support that > behavior we should have made that official, and come up with how to embed that > in the semantics and testing. I don't have such confidence :), can I assume Android will reuse the JVM ? If so, since chrome are not supposed to work on this, shall we just exit JVM instead of throwing exception?
On 2015/04/02 18:46:47, michaelbai wrote: > On 2015/04/02 18:31:44, sievers wrote: > > > ht> Before your change, it seems render process will work fine even if the > JVM > > is > > > reused, will your change make unnecessary renderer crash? > > > > I don't know how you can say that with confidence :) > > > > We should never have silently allowed that for Android. It breaks the basic > > assumptions of the security model. > > > > And as for as functionality goes: > > I mean... maybe it works, maybe not. It's not something that any other > platform > > (as in Chrome-port) does. > > IMHO everything is built around the assumption that a ChildProcess/Thread gets > a > > fresh process. > > For example, think about leaking statics. So if you wanted to support that > > behavior we should have made that official, and come up with how to embed that > > in the semantics and testing. > > > I don't have such confidence :), can I assume Android will reuse the JVM ? > If so, since chrome are not supposed to work on this, shall we just exit JVM > instead of throwing exception? Yes, but I wanted to do it from onDestroy() then (rather than from onCreate()). There is still another thread going on about if this is really the right thing to do, so I only added it for the corner case in onDestroy() for now (with a TODO). I mainly added the exception to get visibility of when this happens, given that the assumption so far was that 'it should not happen'.
On 2015/04/02 19:05:46, sievers wrote: > On 2015/04/02 18:46:47, michaelbai wrote: > > On 2015/04/02 18:31:44, sievers wrote: > > > > ht> Before your change, it seems render process will work fine even if the > > JVM > > > is > > > > reused, will your change make unnecessary renderer crash? > > > > > > I don't know how you can say that with confidence :) > > > > > > We should never have silently allowed that for Android. It breaks the basic > > > assumptions of the security model. > > > > > > And as for as functionality goes: > > > I mean... maybe it works, maybe not. It's not something that any other > > platform > > > (as in Chrome-port) does. > > > IMHO everything is built around the assumption that a ChildProcess/Thread > gets > > a > > > fresh process. > > > For example, think about leaking statics. So if you wanted to support that > > > behavior we should have made that official, and come up with how to embed > that > > > in the semantics and testing. > > > > > > I don't have such confidence :), can I assume Android will reuse the JVM ? > > If so, since chrome are not supposed to work on this, shall we just exit JVM > > instead of throwing exception? > > Yes, but I wanted to do it from onDestroy() then (rather than from onCreate()). > > There is still another thread going on about if this is really the right thing > to do, so I only added it for the corner case in onDestroy() for now (with a > TODO). > > I mainly added the exception to get visibility of when this happens, given that > the assumption so far was that 'it should not happen'. OK, Please add comments for throw exception, then lgtm though I think it is not necessary to throw exception.
On 2015/04/02 19:13:09, michaelbai wrote: > On 2015/04/02 19:05:46, sievers wrote: > > On 2015/04/02 18:46:47, michaelbai wrote: > > > On 2015/04/02 18:31:44, sievers wrote: > > > > > ht> Before your change, it seems render process will work fine even if > the > > > JVM > > > > is > > > > > reused, will your change make unnecessary renderer crash? > > > > > > > > I don't know how you can say that with confidence :) > > > > > > > > We should never have silently allowed that for Android. It breaks the > basic > > > > assumptions of the security model. > > > > > > > > And as for as functionality goes: > > > > I mean... maybe it works, maybe not. It's not something that any other > > > platform > > > > (as in Chrome-port) does. > > > > IMHO everything is built around the assumption that a ChildProcess/Thread > > gets > > > a > > > > fresh process. > > > > For example, think about leaking statics. So if you wanted to support that > > > > behavior we should have made that official, and come up with how to embed > > that > > > > in the semantics and testing. > > > > > > > > > I don't have such confidence :), can I assume Android will reuse the JVM ? > > > If so, since chrome are not supposed to work on this, shall we just exit JVM > > > instead of throwing exception? > > > > Yes, but I wanted to do it from onDestroy() then (rather than from > onCreate()). > > > > There is still another thread going on about if this is really the right thing > > to do, so I only added it for the corner case in onDestroy() for now (with a > > TODO). > > > > I mainly added the exception to get visibility of when this happens, given > that > > the assumption so far was that 'it should not happen'. > > > OK, Please add comments for throw exception, then lgtm though I think it is not > necessary to throw exception. Actually let me pull out the change to throw the exception into a separate patch. That'll also be better if we want to cherry-pick the deadlock fix.
The CQ bit was checked by sievers@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from michaelbai@chromium.org, jdduke@chromium.org Link to the patchset: https://codereview.chromium.org/1052133002/#ps40001 (title: "newline")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1052133002/40001
The CQ bit was unchecked by sievers@chromium.org
The CQ bit was checked by sievers@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1052133002/40001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: android_clang_dbg_recipe on tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/android_clang_d...)
The CQ bit was checked by sievers@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from michaelbai@chromium.org, jdduke@chromium.org Link to the patchset: https://codereview.chromium.org/1052133002/#ps60001 (title: "suppress findbugs")
The CQ bit was unchecked by sievers@chromium.org
The CQ bit was checked by sievers@chromium.org
The CQ bit was unchecked by sievers@chromium.org
The CQ bit was checked by sievers@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1052133002/60001
Message was sent while issue was closed.
Committed patchset #4 (id:60001)
Message was sent while issue was closed.
Patchset 4 (id:??) landed as https://crrev.com/b87d764ee43fb0f2f555f6519ced1e4fa8009518 Cr-Commit-Position: refs/heads/master@{#323567} |