|
|
Created:
4 years, 3 months ago by haraken Modified:
3 years, 10 months ago Reviewers:
Primiano Tucci (use gerrit), jochen (gone - plz use gerrit), Torne, brucedawson, jam, dcheng CC:
chromium-reviews, darin-cc_chromium.org, mlamouri+watch-content_chromium.org Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionRemove RenderThreadImpl::Shutdown
RenderThreadImpl::Shutdown has been shutting down Blink and V8 gracefully,
but the graceful shutdown has caused tons of use-after-free bugs
(and many engineers has spent lots of time fixing ordering issues around the shutdown).
As discussed in chromium-dev@ (https://groups.google.com/a/chromium.org/d/topic/chromium-dev/OLS4JSZvowI/discussion) and
platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc12k91NTFk/discussion),
there is no reason we have to shut down the renderer gracefully.
It's just causing use-after-free bugs and wasting performance.
Hence this CL removes RenderThreadImpl::Shutdown and instead calls _exit(0) in ~ChildProcess.
We need a couple of special handling in a single-process mode. See the comment in ~ChildProcess
for more details.
BUG=639244
Review-Url: https://codereview.chromium.org/2309153002
Cr-Commit-Position: refs/heads/master@{#443175}
Committed: https://chromium.googlesource.com/chromium/src/+/bbfdd9f0669c9856883ffbf2cd9909e2a4df9dcf
Patch Set 1 #Patch Set 2 : temp #Patch Set 3 : temp #Patch Set 4 : temp #Patch Set 5 : temp #
Total comments: 1
Patch Set 6 : temp #Patch Set 7 : temp #Patch Set 8 : temp #Patch Set 9 : temp #Patch Set 10 : temp #Patch Set 11 : temp #Patch Set 12 : temp #Patch Set 13 : temp #Patch Set 14 : temp #Patch Set 15 : temp #Patch Set 16 : temp #Patch Set 17 : temp #Patch Set 18 : temp #Patch Set 19 : temp #Patch Set 20 : temp #Patch Set 21 : temp #Patch Set 22 : temp #Patch Set 23 : temp #Patch Set 24 : temp #Patch Set 25 : temp #Patch Set 26 : temp #Patch Set 27 : temp #Patch Set 28 : temp #Patch Set 29 : temp #Patch Set 30 : temp #Patch Set 31 : temp #Patch Set 32 : temp #Patch Set 33 : temp #Patch Set 34 : temp #Patch Set 35 : temp #Patch Set 36 : temp #Patch Set 37 : temp #Patch Set 38 : temp #Patch Set 39 : temp #Patch Set 40 : temp #Patch Set 41 : temp #Patch Set 42 : temp #Patch Set 43 : temp #Patch Set 44 : temp #Patch Set 45 : temp #Patch Set 46 : temp #Patch Set 47 : temp #Patch Set 48 : temp #Patch Set 49 : temp #Patch Set 50 : temp #Patch Set 51 : temp #Patch Set 52 : temp #Patch Set 53 : temp #Patch Set 54 : temp #Patch Set 55 : temp #Patch Set 56 : temp #Patch Set 57 : temp #Patch Set 58 : temp #Patch Set 59 : temp #Patch Set 60 : temp #
Total comments: 8
Patch Set 61 : temp #Patch Set 62 : temp #
Total comments: 2
Patch Set 63 : temp #
Total comments: 1
Patch Set 64 : temp #
Total comments: 1
Created: 3 years, 11 months ago
Messages
Total messages: 311 (235 generated)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_clang_dbg_recipe on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_clan...) chromeos_x86-generic_chromium_compile_only_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromeos_x86-ge...) chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...) mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
haraken@chromium.org changed reviewers: + primiano@chromium.org
primiano: FYI Only SingleProcessMemoryTracingTests that use WillOnce or WillRepeatedly are failing. It looks like that it's failing because TearDown() is not called, but I'm not sure why it's not called.
On 2016/09/08 09:02:44, haraken wrote: > primiano: FYI > > Only SingleProcessMemoryTracingTests that use WillOnce or WillRepeatedly are > failing. It looks like that it's failing because TearDown() is not called, but > I'm not sure why it's not called. Ok I think I know what the problem is: these are a --single-process tests, where the renderer and the browser live in the same process. What's happening here is that the test starts with the "browswer-side" main thread where all the expectations are set up and are supposed to be destroyed there. In single process mode, the render thread (which IIRC typically is the main thread of the renderer process) is created as a separate thread, via InProcessRendererThread. At some point InProcessRendererThread is destroyed, but too early w.r.t. the browser harness. That destruction stops the messageloop of the main thread, which reaches its base::ThreadMain() end, which causes the sequence: (epilogue of)base::ThreadMain() -> ~RenderProcessImpl dtor -> ~RenderProcessImpl dtor -> ~ChildProcess dtor -> RenderThreadImpl::Shutdown() -> exit() All this happens while the browser harness is in the middle of its shutdown: ShellBrowserMain() -> content::BrowserMainRunnerImpl::Shutdown() -> content::BrowserMainLoop::ShutdownThreadsAndCleanUp() -> content::RenderProcessHostImpl::ShutDownInProcessRenderer() -> (renderer thread... > exit() ) So effectively calling exit(0) here causes the browser-renderer pair (which live in the same process) to die into an intermediate state, where the browser-side hardness is not torn down, which gtest doesn't like. This is why you never see the call to TearDown(). I imagine that would have happened as part (or after) ShutdownThreadsAndCleanUp() if the exit() wasn't there. This should cause issues with all --single-process tests. The reality is that there aren't a lot of --single-process tests. The reason why this test was added was to cover webview cases. Isn't exit() on shutdown going to be too aggressive for single-process? P.S: loosk like other tests are failing on linux_android_rel_ng
haraken@chromium.org changed reviewers: + torne@chromium.org
On 2016/09/08 11:16:36, Primiano Tucci wrote: > On 2016/09/08 09:02:44, haraken wrote: > > primiano: FYI > > > > Only SingleProcessMemoryTracingTests that use WillOnce or WillRepeatedly are > > failing. It looks like that it's failing because TearDown() is not called, but > > I'm not sure why it's not called. > > Ok I think I know what the problem is: these are a --single-process tests, where > the renderer and the browser live in the same process. > What's happening here is that the test starts with the "browswer-side" main > thread where all the expectations are set up and are supposed to be destroyed > there. > In single process mode, the render thread (which IIRC typically is the main > thread of the renderer process) is created as a separate thread, via > InProcessRendererThread. > At some point InProcessRendererThread is destroyed, but too early w.r.t. the > browser harness. That destruction stops the messageloop of the main thread, > which reaches its base::ThreadMain() end, which causes the sequence: > (epilogue of)base::ThreadMain() -> ~RenderProcessImpl dtor -> ~RenderProcessImpl > dtor -> ~ChildProcess dtor -> RenderThreadImpl::Shutdown() -> exit() > > All this happens while the browser harness is in the middle of its shutdown: > ShellBrowserMain() -> content::BrowserMainRunnerImpl::Shutdown() -> > content::BrowserMainLoop::ShutdownThreadsAndCleanUp() -> > content::RenderProcessHostImpl::ShutDownInProcessRenderer() > -> (renderer thread... > exit() ) > > So effectively calling exit(0) here causes the browser-renderer pair (which live > in the same process) to die into an intermediate state, where the browser-side > hardness is not torn down, which gtest doesn't like. This is why you never see > the call to TearDown(). I imagine that would have happened as part (or after) > ShutdownThreadsAndCleanUp() if the exit() wasn't there. > > This should cause issues with all --single-process tests. The reality is that > there aren't a lot of --single-process tests. > The reason why this test was added was to cover webview cases. Isn't exit() on > shutdown going to be too aggressive for single-process? > > P.S: loosk like other tests are failing on linux_android_rel_ng Thank you very much primiano@ for the great hint! Now I understand what's going on there. torne@: Do you have any idea on how we should call exit(0) in RenderThreadImpl::Shutdown()? Background: At first I tried to just stop calling blink::shutdown in RenderThreadImpl::Shutdown(), but it didn't work because the following scenario can happen: 1) blink::shutdown is not called. Workers are still running. 2) The message loop and RenderThreadImpl are destructed. 3) The workers access the message loop or RenderThreadImpl and causes crash. To avoid that from happening, we need to forcibly kill the process in RenderThreadImpl::Shutdown().
On 2016/09/08 13:24:53, haraken wrote: > On 2016/09/08 11:16:36, Primiano Tucci wrote: > > On 2016/09/08 09:02:44, haraken wrote: > > > primiano: FYI > > > > > > Only SingleProcessMemoryTracingTests that use WillOnce or WillRepeatedly are > > > failing. It looks like that it's failing because TearDown() is not called, > but > > > I'm not sure why it's not called. > > > > Ok I think I know what the problem is: these are a --single-process tests, > where > > the renderer and the browser live in the same process. > > What's happening here is that the test starts with the "browswer-side" main > > thread where all the expectations are set up and are supposed to be destroyed > > there. > > In single process mode, the render thread (which IIRC typically is the main > > thread of the renderer process) is created as a separate thread, via > > InProcessRendererThread. > > At some point InProcessRendererThread is destroyed, but too early w.r.t. the > > browser harness. That destruction stops the messageloop of the main thread, > > which reaches its base::ThreadMain() end, which causes the sequence: > > (epilogue of)base::ThreadMain() -> ~RenderProcessImpl dtor -> > ~RenderProcessImpl > > dtor -> ~ChildProcess dtor -> RenderThreadImpl::Shutdown() -> exit() > > > > All this happens while the browser harness is in the middle of its shutdown: > > ShellBrowserMain() -> content::BrowserMainRunnerImpl::Shutdown() -> > > content::BrowserMainLoop::ShutdownThreadsAndCleanUp() -> > > content::RenderProcessHostImpl::ShutDownInProcessRenderer() > > -> (renderer thread... > exit() ) > > > > So effectively calling exit(0) here causes the browser-renderer pair (which > live > > in the same process) to die into an intermediate state, where the browser-side > > hardness is not torn down, which gtest doesn't like. This is why you never see > > the call to TearDown(). I imagine that would have happened as part (or after) > > ShutdownThreadsAndCleanUp() if the exit() wasn't there. > > > > This should cause issues with all --single-process tests. The reality is that > > there aren't a lot of --single-process tests. > > The reason why this test was added was to cover webview cases. Isn't exit() on > > shutdown going to be too aggressive for single-process? > > > > P.S: loosk like other tests are failing on linux_android_rel_ng > > Thank you very much primiano@ for the great hint! Now I understand what's going > on there. > > torne@: Do you have any idea on how we should call exit(0) in > RenderThreadImpl::Shutdown()? > > Background: At first I tried to just stop calling blink::shutdown in > RenderThreadImpl::Shutdown(), but it didn't work because the following scenario > can happen: > > 1) blink::shutdown is not called. Workers are still running. > 2) The message loop and RenderThreadImpl are destructed. > 3) The workers access the message loop or RenderThreadImpl and causes crash. > > To avoid that from happening, we need to forcibly kill the process in > RenderThreadImpl::Shutdown(). I'm pretty sure you simply cannot do this in single-process mode. The browser isn't ready to exit yet, and exiting it early is likely to cause tests to not work properly.
https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... File content/renderer/render_thread_impl.cc (right): https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... content/renderer/render_thread_impl.cc:865: exit(0); One more comment while I think of it: single-process mode aside, you almost certainly want _exit() here, not exit() (which will trigger atexit processing).
On 2016/09/13 15:08:08, Torne wrote: > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > File content/renderer/render_thread_impl.cc (right): > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > content/renderer/render_thread_impl.cc:865: exit(0); > One more comment while I think of it: single-process mode aside, you almost > certainly want _exit() here, not exit() (which will trigger atexit processing). If we call _exit(), will it solve the problem you mentioned in #26? Even if we call _exit(), it will kill the browser before it's ready to exit...
On 2016/09/15 12:42:40, haraken wrote: > On 2016/09/13 15:08:08, Torne wrote: > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > File content/renderer/render_thread_impl.cc (right): > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > content/renderer/render_thread_impl.cc:865: exit(0); > > One more comment while I think of it: single-process mode aside, you almost > > certainly want _exit() here, not exit() (which will trigger atexit > processing). > > If we call _exit(), will it solve the problem you mentioned in #26? Even if we > call _exit(), it will kill the browser before it's ready to exit... I meant that even in multiprocess mode this CL is not really right and should call _exit() instead. This will not help single process mode either way, though.
On 2016/09/15 13:11:32, Torne wrote: > On 2016/09/15 12:42:40, haraken wrote: > > On 2016/09/13 15:08:08, Torne wrote: > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > File content/renderer/render_thread_impl.cc (right): > > > > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > content/renderer/render_thread_impl.cc:865: exit(0); > > > One more comment while I think of it: single-process mode aside, you almost > > > certainly want _exit() here, not exit() (which will trigger atexit > > processing). > > > > If we call _exit(), will it solve the problem you mentioned in #26? Even if we > > call _exit(), it will kill the browser before it's ready to exit... > > I meant that even in multiprocess mode this CL is not really right and should > call _exit() instead. This will not help single process mode either way, though. Thanks, makes sense. Just to confirm: What happens if we simply stop destructing RenderThreadImpl? (i.e., what happens if we simply leak RenderThreadImpl?) I guess it will be problematic because the single-process mode creates multiple RenderThreadImpls in one process over time, but I want to check if I'm understanding things correctly.
On 2016/09/15 14:03:30, haraken wrote: > On 2016/09/15 13:11:32, Torne wrote: > > On 2016/09/15 12:42:40, haraken wrote: > > > On 2016/09/13 15:08:08, Torne wrote: > > > > > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > > File content/renderer/render_thread_impl.cc (right): > > > > > > > > > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > > content/renderer/render_thread_impl.cc:865: exit(0); > > > > One more comment while I think of it: single-process mode aside, you > almost > > > > certainly want _exit() here, not exit() (which will trigger atexit > > > processing). > > > > > > If we call _exit(), will it solve the problem you mentioned in #26? Even if > we > > > call _exit(), it will kill the browser before it's ready to exit... > > > > I meant that even in multiprocess mode this CL is not really right and should > > call _exit() instead. This will not help single process mode either way, > though. > > Thanks, makes sense. > > Just to confirm: What happens if we simply stop destructing RenderThreadImpl? > (i.e., what happens if we simply leak RenderThreadImpl?) I guess it will be > problematic because the single-process mode creates multiple RenderThreadImpls > in one process over time, but I want to check if I'm understanding things > correctly. There should not be cases where single-process mode creates multiple RenderThreadImpls over time. The in-process render process/thread are never supposed to exit, and it's not safe for them to be destroyed and recreated (because blink::shutdown() being followed by blink:initialize() is forbidden). This does seem to actually be happening in practise in webview, but this is a bug and is causing assertion fail crashes (crbug.com/514141). I'm not sure what impact not destroying RenderThreadImpl will have - you'd have to see what invariants the single-process tests expect to hold true.
On 2016/09/15 14:23:45, Torne wrote: > On 2016/09/15 14:03:30, haraken wrote: > > On 2016/09/15 13:11:32, Torne wrote: > > > On 2016/09/15 12:42:40, haraken wrote: > > > > On 2016/09/13 15:08:08, Torne wrote: > > > > > > > > > > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > > > File content/renderer/render_thread_impl.cc (right): > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > > > content/renderer/render_thread_impl.cc:865: exit(0); > > > > > One more comment while I think of it: single-process mode aside, you > > almost > > > > > certainly want _exit() here, not exit() (which will trigger atexit > > > > processing). > > > > > > > > If we call _exit(), will it solve the problem you mentioned in #26? Even > if > > we > > > > call _exit(), it will kill the browser before it's ready to exit... > > > > > > I meant that even in multiprocess mode this CL is not really right and > should > > > call _exit() instead. This will not help single process mode either way, > > though. > > > > Thanks, makes sense. > > > > Just to confirm: What happens if we simply stop destructing RenderThreadImpl? > > (i.e., what happens if we simply leak RenderThreadImpl?) I guess it will be > > problematic because the single-process mode creates multiple RenderThreadImpls > > in one process over time, but I want to check if I'm understanding things > > correctly. > > There should not be cases where single-process mode creates multiple > RenderThreadImpls over time. The in-process render process/thread are never > supposed to exit Er, rather, are never supposed to exit unless the browser is also exiting, in which case a new one won't be created. > , and it's not safe for them to be destroyed and recreated > (because blink::shutdown() being followed by blink:initialize() is forbidden). > This does seem to actually be happening in practise in webview, but this is a > bug and is causing assertion fail crashes (crbug.com/514141). > > I'm not sure what impact not destroying RenderThreadImpl will have - you'd have > to see what invariants the single-process tests expect to hold true.
On 2016/09/15 14:24:16, Torne wrote: > On 2016/09/15 14:23:45, Torne wrote: > > On 2016/09/15 14:03:30, haraken wrote: > > > On 2016/09/15 13:11:32, Torne wrote: > > > > On 2016/09/15 12:42:40, haraken wrote: > > > > > On 2016/09/13 15:08:08, Torne wrote: > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > > > > File content/renderer/render_thread_impl.cc (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > > > > content/renderer/render_thread_impl.cc:865: exit(0); > > > > > > One more comment while I think of it: single-process mode aside, you > > > almost > > > > > > certainly want _exit() here, not exit() (which will trigger atexit > > > > > processing). > > > > > > > > > > If we call _exit(), will it solve the problem you mentioned in #26? Even > > if > > > we > > > > > call _exit(), it will kill the browser before it's ready to exit... > > > > > > > > I meant that even in multiprocess mode this CL is not really right and > > should > > > > call _exit() instead. This will not help single process mode either way, > > > though. > > > > > > Thanks, makes sense. > > > > > > Just to confirm: What happens if we simply stop destructing > RenderThreadImpl? > > > (i.e., what happens if we simply leak RenderThreadImpl?) I guess it will be > > > problematic because the single-process mode creates multiple > RenderThreadImpls > > > in one process over time, but I want to check if I'm understanding things > > > correctly. > > > > There should not be cases where single-process mode creates multiple > > RenderThreadImpls over time. The in-process render process/thread are never > > supposed to exit > > Er, rather, are never supposed to exit unless the browser is also exiting, in > which case a new one won't be created. > > > , and it's not safe for them to be destroyed and recreated > > (because blink::shutdown() being followed by blink:initialize() is forbidden). > > This does seem to actually be happening in practise in webview, but this is a > > bug and is causing assertion fail crashes (crbug.com/514141). > > > > I'm not sure what impact not destroying RenderThreadImpl will have - you'd > have > > to see what invariants the single-process tests expect to hold true. Then do you think it is an option to simply leak RenderThreadImpl?
On 2016/09/15 15:36:27, haraken wrote: > On 2016/09/15 14:24:16, Torne wrote: > > On 2016/09/15 14:23:45, Torne wrote: > > > On 2016/09/15 14:03:30, haraken wrote: > > > > On 2016/09/15 13:11:32, Torne wrote: > > > > > On 2016/09/15 12:42:40, haraken wrote: > > > > > > On 2016/09/13 15:08:08, Torne wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > > > > > File content/renderer/render_thread_impl.cc (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/2309153002/diff/80001/content/renderer/render... > > > > > > > content/renderer/render_thread_impl.cc:865: exit(0); > > > > > > > One more comment while I think of it: single-process mode aside, you > > > > almost > > > > > > > certainly want _exit() here, not exit() (which will trigger atexit > > > > > > processing). > > > > > > > > > > > > If we call _exit(), will it solve the problem you mentioned in #26? > Even > > > if > > > > we > > > > > > call _exit(), it will kill the browser before it's ready to exit... > > > > > > > > > > I meant that even in multiprocess mode this CL is not really right and > > > should > > > > > call _exit() instead. This will not help single process mode either way, > > > > though. > > > > > > > > Thanks, makes sense. > > > > > > > > Just to confirm: What happens if we simply stop destructing > > RenderThreadImpl? > > > > (i.e., what happens if we simply leak RenderThreadImpl?) I guess it will > be > > > > problematic because the single-process mode creates multiple > > RenderThreadImpls > > > > in one process over time, but I want to check if I'm understanding things > > > > correctly. > > > > > > There should not be cases where single-process mode creates multiple > > > RenderThreadImpls over time. The in-process render process/thread are never > > > supposed to exit > > > > Er, rather, are never supposed to exit unless the browser is also exiting, in > > which case a new one won't be created. > > > > > , and it's not safe for them to be destroyed and recreated > > > (because blink::shutdown() being followed by blink:initialize() is > forbidden). > > > This does seem to actually be happening in practise in webview, but this is > a > > > bug and is causing assertion fail crashes (crbug.com/514141). > > > > > > I'm not sure what impact not destroying RenderThreadImpl will have - you'd > > have > > > to see what invariants the single-process tests expect to hold true. > > Then do you think it is an option to simply leak RenderThreadImpl? If it doesn't break the assumptions of the tests that run in single process mode and actually do shut down the browser cleanly while in single-process, then it's probably okay to try it. It might cause weird behaviour when running the full browser in single-process mode on desktop, but that's not well-supported anyway. Android doesn't care one way or the other here because the browser never shuts down cleanly, and so the in-process renderer in single-process mode also isn't ever expected to shut down.
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_...)
Description was changed from ========== Call exit at the end of RenderThreadImpl::Shutdown BUG= ========== to ========== Stop shutting down the renderer process (CL description will be added once we have agreed on the strategy.) BUG=639244 ==========
haraken@chromium.org changed reviewers: + dcheng@chromium.org, jam@chromium.org, jochen@chromium.org
jochen@, jam@, dcheng@, torne@: I'm working on removing a graceful shutdown sequence from the renderer process (more context: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). PS10 passes all tests, but what do you think about it? At first, I tried to simply remove blink::shutdown from the shutdown sequence but it broke a lot of things. If we simply remove blink::shutdown, we end up with destructing RenderThreadImpl and MessageLoop without stopping worker threads. Then the worker threads can crash when they post a task to an already-destructed MessageLoop. Next, I tried to forcibly call exit(0) at the beginning of RenderThreadImpl::Shutdown(). However, this didn't work well in a single-process mode (and browser tests) because it forcibly exits the process before the browser side is ready to exit. So, I'm now trying yet another approach in PS10: Don't call RenderThreadImpl::Shutdown() and leak RenderThreadImpl. To make it work, we also need to stop destructing Blink's thread-local storage and Chromium's LazyInstance when OnExit is called. I think this is fine because there is no reason we need to gracefully destruct LazyInstances, but I want to make sure if I'm going in a right direction. If the approach looks fine, I'll land a change to stop destructing LazyInstances when OnExit is called. Any thoughts?
On 2016/10/05 12:12:42, haraken wrote: > So, I'm now trying yet another approach in PS10: Don't call > RenderThreadImpl::Shutdown() and leak RenderThreadImpl. What actually happens when this is done? Does the thread continue running? Can it still accept posted tasks? Will it actually run them? It's really hard to evaluate whether this approach is sensible or not; I don't know enough about the global behaviour of shutdown. However, I do think this change all happening at once is too drastic. > To make it work, we also > need to stop destructing Blink's thread-local storage and Chromium's > LazyInstance when OnExit is called. I think this is fine because there is no > reason we need to gracefully destruct LazyInstances, but I want to make sure if > I'm going in a right direction. I'm not sure that there's no reason why we need to gracefully destruct LazyInstances. There's two variants of LazyInstance explicitly to allow the caller to make this decision. I think we could, reasonably, experiment with changing this behaviour, because it certainly seems *likely* that most LazyInstances could and should be leaky, but that should be a separate experiment by itself first, especially since a global change like you have here also affects the browser process, not just the renderer.
On 2016/10/05 12:12:42, haraken wrote: > jochen@, jam@, dcheng@, torne@: > > I'm working on removing a graceful shutdown sequence from the renderer process > (more context: > https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). > PS10 passes all tests, but what do you think about it? > > At first, I tried to simply remove blink::shutdown from the shutdown sequence > but it broke a lot of things. If we simply remove blink::shutdown, we end up > with destructing RenderThreadImpl and MessageLoop without stopping worker > threads. Then the worker threads can crash when they post a task to an > already-destructed MessageLoop. > > Next, I tried to forcibly call exit(0) at the beginning of > RenderThreadImpl::Shutdown(). However, this didn't work well in a single-process > mode (and browser tests) because it forcibly exits the process before the > browser side is ready to exit. Which patchset is this so I can look some more? i.e. I wonder if in the single-process case we can just leak (since this won't be reached in the only production use of single-process, webview). but then otherwise you just call exit()? > > So, I'm now trying yet another approach in PS10: Don't call > RenderThreadImpl::Shutdown() and leak RenderThreadImpl. To make it work, we also > need to stop destructing Blink's thread-local storage and Chromium's > LazyInstance when OnExit is called. I think this is fine because there is no > reason we need to gracefully destruct LazyInstances, but I want to make sure if > I'm going in a right direction. > > If the approach looks fine, I'll land a change to stop destructing LazyInstances > when OnExit is called. > > Any thoughts?
On 2016/10/05 15:05:00, Torne wrote: > On 2016/10/05 12:12:42, haraken wrote: > > So, I'm now trying yet another approach in PS10: Don't call > > RenderThreadImpl::Shutdown() and leak RenderThreadImpl. > > What actually happens when this is done? Does the thread continue running? Can > it still accept posted tasks? Will it actually run them? > > It's really hard to evaluate whether this approach is sensible or not; I don't > know enough about the global behaviour of shutdown. However, I do think this > change all happening at once is too drastic. > > > To make it work, we also > > need to stop destructing Blink's thread-local storage and Chromium's > > LazyInstance when OnExit is called. I think this is fine because there is no > > reason we need to gracefully destruct LazyInstances, but I want to make sure > if > > I'm going in a right direction. > > I'm not sure that there's no reason why we need to gracefully destruct > LazyInstances. There's two variants of LazyInstance explicitly to allow the > caller to make this decision. I think we could, reasonably, experiment with > changing this behaviour, because it certainly seems *likely* that most > LazyInstances could and should be leaky, but that should be a separate > experiment by itself first, especially since a global change like you have here > also affects the browser process, not just the renderer. I've vaguely heard (but not sure) that non-leaky LazyInstance is for unit testing. I'd support changing this default, but it might be rather involved and easier to do as a separate change.
jam@: On 2016/10/05 17:50:03, jam wrote: > On 2016/10/05 12:12:42, haraken wrote: > > jochen@, jam@, dcheng@, torne@: > > > > I'm working on removing a graceful shutdown sequence from the renderer process > > (more context: > > https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). > > PS10 passes all tests, but what do you think about it? > > > > At first, I tried to simply remove blink::shutdown from the shutdown sequence > > but it broke a lot of things. If we simply remove blink::shutdown, we end up > > with destructing RenderThreadImpl and MessageLoop without stopping worker > > threads. Then the worker threads can crash when they post a task to an > > already-destructed MessageLoop. > > > > Next, I tried to forcibly call exit(0) at the beginning of > > RenderThreadImpl::Shutdown(). However, this didn't work well in a > single-process > > mode (and browser tests) because it forcibly exits the process before the > > browser side is ready to exit. > > Which patchset is this so I can look some more? For example, PS4. > > i.e. I wonder if in the single-process case we can just leak (since this won't > be reached in the only production use of single-process, webview). but then > otherwise you just call exit()? This is doable. I'm just curious but why do you prefer exit(0) than leaking on a multi-process mode? Shutdown overhead?
On 2016/10/05 21:22:08, dcheng wrote: > On 2016/10/05 15:05:00, Torne wrote: > > On 2016/10/05 12:12:42, haraken wrote: > > > So, I'm now trying yet another approach in PS10: Don't call > > > RenderThreadImpl::Shutdown() and leak RenderThreadImpl. > > > > What actually happens when this is done? Does the thread continue running? Can > > it still accept posted tasks? Will it actually run them? > > > > It's really hard to evaluate whether this approach is sensible or not; I don't > > know enough about the global behaviour of shutdown. However, I do think this > > change all happening at once is too drastic. > > > > > To make it work, we also > > > need to stop destructing Blink's thread-local storage and Chromium's > > > LazyInstance when OnExit is called. I think this is fine because there is no > > > reason we need to gracefully destruct LazyInstances, but I want to make sure > > if > > > I'm going in a right direction. > > > > I'm not sure that there's no reason why we need to gracefully destruct > > LazyInstances. There's two variants of LazyInstance explicitly to allow the > > caller to make this decision. I think we could, reasonably, experiment with > > changing this behaviour, because it certainly seems *likely* that most > > LazyInstances could and should be leaky, but that should be a separate > > experiment by itself first, especially since a global change like you have > here > > also affects the browser process, not just the renderer. > > I've vaguely heard (but not sure) that non-leaky LazyInstance is for unit > testing. I'd support changing this default, but it might be rather involved and > easier to do as a separate change. Of course, I'll land the change separately (if this is a sensible direction to go) :) PS10 is passing all tests. I added ANNOTATE_SCOPED_MEMORY_LEAK to LazyInstance::New to make it pass LSan.
On 2016/10/06 01:36:16, haraken wrote: > On 2016/10/05 21:22:08, dcheng wrote: > > On 2016/10/05 15:05:00, Torne wrote: > > > On 2016/10/05 12:12:42, haraken wrote: > > > > So, I'm now trying yet another approach in PS10: Don't call > > > > RenderThreadImpl::Shutdown() and leak RenderThreadImpl. > > > > > > What actually happens when this is done? Does the thread continue running? > Can > > > it still accept posted tasks? Will it actually run them? > > > > > > It's really hard to evaluate whether this approach is sensible or not; I > don't > > > know enough about the global behaviour of shutdown. However, I do think this > > > change all happening at once is too drastic. > > > > > > > To make it work, we also > > > > need to stop destructing Blink's thread-local storage and Chromium's > > > > LazyInstance when OnExit is called. I think this is fine because there is > no > > > > reason we need to gracefully destruct LazyInstances, but I want to make > sure > > > if > > > > I'm going in a right direction. > > > > > > I'm not sure that there's no reason why we need to gracefully destruct > > > LazyInstances. There's two variants of LazyInstance explicitly to allow the > > > caller to make this decision. I think we could, reasonably, experiment with > > > changing this behaviour, because it certainly seems *likely* that most > > > LazyInstances could and should be leaky, but that should be a separate > > > experiment by itself first, especially since a global change like you have > > here > > > also affects the browser process, not just the renderer. > > > > I've vaguely heard (but not sure) that non-leaky LazyInstance is for unit > > testing. I'd support changing this default, but it might be rather involved > and > > easier to do as a separate change. Well, if we want to be sure we don't destroy things during OnExit then we need to actually remove non-leaky LazyInstances, not just change the default. > Of course, I'll land the change separately (if this is a sensible direction to > go) :) I think this should be discussed separately on chromium-dev - ask people if there's any reason why we actually need non-leaky LazyInstance.
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_n5x_swarming_rel on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_n5x_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Sorry I completely missed this reply till now. On 2016/10/06 01:34:39, haraken wrote: > jam@: > > On 2016/10/05 17:50:03, jam wrote: > > On 2016/10/05 12:12:42, haraken wrote: > > > jochen@, jam@, dcheng@, torne@: > > > > > > I'm working on removing a graceful shutdown sequence from the renderer > process > > > (more context: > > > > https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). > > > PS10 passes all tests, but what do you think about it? > > > > > > At first, I tried to simply remove blink::shutdown from the shutdown > sequence > > > but it broke a lot of things. If we simply remove blink::shutdown, we end up > > > with destructing RenderThreadImpl and MessageLoop without stopping worker > > > threads. Then the worker threads can crash when they post a task to an > > > already-destructed MessageLoop. > > > > > > Next, I tried to forcibly call exit(0) at the beginning of > > > RenderThreadImpl::Shutdown(). However, this didn't work well in a > > single-process > > > mode (and browser tests) because it forcibly exits the process before the > > > browser side is ready to exit. > > > > Which patchset is this so I can look some more? > > For example, PS4. doh, unfortunately the try run data is gone. > > > > > i.e. I wonder if in the single-process case we can just leak (since this won't > > be reached in the only production use of single-process, webview). but then > > otherwise you just call exit()? > > This is doable. I'm just curious but why do you prefer exit(0) than leaking on a > multi-process mode? Shutdown overhead? I worry that starting to shut down some but not all the code will lead to awkward shut down semantics in destructors for various objects to deal with this.
On 2016/10/17 15:01:06, jam wrote: > Sorry I completely missed this reply till now. > > On 2016/10/06 01:34:39, haraken wrote: > > jam@: > > > > On 2016/10/05 17:50:03, jam wrote: > > > On 2016/10/05 12:12:42, haraken wrote: > > > > jochen@, jam@, dcheng@, torne@: > > > > > > > > I'm working on removing a graceful shutdown sequence from the renderer > > process > > > > (more context: > > > > > > https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). > > > > PS10 passes all tests, but what do you think about it? > > > > > > > > At first, I tried to simply remove blink::shutdown from the shutdown > > sequence > > > > but it broke a lot of things. If we simply remove blink::shutdown, we end > up > > > > with destructing RenderThreadImpl and MessageLoop without stopping worker > > > > threads. Then the worker threads can crash when they post a task to an > > > > already-destructed MessageLoop. > > > > > > > > Next, I tried to forcibly call exit(0) at the beginning of > > > > RenderThreadImpl::Shutdown(). However, this didn't work well in a > > > single-process > > > > mode (and browser tests) because it forcibly exits the process before the > > > > browser side is ready to exit. > > > > > > Which patchset is this so I can look some more? > > > > For example, PS4. > > doh, unfortunately the try run data is gone. I can upload a new CL but what I did is just to add exit(0) at the beginning of RenderThreadImpl::Shutdown(). It works in a multi-process mode but doesn't work with a single-process mode because it suddenly exists the process before the browser side is ready to exit. Some browser tests were failing. > > > > > > > > > i.e. I wonder if in the single-process case we can just leak (since this > won't > > > be reached in the only production use of single-process, webview). but then > > > otherwise you just call exit()? > > > > This is doable. I'm just curious but why do you prefer exit(0) than leaking on > a > > multi-process mode? Shutdown overhead? > > I worry that starting to shut down some but not all the code will lead to > awkward shut down semantics in destructors for various objects to deal with > this. I'm still (slowly) working on this CL. In PS16, I'm seeing if we really need to make all LazyInstances leaky. I think it's generally a good direction to make all LazyInstances leaky but that wouldn't be a mandatory fix for removing a shutdown sequence -- I guess it would be enough to add ::Leaky to only a couple of LazyInstances. Let me ping you once the CL is ready :)
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
(deferring to John)
As a result of the attempt from PS13 - PS19, I've concluded that there're too many LazyInstances that must be marked as Leaky to remove the shutdown sequence. i.e., we'll need to make LazyInstances Leaky by default. I'll propose the change in chromium-dev@.
On 2016/10/17 15:08:29, haraken wrote: > On 2016/10/17 15:01:06, jam wrote: > > Sorry I completely missed this reply till now. > > > > On 2016/10/06 01:34:39, haraken wrote: > > > jam@: > > > > > > On 2016/10/05 17:50:03, jam wrote: > > > > On 2016/10/05 12:12:42, haraken wrote: > > > > > jochen@, jam@, dcheng@, torne@: > > > > > > > > > > I'm working on removing a graceful shutdown sequence from the renderer > > > process > > > > > (more context: > > > > > > > > > https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). > > > > > PS10 passes all tests, but what do you think about it? > > > > > > > > > > At first, I tried to simply remove blink::shutdown from the shutdown > > > sequence > > > > > but it broke a lot of things. If we simply remove blink::shutdown, we > end > > up > > > > > with destructing RenderThreadImpl and MessageLoop without stopping > worker > > > > > threads. Then the worker threads can crash when they post a task to an > > > > > already-destructed MessageLoop. > > > > > > > > > > Next, I tried to forcibly call exit(0) at the beginning of > > > > > RenderThreadImpl::Shutdown(). However, this didn't work well in a > > > > single-process > > > > > mode (and browser tests) because it forcibly exits the process before > the > > > > > browser side is ready to exit. > > > > > > > > Which patchset is this so I can look some more? > > > > > > For example, PS4. > > > > doh, unfortunately the try run data is gone. > > I can upload a new CL but what I did is just to add exit(0) at the beginning of > RenderThreadImpl::Shutdown(). > > It works in a multi-process mode but doesn't work with a single-process mode > because it suddenly exists the process before the browser side is ready to exit. > Some browser tests were failing. There are only a very small number of browser tests that run with single process right? we can maybe write these in other ways and then it wouldn't be a problem to just exit()? Which patchset has this? and for android_webview, which is the only production use of single process, that never exits so it's not an issue. > > > > > > > > > > > > > > i.e. I wonder if in the single-process case we can just leak (since this > > won't > > > > be reached in the only production use of single-process, webview). but > then > > > > otherwise you just call exit()? > > > > > > This is doable. I'm just curious but why do you prefer exit(0) than leaking > on > > a > > > multi-process mode? Shutdown overhead? > > > > I worry that starting to shut down some but not all the code will lead to > > awkward shut down semantics in destructors for various objects to deal with > > this. > > I'm still (slowly) working on this CL. In PS16, I'm seeing if we really need to > make all LazyInstances leaky. I think it's generally a good direction to make > all LazyInstances leaky but that wouldn't be a mandatory fix for removing a > shutdown sequence -- I guess it would be enough to add ::Leaky to only a couple > of LazyInstances. > > Let me ping you once the CL is ready :)
On 2016/10/26 18:01:04, jam wrote: > On 2016/10/17 15:08:29, haraken wrote: > > On 2016/10/17 15:01:06, jam wrote: > > > Sorry I completely missed this reply till now. > > > > > > On 2016/10/06 01:34:39, haraken wrote: > > > > jam@: > > > > > > > > On 2016/10/05 17:50:03, jam wrote: > > > > > On 2016/10/05 12:12:42, haraken wrote: > > > > > > jochen@, jam@, dcheng@, torne@: > > > > > > > > > > > > I'm working on removing a graceful shutdown sequence from the renderer > > > > process > > > > > > (more context: > > > > > > > > > > > > https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). > > > > > > PS10 passes all tests, but what do you think about it? > > > > > > > > > > > > At first, I tried to simply remove blink::shutdown from the shutdown > > > > sequence > > > > > > but it broke a lot of things. If we simply remove blink::shutdown, we > > end > > > up > > > > > > with destructing RenderThreadImpl and MessageLoop without stopping > > worker > > > > > > threads. Then the worker threads can crash when they post a task to an > > > > > > already-destructed MessageLoop. > > > > > > > > > > > > Next, I tried to forcibly call exit(0) at the beginning of > > > > > > RenderThreadImpl::Shutdown(). However, this didn't work well in a > > > > > single-process > > > > > > mode (and browser tests) because it forcibly exits the process before > > the > > > > > > browser side is ready to exit. > > > > > > > > > > Which patchset is this so I can look some more? > > > > > > > > For example, PS4. > > > > > > doh, unfortunately the try run data is gone. > > > > I can upload a new CL but what I did is just to add exit(0) at the beginning > of > > RenderThreadImpl::Shutdown(). > > > > It works in a multi-process mode but doesn't work with a single-process mode > > because it suddenly exists the process before the browser side is ready to > exit. > > Some browser tests were failing. > > There are only a very small number of browser tests that run with single process > right? we can maybe write these in other ways and then it wouldn't be a problem > to just exit()? > Which patchset has this? The build log was lost, but I remember there were only a couple of browser tests failing. I agree that we should just rewrite them. > and for android_webview, which is the only production use of single process, > that never exits so it's not an issue. Makes sense, but what about a --single-process mode? Is it okay to crash it when the process is going to exit? (Sorry for duplicating the discussion between this thread and chromium-dev@. Let's discuss the details on this thread.)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
I think SingleProcessMemoryTracingTest is the only test running with the single process mode. We can just rewrite it.
On 2016/10/27 13:58:14, haraken wrote: > I think SingleProcessMemoryTracingTest is the only test running with the single > process mode. We can just rewrite it. primiano@: As far as I look at SingleProcessMemoryTracingTest, the test is testing if memory tracing works even in a single process mode. Do we really need those tests?
On 2016/10/27 14:47:54, haraken wrote: > On 2016/10/27 13:58:14, haraken wrote: > > I think SingleProcessMemoryTracingTest is the only test running with the > single > > process mode. We can just rewrite it. > > primiano@: As far as I look at SingleProcessMemoryTracingTest, the test is > testing if memory tracing works even in a single process mode. Do we really need > those tests? That is the only form of test coverage to check that tracing works for webview, which we regressed a lot of times, because nobody ever thinks to single process mode. What is the reason it's crashing? Maybe it's a matter of changing the initialization pattern here?
On 2016/10/27 15:41:10, Primiano Tucci wrote: > On 2016/10/27 14:47:54, haraken wrote: > > On 2016/10/27 13:58:14, haraken wrote: > > > I think SingleProcessMemoryTracingTest is the only test running with the > > single > > > process mode. We can just rewrite it. > > > > primiano@: As far as I look at SingleProcessMemoryTracingTest, the test is > > testing if memory tracing works even in a single process mode. Do we really > need > > those tests? > > That is the only form of test coverage to check that tracing works for webview, > which we regressed a lot of times, because nobody ever thinks to single process > mode. > What is the reason it's crashing? Maybe it's a matter of changing the > initialization pattern here? I'm adding exit(0) at the beginning of RenderThreadImpl::Shutdown(). Then the process exits before the browser process is ready to exit. Specifically, the process exists before TearDown() is called, then some objects leak, and the leak error fails the test: ../../content/browser/tracing/memory_tracing_browsertest.cc:249: ERROR: this mock object (used in test SingleProcessMemoryTracingTest.QueuedDumps) should be deleted but never is. Its address is @0x28603fb5c2d0. ../../content/browser/tracing/memory_tracing_browsertest.cc:253: ERROR: this mock object (used in test SingleProcessMemoryTracingTest.QueuedDumps) should be deleted but never is. Its address is @0x28603fb74ea0. ERROR: 2 leaked mock objects found at program exit. I'm trying if we can fix the leaks by destructing those objects at the end of each test or simply suppress the leaks.
On 2016/10/26 18:28:17, haraken wrote: > On 2016/10/26 18:01:04, jam wrote: > > On 2016/10/17 15:08:29, haraken wrote: > > > On 2016/10/17 15:01:06, jam wrote: > > > > Sorry I completely missed this reply till now. > > > > > > > > On 2016/10/06 01:34:39, haraken wrote: > > > > > jam@: > > > > > > > > > > On 2016/10/05 17:50:03, jam wrote: > > > > > > On 2016/10/05 12:12:42, haraken wrote: > > > > > > > jochen@, jam@, dcheng@, torne@: > > > > > > > > > > > > > > I'm working on removing a graceful shutdown sequence from the > renderer > > > > > process > > > > > > > (more context: > > > > > > > > > > > > > > > > https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). > > > > > > > PS10 passes all tests, but what do you think about it? > > > > > > > > > > > > > > At first, I tried to simply remove blink::shutdown from the shutdown > > > > > sequence > > > > > > > but it broke a lot of things. If we simply remove blink::shutdown, > we > > > end > > > > up > > > > > > > with destructing RenderThreadImpl and MessageLoop without stopping > > > worker > > > > > > > threads. Then the worker threads can crash when they post a task to > an > > > > > > > already-destructed MessageLoop. > > > > > > > > > > > > > > Next, I tried to forcibly call exit(0) at the beginning of > > > > > > > RenderThreadImpl::Shutdown(). However, this didn't work well in a > > > > > > single-process > > > > > > > mode (and browser tests) because it forcibly exits the process > before > > > the > > > > > > > browser side is ready to exit. > > > > > > > > > > > > Which patchset is this so I can look some more? > > > > > > > > > > For example, PS4. > > > > > > > > doh, unfortunately the try run data is gone. > > > > > > I can upload a new CL but what I did is just to add exit(0) at the beginning > > of > > > RenderThreadImpl::Shutdown(). > > > > > > It works in a multi-process mode but doesn't work with a single-process mode > > > because it suddenly exists the process before the browser side is ready to > > exit. > > > Some browser tests were failing. > > > > There are only a very small number of browser tests that run with single > process > > right? we can maybe write these in other ways and then it wouldn't be a > problem > > to just exit()? > > Which patchset has this? > > The build log was lost, but I remember there were only a couple of browser tests > failing. I agree that we should just rewrite them. > > > and for android_webview, which is the only production use of single process, > > that never exits so it's not an issue. > > Makes sense, but what about a --single-process mode? Is it okay to crash it when > the process is going to exit? Right, this was never officially supported so it always had regressions. At least for the last few months, it's been broken in chrome. i.e. this is the crash that I get on chrome shutdown in single process mode: http://pastebin.com/tYJLB33q . Given the tradeoff of simplifying production renderer shutdown with your proposal, vs adding more complexity to support single process shutdown, I think we shouldn't start supporting the latter. btw one option could be that when running in --single-process mode, we exit() the entire process at shutdown. this would always give the "unclean shutdown" warning but that seems ok. > > (Sorry for duplicating the discussion between this thread and chromium-dev@. > Let's discuss the details on this thread.) On 2016/10/27 13:58:14, haraken wrote: > I think SingleProcessMemoryTracingTest is the only test running with the single > process mode. We can just rewrite it. There's a bunch more, mostly in content/renderer (and one in content/child): https://cs.chromium.org/search/?q=ksingleprocess+file:browsertest&sq=package:... I'm responsible for adding these, when I converted them as part of removing the old ui_test harness. These integration tests want to poke at code in the renderer, so they were converted to run in single process mode and then call PostTaskToInProcessRendererAndWait to execute the test body on the renderer thread. One way to convert them would be to move the test body to a mojo interface that's implemented in the renderer. The test code in the browser process can just make (async) mojo IPC calls to it and set a callback that would quit a nested message loop. see this test interface content/public/test/test_mojo_service.mojom which is implemented in content/shell/renderer/shell_content_renderer_client.cc
On 2016/10/27 17:01:03, jam wrote: > On 2016/10/26 18:28:17, haraken wrote: > > On 2016/10/26 18:01:04, jam wrote: > > > On 2016/10/17 15:08:29, haraken wrote: > > > > On 2016/10/17 15:01:06, jam wrote: > > > > > Sorry I completely missed this reply till now. > > > > > > > > > > On 2016/10/06 01:34:39, haraken wrote: > > > > > > jam@: > > > > > > > > > > > > On 2016/10/05 17:50:03, jam wrote: > > > > > > > On 2016/10/05 12:12:42, haraken wrote: > > > > > > > > jochen@, jam@, dcheng@, torne@: > > > > > > > > > > > > > > > > I'm working on removing a graceful shutdown sequence from the > > renderer > > > > > > process > > > > > > > > (more context: > > > > > > > > > > > > > > > > > > > > https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). > > > > > > > > PS10 passes all tests, but what do you think about it? > > > > > > > > > > > > > > > > At first, I tried to simply remove blink::shutdown from the > shutdown > > > > > > sequence > > > > > > > > but it broke a lot of things. If we simply remove blink::shutdown, > > we > > > > end > > > > > up > > > > > > > > with destructing RenderThreadImpl and MessageLoop without stopping > > > > worker > > > > > > > > threads. Then the worker threads can crash when they post a task > to > > an > > > > > > > > already-destructed MessageLoop. > > > > > > > > > > > > > > > > Next, I tried to forcibly call exit(0) at the beginning of > > > > > > > > RenderThreadImpl::Shutdown(). However, this didn't work well in a > > > > > > > single-process > > > > > > > > mode (and browser tests) because it forcibly exits the process > > before > > > > the > > > > > > > > browser side is ready to exit. > > > > > > > > > > > > > > Which patchset is this so I can look some more? > > > > > > > > > > > > For example, PS4. > > > > > > > > > > doh, unfortunately the try run data is gone. > > > > > > > > I can upload a new CL but what I did is just to add exit(0) at the > beginning > > > of > > > > RenderThreadImpl::Shutdown(). > > > > > > > > It works in a multi-process mode but doesn't work with a single-process > mode > > > > because it suddenly exists the process before the browser side is ready to > > > exit. > > > > Some browser tests were failing. > > > > > > There are only a very small number of browser tests that run with single > > process > > > right? we can maybe write these in other ways and then it wouldn't be a > > problem > > > to just exit()? > > > Which patchset has this? > > > > The build log was lost, but I remember there were only a couple of browser > tests > > failing. I agree that we should just rewrite them. > > > > > and for android_webview, which is the only production use of single process, > > > that never exits so it's not an issue. > > > > Makes sense, but what about a --single-process mode? Is it okay to crash it > when > > the process is going to exit? > > Right, this was never officially supported so it always had regressions. At > least for the last few months, it's been broken in chrome. i.e. this is the > crash that I get on chrome shutdown in single process mode: > http://pastebin.com/tYJLB33q . Given the tradeoff of simplifying production > renderer shutdown with your proposal, vs adding more complexity to support > single process shutdown, I think we shouldn't start supporting the latter. > > btw one option could be that when running in --single-process mode, we exit() > the entire process at shutdown. this would always give the "unclean shutdown" > warning but that seems ok. How is the option different from the proposal of calling exit(0) at the beginning of RenderThreadImpl::Shutdown? > > > > > > (Sorry for duplicating the discussion between this thread and chromium-dev@. > > Let's discuss the details on this thread.) > > > On 2016/10/27 13:58:14, haraken wrote: > > I think SingleProcessMemoryTracingTest is the only test running with the > single > > process mode. We can just rewrite it. > > There's a bunch more, mostly in content/renderer (and one in content/child): > https://cs.chromium.org/search/?q=ksingleprocess+file:browsertest&sq=package:... > I'm responsible for adding these, when I converted them as part of removing the > old ui_test harness. These integration tests want to poke at code in the > renderer, so they were converted to run in single process mode and then call > PostTaskToInProcessRendererAndWait to execute the test body on the renderer > thread. One way to convert them would be to move the test body to a mojo > interface that's implemented in the renderer. The test code in the browser > process can just make (async) mojo IPC calls to it and set a callback that would > quit a nested message loop. see this test interface > content/public/test/test_mojo_service.mojom which is implemented in > content/shell/renderer/shell_content_renderer_client.cc Ah, you're right. Although SingleProcessMemoryTracingTest is the only test that fails when we call exit(0) at RenderThreadImpl::Shutdown (i.e., PS20).
On 2016/10/27 20:29:36, haraken wrote: > On 2016/10/27 17:01:03, jam wrote: > > On 2016/10/26 18:28:17, haraken wrote: > > > On 2016/10/26 18:01:04, jam wrote: > > > > On 2016/10/17 15:08:29, haraken wrote: > > > > > On 2016/10/17 15:01:06, jam wrote: > > > > > > Sorry I completely missed this reply till now. > > > > > > > > > > > > On 2016/10/06 01:34:39, haraken wrote: > > > > > > > jam@: > > > > > > > > > > > > > > On 2016/10/05 17:50:03, jam wrote: > > > > > > > > On 2016/10/05 12:12:42, haraken wrote: > > > > > > > > > jochen@, jam@, dcheng@, torne@: > > > > > > > > > > > > > > > > > > I'm working on removing a graceful shutdown sequence from the > > > renderer > > > > > > > process > > > > > > > > > (more context: > > > > > > > > > > > > > > > > > > > > > > > > > https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8). > > > > > > > > > PS10 passes all tests, but what do you think about it? > > > > > > > > > > > > > > > > > > At first, I tried to simply remove blink::shutdown from the > > shutdown > > > > > > > sequence > > > > > > > > > but it broke a lot of things. If we simply remove > blink::shutdown, > > > we > > > > > end > > > > > > up > > > > > > > > > with destructing RenderThreadImpl and MessageLoop without > stopping > > > > > worker > > > > > > > > > threads. Then the worker threads can crash when they post a task > > to > > > an > > > > > > > > > already-destructed MessageLoop. > > > > > > > > > > > > > > > > > > Next, I tried to forcibly call exit(0) at the beginning of > > > > > > > > > RenderThreadImpl::Shutdown(). However, this didn't work well in > a > > > > > > > > single-process > > > > > > > > > mode (and browser tests) because it forcibly exits the process > > > before > > > > > the > > > > > > > > > browser side is ready to exit. > > > > > > > > > > > > > > > > Which patchset is this so I can look some more? > > > > > > > > > > > > > > For example, PS4. > > > > > > > > > > > > doh, unfortunately the try run data is gone. > > > > > > > > > > I can upload a new CL but what I did is just to add exit(0) at the > > beginning > > > > of > > > > > RenderThreadImpl::Shutdown(). > > > > > > > > > > It works in a multi-process mode but doesn't work with a single-process > > mode > > > > > because it suddenly exists the process before the browser side is ready > to > > > > exit. > > > > > Some browser tests were failing. > > > > > > > > There are only a very small number of browser tests that run with single > > > process > > > > right? we can maybe write these in other ways and then it wouldn't be a > > > problem > > > > to just exit()? > > > > Which patchset has this? > > > > > > The build log was lost, but I remember there were only a couple of browser > > tests > > > failing. I agree that we should just rewrite them. > > > > > > > and for android_webview, which is the only production use of single > process, > > > > that never exits so it's not an issue. > > > > > > Makes sense, but what about a --single-process mode? Is it okay to crash it > > when > > > the process is going to exit? > > > > Right, this was never officially supported so it always had regressions. At > > least for the last few months, it's been broken in chrome. i.e. this is the > > crash that I get on chrome shutdown in single process mode: > > http://pastebin.com/tYJLB33q . Given the tradeoff of simplifying production > > renderer shutdown with your proposal, vs adding more complexity to support > > single process shutdown, I think we shouldn't start supporting the latter. > > > > btw one option could be that when running in --single-process mode, we exit() > > the entire process at shutdown. this would always give the "unclean shutdown" > > warning but that seems ok. > > How is the option different from the proposal of calling exit(0) at the > beginning of RenderThreadImpl::Shutdown? I meant it would be part of the single process browser shutdown code (i.e. code in content/browser/) > > > > > > > > > > > (Sorry for duplicating the discussion between this thread and > chromium-dev@. > > > Let's discuss the details on this thread.) > > > > > > On 2016/10/27 13:58:14, haraken wrote: > > > I think SingleProcessMemoryTracingTest is the only test running with the > > single > > > process mode. We can just rewrite it. > > > > There's a bunch more, mostly in content/renderer (and one in content/child): > > > https://cs.chromium.org/search/?q=ksingleprocess+file:browsertest&sq=package:... > > I'm responsible for adding these, when I converted them as part of removing > the > > old ui_test harness. These integration tests want to poke at code in the > > renderer, so they were converted to run in single process mode and then call > > PostTaskToInProcessRendererAndWait to execute the test body on the renderer > > thread. One way to convert them would be to move the test body to a mojo > > interface that's implemented in the renderer. The test code in the browser > > process can just make (async) mojo IPC calls to it and set a callback that > would > > quit a nested message loop. see this test interface > > content/public/test/test_mojo_service.mojom which is implemented in > > content/shell/renderer/shell_content_renderer_client.cc > > Ah, you're right. Although SingleProcessMemoryTracingTest is the only test that > fails when we call exit(0) at RenderThreadImpl::Shutdown (i.e., PS20).
primiano@: As discussed in the chromium-dev test, would you mind rewriting the tracing test so that it works in a single-process mode? In practice, the tracing test is the only test that is failing with the exit(0) approach. Once we rewrite the test, I think we can make the change :)
On 2016/11/03 14:58:17, haraken wrote: > primiano@: As discussed in the chromium-dev test, would you mind rewriting the > tracing test so that it works in a single-process mode? In practice, the tracing > test is the only test that is failing with the exit(0) approach. Once we rewrite > the test, I think we can make the change :) This week I'm sheriffing, traveling, and I have a backlog of docs to write. If mine is the only test blocking this please go ahead, disable it and assign me a bug to re-enable that. Sorry, just tough week :/
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_compile_dbg_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_comp...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
(Sorry, I was busy with other stuff -- I'm now back on this CL.) torne@: PS24 passes all tests except content_browsertests on Android. Do you have any idea on why they're failing on Android? As far as I look at the test log, I cannot find anything suspicious.
All these tests are single-process mode tests, as far as I can see: https://cs.chromium.org/search/?q=AppendSwitch%5C(switches::kSingleProcess&sq... So, yes, if you call _exit(0) in renderer shutdown they will indeed fail ;)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
Of course this tests are single process tests and this are tested on http://best-essay-writing-service-reviews.com/ of best essay writing service reviews and I have tested that if the call reaches call _exit(0) it shuts down.
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
Patchset #32 (id:620001) has been deleted
torne@: Hmm, I still cannot pass single-process-mode tests on linux_android_rel :/ Patch set 32 is doing the following thing: - If we're in a single-process mode, renderer doesn't call _exit(0). Instead, leak RenderThreadImpl. Then browser call _exit(0) at the very end of the shutdown sequence. I think it should work, but linux_android_rel is still failing: https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi... There's no debug info but do you have any insight? Note: Browser needs to call _exit(0) at the end of the shutdown sequence to prevent LeakyInstances from getting destructed. We cannot destruct some of the LeakyInstances because we have not yet destructed RenderThreadImpl. Note: I'm looking for a solution that doesn't need to rewrite all single-process-mode tests to multi-process-mode. If there's no easy way, I'll do that.
On 2016/12/14 13:05:09, haraken wrote: > torne@: Hmm, I still cannot pass single-process-mode tests on linux_android_rel > :/ > > Patch set 32 is doing the following thing: > > - If we're in a single-process mode, renderer doesn't call _exit(0). Instead, > leak RenderThreadImpl. Then browser call _exit(0) at the very end of the > shutdown sequence. > > I think it should work, but linux_android_rel is still failing: > https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi... > > There's no debug info but do you have any insight? > > Note: Browser needs to call _exit(0) at the end of the shutdown sequence to > prevent LeakyInstances from getting destructed. We cannot destruct some of the > LeakyInstances because we have not yet destructed RenderThreadImpl. > > Note: I'm looking for a solution that doesn't need to rewrite all > single-process-mode tests to multi-process-mode. If there's no easy way, I'll do > that. Browser tests shut the browser down *before* the test ends, so if you _exit(0) then the test will never complete. The browser test runner expects that BrowserMain() returns normally to the caller. Are you only testing this on Android intentionally? It seems like you could test these on Linux more easily.
On 2016/12/14 13:22:27, Torne wrote: > On 2016/12/14 13:05:09, haraken wrote: > > torne@: Hmm, I still cannot pass single-process-mode tests on > linux_android_rel > > :/ > > > > Patch set 32 is doing the following thing: > > > > - If we're in a single-process mode, renderer doesn't call _exit(0). Instead, > > leak RenderThreadImpl. Then browser call _exit(0) at the very end of the > > shutdown sequence. > > > > I think it should work, but linux_android_rel is still failing: > > > https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi... > > > > There's no debug info but do you have any insight? > > > > Note: Browser needs to call _exit(0) at the end of the shutdown sequence to > > prevent LeakyInstances from getting destructed. We cannot destruct some of the > > LeakyInstances because we have not yet destructed RenderThreadImpl. > > > > Note: I'm looking for a solution that doesn't need to rewrite all > > single-process-mode tests to multi-process-mode. If there's no easy way, I'll > do > > that. > > Browser tests shut the browser down *before* the test ends, so if you _exit(0) > then the test will never complete. The browser test runner expects that > BrowserMain() returns normally to the caller. Thanks for the info! However, the CL is passing Linux, Mac, Windows... so I'm wondering why it fails only on linux_android_rel/dbg. > Are you only testing this on Android intentionally? It seems like you could test > these on Linux more easily. See Patch Set 31. The CL is passing on Linux (and on my own desktops). It is failing only on linux_android_rel/dbg.
On 2016/12/14 16:10:51, haraken wrote: > On 2016/12/14 13:22:27, Torne wrote: > > On 2016/12/14 13:05:09, haraken wrote: > > > torne@: Hmm, I still cannot pass single-process-mode tests on > > linux_android_rel > > > :/ > > > > > > Patch set 32 is doing the following thing: > > > > > > - If we're in a single-process mode, renderer doesn't call _exit(0). > Instead, > > > leak RenderThreadImpl. Then browser call _exit(0) at the very end of the > > > shutdown sequence. > > > > > > I think it should work, but linux_android_rel is still failing: > > > > > > https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi... > > > > > > There's no debug info but do you have any insight? > > > > > > Note: Browser needs to call _exit(0) at the end of the shutdown sequence to > > > prevent LeakyInstances from getting destructed. We cannot destruct some of > the > > > LeakyInstances because we have not yet destructed RenderThreadImpl. > > > > > > Note: I'm looking for a solution that doesn't need to rewrite all > > > single-process-mode tests to multi-process-mode. If there's no easy way, > I'll > > do > > > that. > > > > Browser tests shut the browser down *before* the test ends, so if you _exit(0) > > then the test will never complete. The browser test runner expects that > > BrowserMain() returns normally to the caller. > > Thanks for the info! However, the CL is passing Linux, Mac, Windows... so I'm > wondering why it fails only on linux_android_rel/dbg. Android's browser test runner works differently. The other platforms don't handle it in the way I described and I'm not entirely clear on how they work. See https://cs.chromium.org/chromium/src/content/public/test/browser_test_base.cc... for the difference. > > Are you only testing this on Android intentionally? It seems like you could > test > > these on Linux more easily. > > See Patch Set 31. The CL is passing on Linux (and on my own desktops). It is > failing only on linux_android_rel/dbg. I'm surprised that's the case, but I guess I don't understand how the browser test runner works on the other platforms. You'll probably have to find out why these are different in the first place to understand if there's anything you can do about it.
On 2016/12/14 16:17:36, Torne wrote: > On 2016/12/14 16:10:51, haraken wrote: > > On 2016/12/14 13:22:27, Torne wrote: > > > On 2016/12/14 13:05:09, haraken wrote: > > > > torne@: Hmm, I still cannot pass single-process-mode tests on > > > linux_android_rel > > > > :/ > > > > > > > > Patch set 32 is doing the following thing: > > > > > > > > - If we're in a single-process mode, renderer doesn't call _exit(0). > > Instead, > > > > leak RenderThreadImpl. Then browser call _exit(0) at the very end of the > > > > shutdown sequence. > > > > > > > > I think it should work, but linux_android_rel is still failing: > > > > > > > > > > https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi... > > > > > > > > There's no debug info but do you have any insight? > > > > > > > > Note: Browser needs to call _exit(0) at the end of the shutdown sequence > to > > > > prevent LeakyInstances from getting destructed. We cannot destruct some of > > the > > > > LeakyInstances because we have not yet destructed RenderThreadImpl. > > > > > > > > Note: I'm looking for a solution that doesn't need to rewrite all > > > > single-process-mode tests to multi-process-mode. If there's no easy way, > > I'll > > > do > > > > that. > > > > > > Browser tests shut the browser down *before* the test ends, so if you > _exit(0) > > > then the test will never complete. The browser test runner expects that > > > BrowserMain() returns normally to the caller. > > > > Thanks for the info! However, the CL is passing Linux, Mac, Windows... so I'm > > wondering why it fails only on linux_android_rel/dbg. > > Android's browser test runner works differently. The other platforms don't > handle it in the way I described and I'm not entirely clear on how they work. > See > https://cs.chromium.org/chromium/src/content/public/test/browser_test_base.cc... > for the difference. > > > > Are you only testing this on Android intentionally? It seems like you could > > test > > > these on Linux more easily. > > > > See Patch Set 31. The CL is passing on Linux (and on my own desktops). It is > > failing only on linux_android_rel/dbg. > > I'm surprised that's the case, but I guess I don't understand how the browser > test runner works on the other platforms. You'll probably have to find out why > these are different in the first place to understand if there's anything you can > do about it. Definitely! Thanks for the great hint :)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
torne@: See Patch Set 36. I moved _exit(0) to BrowserTestBase's destructor, but it is still failing only on android_linux (no useful log: https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...). Maybe Android is doing something meaningful in at_exit handlers?
On 2016/12/16 10:22:56, haraken wrote: > torne@: See Patch Set 36. I moved _exit(0) to BrowserTestBase's destructor, but > it is still failing only on android_linux (no useful log: > https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...). > > Maybe Android is doing something meaningful in at_exit handlers? I doubt it. I'm not sure what's up at this point; you'll probably have to debug yourself on an android device.
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_android on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/cast_shell_a...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_clang on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_clang/builds/...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Yaaaaaaaaaaaaaaaaaaaaaay!!! The patch set 59 finally passed all tests on all bots :D I'll write a detailed comment and send the CL for review.
Had a quick look and this seems like a reasonable way to neuter this. Will do a proper review when you have the final CL. Good job persevering here :p
Description was changed from ========== Stop shutting down the renderer process (CL description will be added once we have agreed on the strategy.) BUG=639244 ========== to ========== Remove RenderThreadImpl::Shutdown RenderThreadImpl::Shutdown has been shutting down Blink and V8 gracefully, but the graceful shutdown has caused tons of use-after-free bugs (and many engineers has spent lots of time fixing ordering issues around the shutdown). As discussed in chromium-dev@ (https://groups.google.com/a/chromium.org/d/topic/chromium-dev/OLS4JSZvowI/dis...) and platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc...), there is no reason we have to shut down the renderer gracefully. It's just causing use-after-free bugs and wasting performance. Hence this CL removes RenderThreadImpl::Shutdown and instead calls _exit(0) in ~ChildProcess. We need a couple of special handling in a single-process mode. See the comment in ~ChildProcess for more details. BUG=639244 ==========
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
torne@: PTAL overall jam@: PTAL at content/ dcheng@: PTAL at base/
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_n5x_swarming_rel on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_n5x_...)
some questions https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... File content/child/child_process.cc (right): https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... content/child/child_process.cc:74: if (main_thread_->IsRenderThread()) { why not do this work in RenderThreadImpl::Shutdown()? https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... content/child/child_process.cc:94: base::AtExitManager::DisableAllAtExitManagers(); won't that disable at exit managers for stuff that doesn't belong to this thread as well (since we're running in single process mode)?
https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... File content/child/child_process.cc (right): https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... content/child/child_process.cc:74: if (main_thread_->IsRenderThread()) { On 2017/01/11 12:26:25, jochen wrote: > why not do this work in RenderThreadImpl::Shutdown()? If there's a good way to do that it seems like a good idea, but I'm not entirely sure; the other change here is that we leak the thread object instead of destroying it. https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... content/child/child_process.cc:94: base::AtExitManager::DisableAllAtExitManagers(); On 2017/01/11 12:26:25, jochen wrote: > won't that disable at exit managers for stuff that doesn't belong to this thread > as well (since we're running in single process mode)? Yes. The assumption is that as the only shipping configuration for single process is webview on android (where shutdown of the browser process never happens at all as it only ever exits by SIGKILL from the OS), that this doesn't matter, as long as all the tests that run in single process mode still pass (which they appear to). https://codereview.chromium.org/2309153002/diff/1190001/content/renderer/rend... File content/renderer/render_thread_impl.cc (left): https://codereview.chromium.org/2309153002/diff/1190001/content/renderer/rend... content/renderer/render_thread_impl.cc:939: void RenderThreadImpl::Shutdown() { Should we keep the override of Shutdown and just have it state NOTREACHED() for documentation purposes?
https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... File content/child/child_process.cc (right): https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... content/child/child_process.cc:74: if (main_thread_->IsRenderThread()) { On 2017/01/11 12:26:25, jochen wrote: > why not do this work in RenderThreadImpl::Shutdown()? Because we need to call main_thread_.release(). It needs to be done only on the ChildProcess side...? https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... content/child/child_process.cc:94: base::AtExitManager::DisableAllAtExitManagers(); On 2017/01/11 12:26:25, jochen wrote: > won't that disable at exit managers for stuff that doesn't belong to this thread > as well (since we're running in single process mode)? You're right. Ideally we should stop calling at-exit managers only on this thread but currently there's no easy way to do that. Given the discussion on chromium-dev, it seems like a single-process mode is used only in: - a number of browser tests: we want to rewrite them in multi-process mode. - android webview: it never shuts down; i.e., it doesn't hit this call path. So I guess it would be okay to not worry about the at-exit manager problem too seriously here (as long as the existing single-process browser tests pass).
https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... File content/child/child_process.cc (right): https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... content/child/child_process.cc:74: if (main_thread_->IsRenderThread()) { On 2017/01/11 at 12:52:25, haraken wrote: > On 2017/01/11 12:26:25, jochen wrote: > > why not do this work in RenderThreadImpl::Shutdown()? > > Because we need to call main_thread_.release(). It needs to be done only on the ChildProcess side...? Shutdown() could return an enum indicating what to do?
On 2017/01/11 12:53:00, jochen wrote: > https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... > File content/child/child_process.cc (right): > > https://codereview.chromium.org/2309153002/diff/1190001/content/child/child_p... > content/child/child_process.cc:74: if (main_thread_->IsRenderThread()) { > On 2017/01/11 at 12:52:25, haraken wrote: > > On 2017/01/11 12:26:25, jochen wrote: > > > why not do this work in RenderThreadImpl::Shutdown()? > > > > Because we need to call main_thread_.release(). It needs to be done only on > the ChildProcess side...? > > Shutdown() could return an enum indicating what to do? I can do that, but do you really want to return ShouldLeakThisInstance from ChildThreadImpl::Shutdown?
i find it conceptually nicer than having the ChildThread know about the renderer thread
On 2017/01/11 13:01:28, jochen wrote: > i find it conceptually nicer than having the ChildThread know about the renderer > thread Well, we could just have ChildThread have a ShouldBeDestroyed method, if that's the concern?
On 2017/01/11 at 13:02:27, torne wrote: > On 2017/01/11 13:01:28, jochen wrote: > > i find it conceptually nicer than having the ChildThread know about the renderer > > thread > > Well, we could just have ChildThread have a ShouldBeDestroyed method, if that's the concern? yeah, that would also work
On 2017/01/11 13:07:16, jochen wrote: > On 2017/01/11 at 13:02:27, torne wrote: > > On 2017/01/11 13:01:28, jochen wrote: > > > i find it conceptually nicer than having the ChildThread know about the > renderer > > > thread > > > > Well, we could just have ChildThread have a ShouldBeDestroyed method, if > that's the concern? > > yeah, that would also work It won't work so nicely since we need to distinguish the three cases: a) render thread & single-process mode => leak main_thread_ b) render thread & multi-process mode => call _exit() c) other thread => call main_thread_->Shutdown() & main_thread_.reset()
On 2017/01/11 13:15:51, haraken wrote: > On 2017/01/11 13:07:16, jochen wrote: > > On 2017/01/11 at 13:02:27, torne wrote: > > > On 2017/01/11 13:01:28, jochen wrote: > > > > i find it conceptually nicer than having the ChildThread know about the > > renderer > > > > thread > > > > > > Well, we could just have ChildThread have a ShouldBeDestroyed method, if > > that's the concern? > > > > yeah, that would also work > > It won't work so nicely since we need to distinguish the three cases: > > a) render thread & single-process mode => leak main_thread_ > b) render thread & multi-process mode => call _exit() > c) other thread => call main_thread_->Shutdown() & main_thread_.reset() We could implement ChildThread::ShouldExitOrLeakOrReset but it would be weird...
On 2017/01/11 13:15:51, haraken wrote: > On 2017/01/11 13:07:16, jochen wrote: > > On 2017/01/11 at 13:02:27, torne wrote: > > > On 2017/01/11 13:01:28, jochen wrote: > > > > i find it conceptually nicer than having the ChildThread know about the > > renderer > > > > thread > > > > > > Well, we could just have ChildThread have a ShouldBeDestroyed method, if > > that's the concern? > > > > yeah, that would also work > > It won't work so nicely since we need to distinguish the three cases: > > a) render thread & single-process mode => leak main_thread_ > b) render thread & multi-process mode => call _exit() > c) other thread => call main_thread_->Shutdown() & main_thread_.reset() ChildProcess can call Shutdown, and then call ShouldBeDestroyed to decide whether to leak the thread or not. In the _exit case, RenderThreadImpl::Shutdown will already have called _exit and so it doesn't matter what ChildProcess does :)
On 2017/01/11 13:17:27, Torne wrote: > On 2017/01/11 13:15:51, haraken wrote: > > On 2017/01/11 13:07:16, jochen wrote: > > > On 2017/01/11 at 13:02:27, torne wrote: > > > > On 2017/01/11 13:01:28, jochen wrote: > > > > > i find it conceptually nicer than having the ChildThread know about the > > > renderer > > > > > thread > > > > > > > > Well, we could just have ChildThread have a ShouldBeDestroyed method, if > > > that's the concern? > > > > > > yeah, that would also work > > > > It won't work so nicely since we need to distinguish the three cases: > > > > a) render thread & single-process mode => leak main_thread_ > > b) render thread & multi-process mode => call _exit() > > c) other thread => call main_thread_->Shutdown() & main_thread_.reset() > > ChildProcess can call Shutdown, and then call ShouldBeDestroyed to decide > whether to leak the thread or not. In the _exit case, RenderThreadImpl::Shutdown > will already have called _exit and so it doesn't matter what ChildProcess does > :) Done.
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
lgtm
LGTM, thanks, this seems a bit cleaner :)
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_n5x_swarming_rel on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_n5x_...)
https://codereview.chromium.org/2309153002/diff/1230001/base/at_exit.cc File base/at_exit.cc (right): https://codereview.chromium.org/2309153002/diff/1230001/base/at_exit.cc#newco... base/at_exit.cc:85: if (!g_disable_managers) Is it important to pop the tasks like this? I wonder if we should just early return in the dtor?
https://codereview.chromium.org/2309153002/diff/1230001/base/at_exit.cc File base/at_exit.cc (right): https://codereview.chromium.org/2309153002/diff/1230001/base/at_exit.cc#newco... base/at_exit.cc:85: if (!g_disable_managers) On 2017/01/12 02:22:38, dcheng wrote: > Is it important to pop the tasks like this? I wonder if we should just early > return in the dtor? Fixed.
base LGTM with a nit https://codereview.chromium.org/2309153002/diff/1250001/base/at_exit.h File base/at_exit.h (right): https://codereview.chromium.org/2309153002/diff/1250001/base/at_exit.h#newcode53 base/at_exit.h:53: // process mode. See a comment in ~ChildProcess(). Nit: It would be nice to reword this comment in a way so it doesn't refer to code in //content (which //base can't use). Perhaps just omit the second half of the sentence.
On 2017/01/12 04:16:34, dcheng wrote: > base LGTM with a nit > > https://codereview.chromium.org/2309153002/diff/1250001/base/at_exit.h > File base/at_exit.h (right): > > https://codereview.chromium.org/2309153002/diff/1250001/base/at_exit.h#newcode53 > base/at_exit.h:53: // process mode. See a comment in ~ChildProcess(). > Nit: It would be nice to reword this comment in a way so it doesn't refer to > code in //content (which //base can't use). Perhaps just omit the second half of > the sentence. Done.
The CQ bit was checked by haraken@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from jochen@chromium.org, torne@chromium.org, dcheng@chromium.org Link to the patchset: https://codereview.chromium.org/2309153002/#ps1270001 (title: "temp")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by haraken@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 1270001, "attempt_start_ts": 1484203269359630, "parent_rev": "e3d63f666dd906c320f67c7286f2d9a9316c2ef5", "commit_rev": "bbfdd9f0669c9856883ffbf2cd9909e2a4df9dcf"}
Message was sent while issue was closed.
Description was changed from ========== Remove RenderThreadImpl::Shutdown RenderThreadImpl::Shutdown has been shutting down Blink and V8 gracefully, but the graceful shutdown has caused tons of use-after-free bugs (and many engineers has spent lots of time fixing ordering issues around the shutdown). As discussed in chromium-dev@ (https://groups.google.com/a/chromium.org/d/topic/chromium-dev/OLS4JSZvowI/dis...) and platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc...), there is no reason we have to shut down the renderer gracefully. It's just causing use-after-free bugs and wasting performance. Hence this CL removes RenderThreadImpl::Shutdown and instead calls _exit(0) in ~ChildProcess. We need a couple of special handling in a single-process mode. See the comment in ~ChildProcess for more details. BUG=639244 ========== to ========== Remove RenderThreadImpl::Shutdown RenderThreadImpl::Shutdown has been shutting down Blink and V8 gracefully, but the graceful shutdown has caused tons of use-after-free bugs (and many engineers has spent lots of time fixing ordering issues around the shutdown). As discussed in chromium-dev@ (https://groups.google.com/a/chromium.org/d/topic/chromium-dev/OLS4JSZvowI/dis...) and platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc...), there is no reason we have to shut down the renderer gracefully. It's just causing use-after-free bugs and wasting performance. Hence this CL removes RenderThreadImpl::Shutdown and instead calls _exit(0) in ~ChildProcess. We need a couple of special handling in a single-process mode. See the comment in ~ChildProcess for more details. BUG=639244 Review-Url: https://codereview.chromium.org/2309153002 Cr-Commit-Position: refs/heads/master@{#443175} Committed: https://chromium.googlesource.com/chromium/src/+/bbfdd9f0669c9856883ffbf2cd99... ==========
Message was sent while issue was closed.
Committed patchset #64 (id:1270001) as https://chromium.googlesource.com/chromium/src/+/bbfdd9f0669c9856883ffbf2cd99...
Message was sent while issue was closed.
primiano@chromium.org changed reviewers: + brucedawson@chromium.org
Message was sent while issue was closed.
+brucedawson (see comment below) https://codereview.chromium.org/2309153002/diff/1270001/content/renderer/rend... File content/renderer/render_thread_impl.cc (right): https://codereview.chromium.org/2309153002/diff/1270001/content/renderer/rend... content/renderer/render_thread_impl.cc:951: _exit(0); You might want to check this with +brucedawson. IIRC he mentioned in the past that on WIndows _exit(0) doesn't immediately quit but runs the CRT dtors and there is a "really exit now" API that you might want to use instead.
Message was sent while issue was closed.
Found the thread. See go/syuei and crbug.com/603131 . The api I was mentioning is TerminateProcess
Message was sent while issue was closed.
On 2017/01/12 11:18:34, Primiano Tucci (back but slow) wrote: > Found the thread. See go/syuei and crbug.com/603131 . The api I was mentioning > is TerminateProcess It's probably best to just use base::Process::Current().Terminate() ? That appears to do the right thing on each OS.. (it uses SIGTERM on posix, which is fine).
Message was sent while issue was closed.
On 2017/01/12 11:22:46, Torne wrote: > On 2017/01/12 11:18:34, Primiano Tucci (back but slow) wrote: > > Found the thread. See go/syuei and crbug.com/603131 . The api I was mentioning > > is TerminateProcess > > It's probably best to just use base::Process::Current().Terminate() ? That > appears to do the right thing on each OS.. (it uses SIGTERM on posix, which is > fine). I tried Process::Terminate in some of the patches but it didn't work because Process::Terminate can wait in some scenarios (some tests timed out). Maybe can we do: #if defined(OS_WIN) HANDLE handle = ::OpenProcess(PROCESS_ALL_ACCESS, 0, ::GetCurrentProcessId()); ::TerminateProcess(handle, 0); #else _exit(0); #endif ?
Message was sent while issue was closed.
On 2017/01/12 11:25:31, haraken wrote: > On 2017/01/12 11:22:46, Torne wrote: > > On 2017/01/12 11:18:34, Primiano Tucci (back but slow) wrote: > > > Found the thread. See go/syuei and crbug.com/603131 . The api I was > mentioning > > > is TerminateProcess > > > > It's probably best to just use base::Process::Current().Terminate() ? That > > appears to do the right thing on each OS.. (it uses SIGTERM on posix, which is > > fine). The problem is that: 1. test_launcher handles SIGTERM, where it writes to a pipe and, realistically expects the parent to kill the child (https://cs.chromium.org/chromium/src/base/test/launcher/test_launcher.cc?rcl=...) 2. Tsan handles sigterm: third_party/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc > > I tried Process::Terminate in some of the patches but it didn't work because > Process::Terminate can wait in some scenarios (some tests timed out). > > Maybe can we do: > > #if defined(OS_WIN) > HANDLE handle = ::OpenProcess(PROCESS_ALL_ACCESS, 0, ::GetCurrentProcessId()); > ::TerminateProcess(handle, 0); > #else > _exit(0); > #endif > > ? Yeah, or maybe incapsulate it in //base/ into a KillSoonWIthoutRunningExitHandlers()
Message was sent while issue was closed.
The odd behavior of _exit() on Windows - where it calls destructors - is for debug builds only. The theory, I think, is that destructors should be called so that leak detection works, but doing that in the case of _exit() seems odd. In release builds _exit() calls TerminateProcess.
Message was sent while issue was closed.
This will make you feeling great with informative data. As you know the jericoacoara rentals providing the excellent places for making your dream trip. Visit more at http://www.jerihome.com/ |