|
|
Created:
4 years, 4 months ago by haraken Modified:
4 years, 4 months ago CC:
chromium-reviews, darin-cc_chromium.org, jam, mlamouri+watch-content_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionStop calling blink::shutdown
RenderThreadImpl::Shutdown has been trying to shut down Blink and V8 gracefully,
but the graceful shutdown has caused tons of use-after-free bugs
(and many engineers has spent lots of time fixing ordering issues around the shutdown).
As discussed in blink-dev@ (https://groups.google.com/a/chromium.org/d/topic/blink-dev/kk4VX0xRB7I/discussion)
and platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc12k91NTFk/discussion),
there is no reason we have to shut down the renderer gracefully.
It's just causing use-after-free bugs and wasting performance.
Hence, this CL stops calling blink::shutdown, which had been
shutting down *some things* in Blink and V8 gracefully.
(Remember that blink::shutdown hadn't been shutting down everything;
a lot of objects in Blink and V8 had already been left as is without getting destructed.)
Ideally we should just call ProcessDied() at an earlier stage of
RenderThreadImpl::Shutdown(), but I'd like to defer the change to a separate CL.
BUG=639244
Committed: https://crrev.com/01cb51d26b17923ed2b5c3b59566f0fc9aed74ae
Cr-Commit-Position: refs/heads/master@{#413430}
Patch Set 1 #Patch Set 2 : temp #Patch Set 3 : temp #Patch Set 4 : temp #Patch Set 5 : temp #Patch Set 6 : temp #Patch Set 7 : temp #Patch Set 8 : temp #Patch Set 9 : temp #
Messages
Total messages: 58 (44 generated)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Exceeded global retry quota
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
haraken@chromium.org changed reviewers: + tzik@chromium.org
tzik@: Feel free to take over this CL if you want.
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
PS5 passes tests except linux asan :) Getting close.
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromeos_x86-generic_chromium_compile_only_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromeos_x86-ge...)
The CQ bit was checked by haraken@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Description was changed from ========== Stop calling blink::shutdown BUG= ========== to ========== Stop calling blink::shutdown RenderThreadImpl::Shutdown has been trying to shut down Blink and V8 gracefully, but the graceful shutdown has caused tons of use-after-free bugs (and many engineers has spent lots of time fixing ordering issues around the shutdown). As discussed in blink-dev@ (https://groups.google.com/a/chromium.org/d/topic/blink-dev/kk4VX0xRB7I/discus...) and platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc...), there is no reason we have to shut down the renderer gracefully. It's just causing use-after-free bugs and wasting performance. Hence, this CL stops calling blink::shutdown, which had been shutting down *some things* in Blink and V8 gracefully. (Remember that blink::shutdown hadn't been shutting down everything; a lot of objects in Blink and V8 had already been left as is without getting destructed.) Ideally we should just call ProcessDied() at an earlier stage of RenderThreadImpl::Shutdown(), but I'd like to defer the change to a separate CL. BUG=639244 ==========
haraken@chromium.org changed reviewers: + esprehn@chromium.org
Elliott: PTAL Now all tests pass. I'll send a PSA to blink-dev@ before landing.
torne@chromium.org changed reviewers: + torne@chromium.org
What's going to happen in single process mode? I've been trying to diagnose an issue where the single-process Android WebView somehow ends up shutting down the in-process renderer and then attempts to start it up again, resulting in a crash from the assertion failure in WTF::initialize that forbids re-initializing WTF in a process that has called WTF::shutdown. I've not yet been able to find the reason for this :( I would *guess* that the effect of your change here would be that we still crash in the second call to WTF::initialize because it also has an assert forbidding calling initialize twice? If so then that's probably not a huge deal. However, explicitly killing the process here in any way would be a problem, as that would no longer "crash" in a way we can get reports about...
> I would *guess* that the effect of your change here would be that we still crash > in the second call to WTF::initialize because it also has an assert forbidding > calling initialize twice? If so then that's probably not a huge deal. Correct. This CL won't change the behavior. On 2016/08/19 10:44:00, Torne wrote: > What's going to happen in single process mode? > > I've been trying to diagnose an issue where the single-process Android WebView > somehow ends up shutting down the in-process renderer and then attempts to start > it up again, resulting in a crash from the assertion failure in WTF::initialize > that forbids re-initializing WTF in a process that has called WTF::shutdown. > I've not yet been able to find the reason for this :( That sounds like a definite bug... Also, remember that we're already forcibly killing the renderer when IsFastShutdownPossible returns true. If that happens, you already cannot receive the crash report you want to get.
On 2016/08/19 10:58:16, haraken wrote: > > I would *guess* that the effect of your change here would be that we still > crash > > in the second call to WTF::initialize because it also has an assert forbidding > > calling initialize twice? If so then that's probably not a huge deal. > > Correct. This CL won't change the behavior. > > On 2016/08/19 10:44:00, Torne wrote: > > What's going to happen in single process mode? > > > > I've been trying to diagnose an issue where the single-process Android WebView > > somehow ends up shutting down the in-process renderer and then attempts to > start > > it up again, resulting in a crash from the assertion failure in > WTF::initialize > > that forbids re-initializing WTF in a process that has called WTF::shutdown. > > I've not yet been able to find the reason for this :( > > That sounds like a definite bug... Also, remember that we're already forcibly > killing the renderer when IsFastShutdownPossible returns true. If that happens, > you already cannot receive the crash report you want to get. Yes, it's a bug. We should never, ever enter the shutdown path in the first place. It doesn't make sense to shut down the in-process renderer unless you are also shutting down the browser process (at which point you would not then try to re-initialize the in-process renderer afterward). However, I've been unable to find the bug as of yet; if you have any ideas, please help ;) I would hope that IsFastShutdownPossible never returns true in single-process mode. That sounds like it would cause serious issues?
On 2016/08/19 11:01:00, Torne wrote: > On 2016/08/19 10:58:16, haraken wrote: > > > I would *guess* that the effect of your change here would be that we still > > crash > > > in the second call to WTF::initialize because it also has an assert > forbidding > > > calling initialize twice? If so then that's probably not a huge deal. > > > > Correct. This CL won't change the behavior. > > > > On 2016/08/19 10:44:00, Torne wrote: > > > What's going to happen in single process mode? > > > > > > I've been trying to diagnose an issue where the single-process Android > WebView > > > somehow ends up shutting down the in-process renderer and then attempts to > > start > > > it up again, resulting in a crash from the assertion failure in > > WTF::initialize > > > that forbids re-initializing WTF in a process that has called WTF::shutdown. > > > I've not yet been able to find the reason for this :( > > > > That sounds like a definite bug... Also, remember that we're already forcibly > > killing the renderer when IsFastShutdownPossible returns true. If that > happens, > > you already cannot receive the crash report you want to get. > > Yes, it's a bug. We should never, ever enter the shutdown path in the first > place. It doesn't make sense to shut down the in-process renderer unless you are > also shutting down the browser process (at which point you would not then try to > re-initialize the in-process renderer afterward). However, I've been unable to > find the bug as of yet; if you have any ideas, please help ;) > > I would hope that IsFastShutdownPossible never returns true in single-process > mode. That sounds like it would cause serious issues? I'm not familiar with how the single-thread mode is working, but maybe it's not entering the shutdown path (including IsFastShutdownPossible) in the first place, right?
On 2016/08/19 11:03:50, haraken wrote: > On 2016/08/19 11:01:00, Torne wrote: > > On 2016/08/19 10:58:16, haraken wrote: > > > > I would *guess* that the effect of your change here would be that we still > > > crash > > > > in the second call to WTF::initialize because it also has an assert > > forbidding > > > > calling initialize twice? If so then that's probably not a huge deal. > > > > > > Correct. This CL won't change the behavior. > > > > > > On 2016/08/19 10:44:00, Torne wrote: > > > > What's going to happen in single process mode? > > > > > > > > I've been trying to diagnose an issue where the single-process Android > > WebView > > > > somehow ends up shutting down the in-process renderer and then attempts to > > > start > > > > it up again, resulting in a crash from the assertion failure in > > > WTF::initialize > > > > that forbids re-initializing WTF in a process that has called > WTF::shutdown. > > > > I've not yet been able to find the reason for this :( > > > > > > That sounds like a definite bug... Also, remember that we're already > forcibly > > > killing the renderer when IsFastShutdownPossible returns true. If that > > happens, > > > you already cannot receive the crash report you want to get. > > > > Yes, it's a bug. We should never, ever enter the shutdown path in the first > > place. It doesn't make sense to shut down the in-process renderer unless you > are > > also shutting down the browser process (at which point you would not then try > to > > re-initialize the in-process renderer afterward). However, I've been unable to > > find the bug as of yet; if you have any ideas, please help ;) > > > > I would hope that IsFastShutdownPossible never returns true in single-process > > mode. That sounds like it would cause serious issues? > > I'm not familiar with how the single-thread mode is working, but maybe it's not > entering the shutdown path (including IsFastShutdownPossible) in the first > place, right? Single-process mode in general *does* enter the shutdown path when the browser is shutting down, and many tests that run in single-process mode rely on this and don't pass otherwise (I tried to entirely nuke shutting down the renderer in single-process to help debug this issue and couldn't submit the change as it breaks many tests). The Android WebView specifically doesn't *have* a browser shutdown path, though, and so the renderer shutdown path should never happen either.
haraken@chromium.org changed reviewers: + jochen@chromium.org
jochen: Would you take a look at this? Elliott is OOO. I already sent a PSA to blink-dev@: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/LnhBbz6z0Y8
lgtm
The CQ bit was checked by haraken@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Stop calling blink::shutdown RenderThreadImpl::Shutdown has been trying to shut down Blink and V8 gracefully, but the graceful shutdown has caused tons of use-after-free bugs (and many engineers has spent lots of time fixing ordering issues around the shutdown). As discussed in blink-dev@ (https://groups.google.com/a/chromium.org/d/topic/blink-dev/kk4VX0xRB7I/discus...) and platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc...), there is no reason we have to shut down the renderer gracefully. It's just causing use-after-free bugs and wasting performance. Hence, this CL stops calling blink::shutdown, which had been shutting down *some things* in Blink and V8 gracefully. (Remember that blink::shutdown hadn't been shutting down everything; a lot of objects in Blink and V8 had already been left as is without getting destructed.) Ideally we should just call ProcessDied() at an earlier stage of RenderThreadImpl::Shutdown(), but I'd like to defer the change to a separate CL. BUG=639244 ========== to ========== Stop calling blink::shutdown RenderThreadImpl::Shutdown has been trying to shut down Blink and V8 gracefully, but the graceful shutdown has caused tons of use-after-free bugs (and many engineers has spent lots of time fixing ordering issues around the shutdown). As discussed in blink-dev@ (https://groups.google.com/a/chromium.org/d/topic/blink-dev/kk4VX0xRB7I/discus...) and platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc...), there is no reason we have to shut down the renderer gracefully. It's just causing use-after-free bugs and wasting performance. Hence, this CL stops calling blink::shutdown, which had been shutting down *some things* in Blink and V8 gracefully. (Remember that blink::shutdown hadn't been shutting down everything; a lot of objects in Blink and V8 had already been left as is without getting destructed.) Ideally we should just call ProcessDied() at an earlier stage of RenderThreadImpl::Shutdown(), but I'd like to defer the change to a separate CL. BUG=639244 ==========
Message was sent while issue was closed.
Committed patchset #9 (id:160001)
Message was sent while issue was closed.
Description was changed from ========== Stop calling blink::shutdown RenderThreadImpl::Shutdown has been trying to shut down Blink and V8 gracefully, but the graceful shutdown has caused tons of use-after-free bugs (and many engineers has spent lots of time fixing ordering issues around the shutdown). As discussed in blink-dev@ (https://groups.google.com/a/chromium.org/d/topic/blink-dev/kk4VX0xRB7I/discus...) and platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc...), there is no reason we have to shut down the renderer gracefully. It's just causing use-after-free bugs and wasting performance. Hence, this CL stops calling blink::shutdown, which had been shutting down *some things* in Blink and V8 gracefully. (Remember that blink::shutdown hadn't been shutting down everything; a lot of objects in Blink and V8 had already been left as is without getting destructed.) Ideally we should just call ProcessDied() at an earlier stage of RenderThreadImpl::Shutdown(), but I'd like to defer the change to a separate CL. BUG=639244 ========== to ========== Stop calling blink::shutdown RenderThreadImpl::Shutdown has been trying to shut down Blink and V8 gracefully, but the graceful shutdown has caused tons of use-after-free bugs (and many engineers has spent lots of time fixing ordering issues around the shutdown). As discussed in blink-dev@ (https://groups.google.com/a/chromium.org/d/topic/blink-dev/kk4VX0xRB7I/discus...) and platform-architecture-dev@ (https://groups.google.com/a/chromium.org/d/topic/platform-architecture-dev/Zc...), there is no reason we have to shut down the renderer gracefully. It's just causing use-after-free bugs and wasting performance. Hence, this CL stops calling blink::shutdown, which had been shutting down *some things* in Blink and V8 gracefully. (Remember that blink::shutdown hadn't been shutting down everything; a lot of objects in Blink and V8 had already been left as is without getting destructed.) Ideally we should just call ProcessDied() at an earlier stage of RenderThreadImpl::Shutdown(), but I'd like to defer the change to a separate CL. BUG=639244 Committed: https://crrev.com/01cb51d26b17923ed2b5c3b59566f0fc9aed74ae Cr-Commit-Position: refs/heads/master@{#413430} ==========
Message was sent while issue was closed.
Patchset 9 (id:??) landed as https://crrev.com/01cb51d26b17923ed2b5c3b59566f0fc9aed74ae Cr-Commit-Position: refs/heads/master@{#413430}
Message was sent while issue was closed.
A revert of this CL (patchset #9 id:160001) has been created in https://codereview.chromium.org/2312593002/ by haraken@chromium.org. The reason for reverting is: I'll revert r413430 and its dependent CLs because r413430 caused issue 642072. The problem is that we cannot simply remove blink::shutdown because the following scenario can happen: 1) blink::shutdown is not called. Workers are still running. 2) RenderThreadImpl gets destructed. MessageLoop gets destructed. 3) The workers may access the RenderThreadImpl and MessageLoop. To fix the problem, we need to call ProcessDied() and forcibly kill the renderer process at the end of RenderThreadImpl::Shutdown().. |