Issue 583043005: Pending tasks in a message loop should be deleted before shutting down Blink (Closed)

Created:
6 years, 3 months ago by haraken

Modified:
6 years, 1 month ago

Reviewers:
jamesr, jochen (gone - plz use gerrit), jar (doing other things), Mads Ager (chromium), Ken Russell (switch to Gerrit), tkent

CC:
chromium-reviews, mkwst+moarreviews-renderer_chromium.org, jam, erikwright+watch_chromium.org, sadrul, darin-cc_chromium.org

Base URL:
https://chromium.googlesource.com/chromium/src.git@master

Project:
chromium

Visibility:
Public.

Description

Pending tasks in a message loop should be deleted before shutting down Blink

Currently Blink is shut down before all the pending tasks in the message loop are deleted. This is problematic in Oilpan because a destructor of the pending tasks can touch Oilpan objects. Because Oilpan is already detached from the renderer thread at that point, touching Oilpan objects in the destructor leads to a crash. (See the bug report for a concrete scenario.)

To prevent Blink objects from getting accessed after Blink is shut down, this CL deletes all pending tasks in a message loop before shutting down Blink.

BUG=411026
TEST=None. I cannot reproduce the crash.

Committed: https://crrev.com/fdd5612c20f777e1279efd7c1e99d82ed04afaaf
Cr-Commit-Position: refs/heads/master@{#296697}

Committed: https://crrev.com/16d32a9f7f6d1ebb639cacedb5156272a9fec764
Cr-Commit-Position: refs/heads/master@{#297338}

Committed: https://crrev.com/53f081de05b86f73eca4e383a16c8dc723b78a99
Cr-Commit-Position: refs/heads/master@{#303557}
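The ordering problem in the description can be illustrated with a toy model. This is only a sketch: `ToyHeap`, `ToyTask`, and `ToyLoop` are hypothetical stand-ins, not Chromium or Blink classes, and the counter replaces what would be a crash in the real renderer.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a managed heap that a thread can detach from.
struct ToyHeap {
  bool attached = true;
  int touches_after_detach = 0;
  void Touch() {
    if (!attached) ++touches_after_detach;  // the real code would crash here
  }
};

// A pending task whose destructor touches the heap.
class ToyTask {
 public:
  explicit ToyTask(ToyHeap* heap) : heap_(heap) {}
  ~ToyTask() { heap_->Touch(); }

 private:
  ToyHeap* heap_;
};

// Hypothetical loop that only stores tasks; destroying it deletes them.
class ToyLoop {
 public:
  void Post(std::unique_ptr<ToyTask> task) { pending_.push_back(std::move(task)); }
  void DeletePendingTasks() { pending_.clear(); }

 private:
  std::vector<std::unique_ptr<ToyTask>> pending_;
};

// Old order: detach the heap first, delete pending tasks afterwards.
int ShutdownThenDelete() {
  ToyHeap heap;
  {
    ToyLoop loop;
    loop.Post(std::make_unique<ToyTask>(&heap));
    heap.attached = false;  // "Blink shut down" while a task is still pending
  }  // loop destroyed: the task destructor runs after the detach
  return heap.touches_after_detach;
}

// Order proposed by this CL: delete pending tasks, then detach the heap.
int DeleteThenShutdown() {
  ToyHeap heap;
  ToyLoop loop;
  loop.Post(std::make_unique<ToyTask>(&heap));
  loop.DeletePendingTasks();  // task destructor runs while still attached
  heap.attached = false;
  return heap.touches_after_detach;
}
```

In this model `ShutdownThenDelete()` records one late touch and `DeleteThenShutdown()` records none, which is the difference the CL is after.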

Patch Set 1 #

Total comments: 3

Patch Set 2 : #

Patch Set 3 : #

Patch Set 4 : #

Total comments: 1

Patch Set 5 : #

Patch Set 6 : #

Patch Set 7 : #

Patch Set 8 : #

Total comments: 2

Patch Set 9 : #

Patch Set 10 : #

Total comments: 1

Patch Set 11 : #

Patch Set 12 : #

Patch Set 13 : #

Created: 6 years, 1 month ago

Stats (+44 lines, -13 lines)

M	content/common/gpu/client/gpu_channel_host.h	1 chunk	+3 lines, -0 lines	0 comments
M	content/common/gpu/client/gpu_channel_host.cc	2 chunks	+8 lines, -4 lines	0 comments
M	content/renderer/render_thread_impl.h	2 chunks	+7 lines, -0 lines	0 comments
M	content/renderer/render_thread_impl.cc	2 chunks	+20 lines, -4 lines	0 comments
M	content/renderer/renderer_main.cc	3 chunks	+6 lines, -5 lines	0 comments

Messages

Total messages: 64 (8 generated)

Mads Ager (chromium)

LGTM Oilpan exposes this issue but it seems bad in general that task destructors can ...

6 years, 3 months ago (2014-09-22 06:35:22 UTC) #3

haraken

jochen@: would you take a look at this when you have a cycle?

6 years, 3 months ago (2014-09-22 06:39:09 UTC) #4

jochen (gone - plz use gerrit)

i'm not sure this is the right approach. which tasks are touching oilpan objects? Can ...

6 years, 3 months ago (2014-09-22 06:44:45 UTC) #5

Mads Ager (chromium)

On 2014/09/22 at 06:44:45, jochen wrote: > i'm not sure this is the right approach. ...

6 years, 3 months ago (2014-09-22 06:59:09 UTC) #6

haraken

> i'm not sure this is the right approach. > > which tasks are touching ...

6 years, 3 months ago (2014-09-22 07:11:40 UTC) #7

jamesr

Can we just quit the message loop completely before deinitializing Blink? What tasks in the ...

6 years, 3 months ago (2014-09-22 14:34:47 UTC) #10

jamesr

https://codereview.chromium.org/583043005/diff/1/base/message_loop/message_loop.h File base/message_loop/message_loop.h (right): https://codereview.chromium.org/583043005/diff/1/base/message_loop/message_loop.h#newcode377 base/message_loop/message_loop.h:377: bool DeletePendingTasks(); this doesn't make sense. ~MessageLoop() does the ...

6 years, 3 months ago (2014-09-22 14:38:04 UTC) #11

haraken

Thanks James for review! https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_main.cc File content/renderer/renderer_main.cc (right): https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_main.cc#newcode238 content/renderer/renderer_main.cc:238: main_message_loop.DeletePendingTasks(); On 2014/09/22 14:38:04, jamesr ...

6 years, 3 months ago (2014-09-22 15:54:18 UTC) #12

jamesr

On 2014/09/22 15:54:18, haraken wrote: > Thanks James for review! > > https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_main.cc > File ...

6 years, 3 months ago (2014-09-22 16:40:33 UTC) #13

haraken

On 2014/09/22 16:40:33, jamesr wrote: > On 2014/09/22 15:54:18, haraken wrote: > > Thanks James ...

6 years, 3 months ago (2014-09-24 01:24:04 UTC) #14

On 2014/09/22 16:40:33, jamesr wrote:
> On 2014/09/22 15:54:18, haraken wrote:
> > Thanks James for review!
> > 
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > File content/renderer/renderer_main.cc (right):
> > 
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > content/renderer/renderer_main.cc:238:
main_message_loop.DeletePendingTasks();
> > On 2014/09/22 14:38:04, jamesr wrote:
> > > since you're deleting *all* pending tasks and not just Blink ones then you
> > > definitely do not care about anything else running, so not lgtm.  You
should
> > > simply shut down the loop before shutting down blink
> > 
> > I'm not familiar with a message loop, but what's a common way to completely
> shut
> > down the main_message_loop here?
> > 
> > At first I tried something like:
> > 
> > scoped_ptr<base::MessageLoop> main_message_loop(new base::MessageLoop()); 
//
> > line 155
> > {
> >   ...;
> >   main_message_loop.reset();  // line 238
> > }
> > 
> > but it crashed chrome :-/
> 
> 
> What is the crash stack?  If I run code like this standalone it works fine.
> 
> If somebody is trying to access the message loop after you are destroying it
> here to post a task then this patch won't fix the bug anyway, since this newly
> posted task will still be destroyed after blink is shut down.  You'll need to
> figure out who is accessing the message loop when and why.

I uploaded the change as patch set 2. It crashes with the following call
stack.

The problem is that RenderProcessImpl's destructor posts a task to the message
loop, so the message loop needs to be kept alive while RenderProcessImpl's
destructor runs. Thus we cannot destruct the message loop before destructing
the RenderProcessImpl.

That being said, this is a weird situation. Should we perhaps restructure the
code so that RenderProcessImpl's destructor doesn't post any task to the
message loop? Any advice is welcome :)


SUMMARY: AddressSanitizer: SEGV ??:0 ??
==9==ABORTING
[36:36:0924/101619:FATAL:thread_task_runner_handle.cc(23)] Check failed: current.
#0 0x7f2961f1ea5f __interceptor_backtrace
#1 0x7f296bc49384 base::debug::StackTrace::StackTrace()
#2 0x7f296c09c6f2 logging::LogMessage::~LogMessage()
#3 0x7f296c70bfab base::ThreadTaskRunnerHandle::Get()
#4 0x7f296c82a847 base::Timer::PostNewScheduledTask()
#5 0x7f296c829702 base::Timer::Reset()
#6 0x7f296c828b1a base::Timer::Start()
#7 0x7f29a423b0ab base::BaseTimerMethodPointer<>::Start()
#8 0x7f29a421d1b6 content::BlinkPlatformImpl::setSharedTimerFireInterval()
#9 0x7f297cc234df blink::Scheduler::setSharedTimerFireInterval()
#10 0x7f297cc3120f blink::setSharedTimerFireInterval()
#11 0x7f297cbdbdd4 blink::MainThreadSharedTimer::setFireInterval()
#12 0x7f297cbd7e1a blink::ThreadTimers::updateSharedTimer()
#13 0x7f297cb1e668 blink::TimerBase::setNextFireTime()
#14 0x7f297cb1dbac blink::TimerBase::start()
#15 0x7f297c8215ca blink::TimerBase::startOneShot()
#16 0x7f2981e09984 blink::FrameLoader::scheduleCheckCompleted()
#17 0x7f2981e08bf3 blink::FrameLoader::setDefersLoading()
#18 0x7f298201e100 blink::Page::setDefersLoading()
#19 0x7f29820ba8ea blink::ScopedPageLoadDeferrer::detach()
#20 0x7f29820bacbf blink::ScopedPageLoadDeferrer::~ScopedPageLoadDeferrer()
#21 0x7f297e495263 blink::WebView::didExitModalLoop()
#22 0x7f29a4cd7661 content::RenderThreadImpl::Shutdown()
#23 0x7f29a4cd7f51 content::RenderThreadImpl::Shutdown()
#24 0x7f29a45cf571 content::ChildProcess::~ChildProcess()
#25 0x7f29a4cb34ca content::RenderProcess::~RenderProcess()
#26 0x7f29a4cb2ab4 content::RenderProcessImpl::~RenderProcessImpl()
#27 0x7f29a4ff8105 content::RendererMain()
#28 0x7f296baafb93 content::RunZygote()
#29 0x7f296bab14ff content::RunNamedProcessTypeMain()
#30 0x7f296bac16e0 content::ContentMainRunnerImpl::Run()
#31 0x7f296baac995 content::ContentMain()
#32 0x7f2961f7ceba ChromeMain
#33 0x7f2961f7cabe main
#34 0x7f29557ba76d __libc_start_main
#35 0x7f2961f7c7ad <unknown>
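
The failure mode in this stack (a destructor posting a task after the loop is gone) can be sketched in isolation. Everything here is hypothetical: `g_current_loop` only imitates the idea of a per-thread "current loop" handle, and `TryPostTask()` returns false where the real code hits the failed check.

```cpp
#include <memory>

class ToyLoop;

// Hypothetical stand-in for the per-thread "current loop" handle.
static ToyLoop* g_current_loop = nullptr;

class ToyLoop {
 public:
  ToyLoop() { g_current_loop = this; }
  ~ToyLoop() { g_current_loop = nullptr; }
  int pending = 0;
};

// Returns false where the real code would fail its check and abort.
bool TryPostTask() {
  if (!g_current_loop) return false;
  ++g_current_loop->pending;
  return true;
}

// Hypothetical process object whose destructor posts a task.
class ToyProcess {
 public:
  explicit ToyProcess(bool* post_ok) : post_ok_(post_ok) {}
  ~ToyProcess() { *post_ok_ = TryPostTask(); }

 private:
  bool* post_ok_;
};

// Mirrors the patch set 2 attempt: reset the loop while the process object
// is still alive, then let the process destructor run.
bool PostSucceedsWhenLoopResetFirst() {
  bool post_ok = true;
  auto loop = std::make_unique<ToyLoop>();
  {
    ToyProcess process(&post_ok);
    loop.reset();  // loop destroyed first
  }  // ~ToyProcess runs here and finds no current loop
  return post_ok;
}
```

`PostSucceedsWhenLoopResetFirst()` returns false in this model, matching the observation that the loop must outlive the destructor that posts to it.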

haraken

On 2014/09/24 01:24:04, haraken wrote: > On 2014/09/22 16:40:33, jamesr wrote: > > On 2014/09/22 ...

6 years, 3 months ago (2014-09-24 02:09:40 UTC) #15

On 2014/09/24 01:24:04, haraken wrote:
> On 2014/09/22 16:40:33, jamesr wrote:
> > On 2014/09/22 15:54:18, haraken wrote:
> > > Thanks James for review!
> > > 
> > >
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > > File content/renderer/renderer_main.cc (right):
> > > 
> > >
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > > content/renderer/renderer_main.cc:238:
> main_message_loop.DeletePendingTasks();
> > > On 2014/09/22 14:38:04, jamesr wrote:
> > > > since you're deleting *all* pending tasks and not just Blink ones then
you
> > > > definitely do not care about anything else running, so not lgtm.  You
> should
> > > > simply shut down the loop before shutting down blink
> > > 
> > > I'm not familiar with a message loop, but what's a common way to
completely
> > shut
> > > down the main_message_loop here?
> > > 
> > > At first I tried something like:
> > > 
> > > scoped_ptr<base::MessageLoop> main_message_loop(new base::MessageLoop()); 
> //
> > > line 155
> > > {
> > >   ...;
> > >   main_message_loop.reset();  // line 238
> > > }
> > > 
> > > but it crashed chrome :-/
> > 
> > 
> > What is the crash stack?  If I run code like this standalone it works fine.
> > 
> > If somebody is trying to access the message loop after you are destroying it
> > here to post a task then this patch won't fix the bug anyway, since this
newly
> > posted task will still be destroyed after blink is shut down.  You'll need
to
> > figure out who is accessing the message loop when and why.
> 
> I uploaded the change to a patch set 2. This crashes with the following call
> stack.
> 
> The problem is that RenderProcessImpl's destructor posts a task to the message
> loop and thus the message loop needs to be kept alive in the
RenderProcessImpl's
> destructor. Thus we cannot destruct the message loop before destructing the
> RenderProcessImpl's destructor.
> 
> That being said, this is a weird situation. Probably should we restructure the
> code so that the RenderProcessImpl's destructor doesn't post any task to the
> message loop? Any advice is welcome :)

At first glance, it does not seem easy to stop RenderProcessImpl's
destructor from posting tasks to the message loop. Probably what we can do is:

1) Flush all pending tasks (which may touch Blink objects) in the message loop.
2) Destruct the RenderProcessImpl.
3) Allow RenderProcessImpl's destructor to post a task to the message loop,
provided the task doesn't touch Blink objects.
4) Destruct the message loop.

This is what patch set 1 does.

I'd like to hear your thoughts.
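
The four steps above can be traced with a small event log. This is a sketch under assumptions: `ToyLoop` and `ToyProcess` are hypothetical, and step 1 is modelled as deleting pending tasks without running them, as the `DeletePendingTasks()` call quoted from patch set 1 suggests.

```cpp
#include <string>
#include <vector>

using Log = std::vector<std::string>;

// Hypothetical loop that records when tasks are deleted.
class ToyLoop {
 public:
  explicit ToyLoop(Log* log) : log_(log) {}
  ~ToyLoop() {
    DeletePendingTasks();
    log_->push_back("loop destroyed");
  }
  void Post(const std::string& name) { pending_.push_back(name); }
  void DeletePendingTasks() {
    for (const std::string& name : pending_) log_->push_back("deleted " + name);
    pending_.clear();
  }

 private:
  Log* log_;
  std::vector<std::string> pending_;
};

// Hypothetical process object whose destructor posts one more task.
class ToyProcess {
 public:
  ToyProcess(ToyLoop* loop, Log* log) : loop_(loop), log_(log) {}
  ~ToyProcess() {
    loop_->Post("late task");  // step 3: allowed, must not touch Blink
    log_->push_back("process destroyed");
  }

 private:
  ToyLoop* loop_;
  Log* log_;
};

Log RunFourSteps() {
  Log log;
  {
    ToyLoop loop(&log);
    {
      ToyProcess process(&loop, &log);
      loop.Post("blink task");
      loop.DeletePendingTasks();  // step 1
    }  // steps 2 and 3: ~ToyProcess posts "late task"
  }  // step 4: ~ToyLoop deletes "late task"
  return log;
}
```

The resulting log shows that the late task is deleted only when the loop itself goes away, which is why step 3 depends on that task not touching Blink objects.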

jamesr

On 2014/09/24 02:09:40, haraken wrote: > On 2014/09/24 01:24:04, haraken wrote: > > On 2014/09/22 ...

6 years, 3 months ago (2014-09-24 02:47:09 UTC) #16

On 2014/09/24 02:09:40, haraken wrote:
> On 2014/09/24 01:24:04, haraken wrote:
> > On 2014/09/22 16:40:33, jamesr wrote:
> > > On 2014/09/22 15:54:18, haraken wrote:
> > > > Thanks James for review!
> > > > 
> > > >
> > >
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > > > File content/renderer/renderer_main.cc (right):
> > > > 
> > > >
> > >
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > > > content/renderer/renderer_main.cc:238:
> > main_message_loop.DeletePendingTasks();
> > > > On 2014/09/22 14:38:04, jamesr wrote:
> > > > > since you're deleting *all* pending tasks and not just Blink ones then
> you
> > > > > definitely do not care about anything else running, so not lgtm.  You
> > should
> > > > > simply shut down the loop before shutting down blink
> > > > 
> > > > I'm not familiar with a message loop, but what's a common way to
> completely
> > > shut
> > > > down the main_message_loop here?
> > > > 
> > > > At first I tried something like:
> > > > 
> > > > scoped_ptr<base::MessageLoop> main_message_loop(new
base::MessageLoop()); 
> > //
> > > > line 155
> > > > {
> > > >   ...;
> > > >   main_message_loop.reset();  // line 238
> > > > }
> > > > 
> > > > but it crashed chrome :-/
> > > 
> > > 
> > > What is the crash stack?  If I run code like this standalone it works
fine.
> > > 
> > > If somebody is trying to access the message loop after you are destroying
it
> > > here to post a task then this patch won't fix the bug anyway, since this
> newly
> > > posted task will still be destroyed after blink is shut down.  You'll need
> to
> > > figure out who is accessing the message loop when and why.
> > 
> > I uploaded the change to a patch set 2. This crashes with the following call
> > stack.
> > 
> > The problem is that RenderProcessImpl's destructor posts a task to the
message
> > loop and thus the message loop needs to be kept alive in the
> RenderProcessImpl's
> > destructor. Thus we cannot destruct the message loop before destructing the
> > RenderProcessImpl's destructor.
> > 
> > That being said, this is a weird situation. Probably should we restructure
the
> > code so that the RenderProcessImpl's destructor doesn't post any task to the
> > message loop? Any advice is welcome :)
> 
> At a first glance, it seems not that easy to make the RenderProcessImpl's
> destructor not post any task to the message loop. Probably what we can do is:
> 
> 1) Flush all pending tasks (which may touch Blink objects) in the message
loop.
> 2) Destruct the RenderProcessImpl's destructor.
> 3) Allow the RenderProcessImpl's destructor to post a task to the message loop
> but the task shouldn't touch Blink objects of course.
> 4) Destruct the message loop.
> 
> This is something the patch set 1 is doing.
> 
> I'd like to hear your thoughts.

The stack you posted above is a task that *is* touching blink objects, so this
proposal is definitely not going to work.

jamesr

The blink shutdown call is here: https://code.google.com/p/chromium/codesearch#chromium/src/content/renderer/render_thread_impl.cc&q=didExitModalLoop&sq=package:chromium&type=cs&l=591 which is after the code you posted that ...

6 years, 3 months ago (2014-09-24 02:49:23 UTC) #17

haraken

On 2014/09/24 02:47:09, jamesr wrote: > On 2014/09/24 02:09:40, haraken wrote: > > On 2014/09/24 ...

6 years, 3 months ago (2014-09-24 02:51:16 UTC) #18

On 2014/09/24 02:47:09, jamesr wrote:
> On 2014/09/24 02:09:40, haraken wrote:
> > On 2014/09/24 01:24:04, haraken wrote:
> > > On 2014/09/22 16:40:33, jamesr wrote:
> > > > On 2014/09/22 15:54:18, haraken wrote:
> > > > > Thanks James for review!
> > > > > 
> > > > >
> > > >
> > >
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > > > > File content/renderer/renderer_main.cc (right):
> > > > > 
> > > > >
> > > >
> > >
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > > > > content/renderer/renderer_main.cc:238:
> > > main_message_loop.DeletePendingTasks();
> > > > > On 2014/09/22 14:38:04, jamesr wrote:
> > > > > > since you're deleting *all* pending tasks and not just Blink ones
then
> > you
> > > > > > definitely do not care about anything else running, so not lgtm. 
You
> > > should
> > > > > > simply shut down the loop before shutting down blink
> > > > > 
> > > > > I'm not familiar with a message loop, but what's a common way to
> > completely
> > > > shut
> > > > > down the main_message_loop here?
> > > > > 
> > > > > At first I tried something like:
> > > > > 
> > > > > scoped_ptr<base::MessageLoop> main_message_loop(new
> base::MessageLoop()); 
> > > //
> > > > > line 155
> > > > > {
> > > > >   ...;
> > > > >   main_message_loop.reset();  // line 238
> > > > > }
> > > > > 
> > > > > but it crashed chrome :-/
> > > > 
> > > > 
> > > > What is the crash stack?  If I run code like this standalone it works
> fine.
> > > > 
> > > > If somebody is trying to access the message loop after you are
destroying
> it
> > > > here to post a task then this patch won't fix the bug anyway, since this
> > newly
> > > > posted task will still be destroyed after blink is shut down.  You'll
need
> > to
> > > > figure out who is accessing the message loop when and why.
> > > 
> > > I uploaded the change to a patch set 2. This crashes with the following
call
> > > stack.
> > > 
> > > The problem is that RenderProcessImpl's destructor posts a task to the
> message
> > > loop and thus the message loop needs to be kept alive in the
> > RenderProcessImpl's
> > > destructor. Thus we cannot destruct the message loop before destructing
the
> > > RenderProcessImpl's destructor.
> > > 
> > > That being said, this is a weird situation. Probably should we restructure
> the
> > > code so that the RenderProcessImpl's destructor doesn't post any task to
the
> > > message loop? Any advice is welcome :)
> > 
> > At a first glance, it seems not that easy to make the RenderProcessImpl's
> > destructor not post any task to the message loop. Probably what we can do
is:
> > 
> > 1) Flush all pending tasks (which may touch Blink objects) in the message
> loop.
> > 2) Destruct the RenderProcessImpl's destructor.
> > 3) Allow the RenderProcessImpl's destructor to post a task to the message
loop
> > but the task shouldn't touch Blink objects of course.
> > 4) Destruct the message loop.
> > 
> > This is something the patch set 1 is doing.
> > 
> > I'd like to hear your thoughts.
> 
> The stack you posted above is a task that *is* touching blink objects, so this
> proposal is definitely not going to work.

It will work in practice because the task is not touching *Oilpaned* Blink
objects (the problem happens only if Oilpaned Blink objects are touched after
the main thread is detached from Oilpan's GC), but I agree that's a fragile
assumption.

Do you think that removing all task posting from RenderProcessImpl's destructor
is the right way to go?

jamesr

On 2014/09/24 02:51:16, haraken wrote: > It will work in practice because the task is ...

6 years, 3 months ago (2014-09-24 02:53:22 UTC) #19

jamesr

On 2014/09/24 01:24:04, haraken wrote: > On 2014/09/22 16:40:33, jamesr wrote: > > On 2014/09/22 ...

6 years, 3 months ago (2014-09-24 02:56:25 UTC) #20

On 2014/09/24 01:24:04, haraken wrote:
> On 2014/09/22 16:40:33, jamesr wrote:
> > On 2014/09/22 15:54:18, haraken wrote:
> > > Thanks James for review!
> > > 
> > >
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > > File content/renderer/renderer_main.cc (right):
> > > 
> > >
> >
>
https://codereview.chromium.org/583043005/diff/1/content/renderer/renderer_ma...
> > > content/renderer/renderer_main.cc:238:
> main_message_loop.DeletePendingTasks();
> > > On 2014/09/22 14:38:04, jamesr wrote:
> > > > since you're deleting *all* pending tasks and not just Blink ones then
you
> > > > definitely do not care about anything else running, so not lgtm.  You
> should
> > > > simply shut down the loop before shutting down blink
> > > 
> > > I'm not familiar with a message loop, but what's a common way to
completely
> > shut
> > > down the main_message_loop here?
> > > 
> > > At first I tried something like:
> > > 
> > > scoped_ptr<base::MessageLoop> main_message_loop(new base::MessageLoop()); 
> //
> > > line 155
> > > {
> > >   ...;
> > >   main_message_loop.reset();  // line 238
> > > }
> > > 
> > > but it crashed chrome :-/
> > 
> > 
> > What is the crash stack?  If I run code like this standalone it works fine.
> > 
> > If somebody is trying to access the message loop after you are destroying it
> > here to post a task then this patch won't fix the bug anyway, since this
newly
> > posted task will still be destroyed after blink is shut down.  You'll need
to
> > figure out who is accessing the message loop when and why.
> 
> I uploaded the change to a patch set 2. This crashes with the following call
> stack.
> 
> The problem is that RenderProcessImpl's destructor posts a task to the message
> loop and thus the message loop needs to be kept alive in the
RenderProcessImpl's
> destructor. Thus we cannot destruct the message loop before destructing the
> RenderProcessImpl's destructor.
> 
> That being said, this is a weird situation. Probably should we restructure the
> code so that the RenderProcessImpl's destructor doesn't post any task to the
> message loop? Any advice is welcome :)
> 
> 
> SUMMARY: AddressSanitizer: SEGV ??:0 ??
> ==9==ABORTING
> [36:36:0924/101619:FATAL:thread_task_runner_handle.cc(23)] Check failed:
> current.
> #0 0x7f2961f1ea5f __interceptor_backtrace
> #1 0x7f296bc49384 base::debug::StackTrace::StackTrace()
> #2 0x7f296c09c6f2 logging::LogMessage::~LogMessage()
> #3 0x7f296c70bfab base::ThreadTaskRunnerHandle::Get()
> #4 0x7f296c82a847 base::Timer::PostNewScheduledTask()
> #5 0x7f296c829702 base::Timer::Reset()
> #6 0x7f296c828b1a base::Timer::Start()
> #7 0x7f29a423b0ab base::BaseTimerMethodPointer<>::Start()
> #8 0x7f29a421d1b6 content::BlinkPlatformImpl::setSharedTimerFireInterval()
> #9 0x7f297cc234df blink::Scheduler::setSharedTimerFireInterval()
> #10 0x7f297cc3120f blink::setSharedTimerFireInterval()
> #11 0x7f297cbdbdd4 blink::MainThreadSharedTimer::setFireInterval()
> #12 0x7f297cbd7e1a blink::ThreadTimers::updateSharedTimer()
> #13 0x7f297cb1e668 blink::TimerBase::setNextFireTime()
> #14 0x7f297cb1dbac blink::TimerBase::start()
> #15 0x7f297c8215ca blink::TimerBase::startOneShot()
> #16 0x7f2981e09984 blink::FrameLoader::scheduleCheckCompleted()
> #17 0x7f2981e08bf3 blink::FrameLoader::setDefersLoading()
> #18 0x7f298201e100 blink::Page::setDefersLoading()
> #19 0x7f29820ba8ea blink::ScopedPageLoadDeferrer::detach()
> #20 0x7f29820bacbf blink::ScopedPageLoadDeferrer::~ScopedPageLoadDeferrer()
> #21 0x7f297e495263 blink::WebView::didExitModalLoop()
> #22 0x7f29a4cd7661 content::RenderThreadImpl::Shutdown()
> #23 0x7f29a4cd7f51 content::RenderThreadImpl::Shutdown()
> #24 0x7f29a45cf571 content::ChildProcess::~ChildProcess()
> #25 0x7f29a4cb34ca content::RenderProcess::~RenderProcess()
> #26 0x7f29a4cb2ab4 content::RenderProcessImpl::~RenderProcessImpl()

What if you just keep the loop around until ~RenderProcessImpl() completes? 
I.e. scope the RenderProcessImpl more tightly than the message loop.

> #27 0x7f29a4ff8105 content::RendererMain()
> #28 0x7f296baafb93 content::RunZygote()
> #29 0x7f296bab14ff content::RunNamedProcessTypeMain()
> #30 0x7f296bac16e0 content::ContentMainRunnerImpl::Run()
> #31 0x7f296baac995 content::ContentMain()
> #32 0x7f2961f7ceba ChromeMain
> #33 0x7f2961f7cabe main
> #34 0x7f29557ba76d __libc_start_main
> #35 0x7f2961f7c7ad <unknown>
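
The scoping suggestion can be sketched with plain C++ block scopes, relying only on the language rule that automatic objects are destroyed in reverse order of construction. `ToyLoop` and `ToyProcess` are hypothetical placeholders, not the real classes.

```cpp
#include <string>
#include <vector>

// Shared event log for the sketch.
std::vector<std::string>& EventLog() {
  static std::vector<std::string> log;
  return log;
}

struct ToyLoop {
  ~ToyLoop() { EventLog().push_back("loop destroyed"); }
};

struct ToyProcess {
  ~ToyProcess() { EventLog().push_back("process destroyed"); }
};

// The process lives in an inner scope, so it is always destroyed while the
// loop still exists.
std::vector<std::string> RunScoped() {
  EventLog().clear();
  {
    ToyLoop loop;
    {
      ToyProcess process;
      // ... run the loop ...
    }  // ~ToyProcess: the loop is still alive here
  }  // ~ToyLoop
  return EventLog();
}
```

With this nesting the process destructor can still post to the loop, at the cost of any such task being deleted only when the outer scope ends.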

jar (doing other things)

I'm not sure I'm fully following your proposal... but in general, it would be better ...

6 years, 3 months ago (2014-09-24 03:16:28 UTC) #21

I'm not sure I'm fully following your proposal... but in general, it would
be better to not have the message loop destruction sequence tied so tightly
to a single specific user (RenderProcessImpl).

It is plausible that the message loop destruction should be "staged," but
it would IMO be bad to cater too tightly to a single user, as there may be
other users with different restrictions, and then it would be hard to
untangle the semantics for each user.

Much better would be to fashion users to either be able to deal with
task processing ending, or handle destruction without running trailing
tasks.

It may be valuable to think about some standard staging of the message loop
shutdown, and a lot of OSs have such systems that we can look to for
reasonable models.  We'd need to have an API where interested users could
get notifications... a way to indicate that they are OK with moving on to
the "next" stage.  <sigh>... we'd have to think through avoiding
deadlock... and probably have *some* timers to force things when there was
no agreement from users... but it could be done. </sigh>

Examples of stages might be:
a) Going down RSN (posted time, unless all observers say OK?)
b) Future task posts won't be run, but will be destroyed.
c) Future task posts won't be run or destroyed (they will leak).
d) Message loop is destroyed, and continued use (posting) will potentially
crash.

There *might* be some finer granularity... but the above might work for
most uses.

Most users should try to avoid involvement with the above.  Risks of
deadlocks, and confusion, are IMO too great... so you need a really good
reason to go there.
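
As a rough illustration of stages (a) to (d), the staging could be written down as an enum plus a function that says what happens to a task posted in each stage. This is a hypothetical sketch, not an existing base::MessageLoop API; in particular, treating stage (a) as "tasks still run" is an assumption, since the message leaves that stage open.

```cpp
// Hypothetical shutdown stages, following the (a)-(d) list above.
enum class ShutdownStage {
  kRunning,     // normal operation
  kGoingDown,   // (a) shutdown announced; assumed here that tasks still run
  kDeleteOnly,  // (b) posted tasks are not run, but are destroyed
  kLeak,        // (c) posted tasks are neither run nor destroyed
  kDestroyed,   // (d) loop is gone; posting may crash
};

enum class PostOutcome {
  kRun,
  kDeletedWithoutRunning,
  kLeaked,
  kInvalid,
};

// Maps a stage to the fate of a task posted during that stage.
PostOutcome OutcomeForPost(ShutdownStage stage) {
  switch (stage) {
    case ShutdownStage::kRunning:
    case ShutdownStage::kGoingDown:
      return PostOutcome::kRun;
    case ShutdownStage::kDeleteOnly:
      return PostOutcome::kDeletedWithoutRunning;
    case ShutdownStage::kLeak:
      return PostOutcome::kLeaked;
    case ShutdownStage::kDestroyed:
      return PostOutcome::kInvalid;
  }
  return PostOutcome::kInvalid;
}
```

The observer notifications and forcing timers mentioned above are deliberately left out; they are where the deadlock risk would sit.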

On Tue, Sep 23, 2014 at 7:09 PM, <haraken@chromium.org> wrote:

> On 2014/09/24 01:24:04, haraken wrote:
>
>> On 2014/09/22 16:40:33, jamesr wrote:
>> > On 2014/09/22 15:54:18, haraken wrote:
>> > > Thanks James for review!
>> > >
>> > >
>> >
>>
>
> https://codereview.chromium.org/583043005/diff/1/content/
> renderer/renderer_main.cc
>
>> > > File content/renderer/renderer_main.cc (right):
>> > >
>> > >
>> >
>>
>
> https://codereview.chromium.org/583043005/diff/1/content/
> renderer/renderer_main.cc#newcode238
>
>> > > content/renderer/renderer_main.cc:238:
>> main_message_loop.DeletePendingTasks();
>> > > On 2014/09/22 14:38:04, jamesr wrote:
>> > > > since you're deleting *all* pending tasks and not just Blink ones
>> then
>>
> you
>
>> > > > definitely do not care about anything else running, so not lgtm.
>> You
>> should
>> > > > simply shut down the loop before shutting down blink
>> > >
>> > > I'm not familiar with a message loop, but what's a common way to
>>
> completely
>
>> > shut
>> > > down the main_message_loop here?
>> > >
>> > > At first I tried something like:
>> > >
>> > > scoped_ptr<base::MessageLoop> main_message_loop(new
>> base::MessageLoop());
>> //
>> > > line 155
>> > > {
>> > >   ...;
>> > >   main_message_loop.reset();  // line 238
>> > > }
>> > >
>> > > but it crashed chrome :-/
>> >
>> >
>> > What is the crash stack?  If I run code like this standalone it works
>> fine.
>> >
>> > If somebody is trying to access the message loop after you are
>> destroying it
>> > here to post a task then this patch won't fix the bug anyway, since this
>>
> newly
>
>> > posted task will still be destroyed after blink is shut down.  You'll
>> need
>>
> to
>
>> > figure out who is accessing the message loop when and why.
>>
>
>  I uploaded the change to a patch set 2. This crashes with the following
>> call
>> stack.
>>
>
>  The problem is that RenderProcessImpl's destructor posts a task to the
>> message
>> loop and thus the message loop needs to be kept alive in the
>>
> RenderProcessImpl's
>
>> destructor. Thus we cannot destruct the message loop before destructing
>> the
>> RenderProcessImpl's destructor.
>>
>
>  That being said, this is a weird situation. Probably should we
>> restructure the
>> code so that the RenderProcessImpl's destructor doesn't post any task to
>> the
>> message loop? Any advice is welcome :)
>>
>
> At a first glance, it seems not that easy to make the RenderProcessImpl's
> destructor not post any task to the message loop. Probably what we can do
> is:
>
> 1) Flush all pending tasks (which may touch Blink objects) in the message
> loop.
> 2) Destruct the RenderProcessImpl's destructor.
> 3) Allow the RenderProcessImpl's destructor to post a task to the message
> loop
> but the task shouldn't touch Blink objects of course.
> 4) Destruct the message loop.
>
> This is something the patch set 1 is doing.
>
> I'd like to hear your thoughts.
>
>
> https://codereview.chromium.org/583043005/
>

jamesr

Looking more closely you need to shut the loop down at a specific point in ...

6 years, 3 months ago (2014-09-24 03:40:09 UTC) #22

haraken

On 2014/09/24 03:40:09, jamesr wrote: > Looking more closely you need to shut the loop ...

6 years, 3 months ago (2014-09-24 03:52:07 UTC) #23

haraken

I tried to completely shut down (i.e., destruct) the message loop just before RenderThreadImpl::Shutdown() calls ...

6 years, 3 months ago (2014-09-24 04:53:29 UTC) #25

haraken

On 2014/09/24 04:53:29, haraken wrote: > I tried to completely shut down (i.e., destruct) the ...

6 years, 3 months ago (2014-09-24 04:53:42 UTC) #26

jamesr

This is still risky because if blink::shutdown() calls any code that creates a task or ...

6 years, 3 months ago (2014-09-24 04:57:26 UTC) #27

haraken

On 2014/09/24 04:57:26, jamesr wrote: > This is still risky because if blink::shutdown() calls any ...

6 years, 3 months ago (2014-09-24 05:40:51 UTC) #28

jamesr

Adding MessageLoop::Shutdown() introduces a new awkward state for base::MessageLoop where it's still around but is ...

6 years, 3 months ago (2014-09-24 05:46:10 UTC) #29

haraken

On 2014/09/24 05:46:10, jamesr wrote: > Adding MessageLoop::Shutdown() introduces a new awkward state for > ...

6 years, 3 months ago (2014-09-24 07:24:43 UTC) #30

jamesr

The lack of trybots makes me nervous, but the code change lgtm as far as ...

6 years, 2 months ago (2014-09-25 06:03:21 UTC) #32

haraken

On 2014/09/25 06:03:21, jamesr wrote: > The lack of trybots makes me nervous, but the ...

6 years, 2 months ago (2014-09-25 06:24:53 UTC) #33

haraken

On 2014/09/25 06:24:53, haraken wrote: > On 2014/09/25 06:03:21, jamesr wrote: > > The lack ...

6 years, 2 months ago (2014-09-25 11:26:59 UTC) #34

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/583043005/120001

6 years, 2 months ago (2014-09-25 11:28:58 UTC) #36

commit-bot: I haz the power

Committed patchset #7 (id:120001) as 1fb8091790b5065fa77956fbdc2f5ce5b539494b

6 years, 2 months ago (2014-09-25 12:24:37 UTC) #37

commit-bot: I haz the power

Patchset 7 (id:??) landed as https://crrev.com/fdd5612c20f777e1279efd7c1e99d82ed04afaaf Cr-Commit-Position: refs/heads/master@{#296697}

6 years, 2 months ago (2014-09-25 12:25:11 UTC) #38

reveman

A revert of this CL (patchset #7 id:120001) has been created in https://codereview.chromium.org/608043002/ by reveman@chromium.org. ...

6 years, 2 months ago (2014-09-26 20:24:59 UTC) #39

Zhenyao Mo

On 2014/09/26 20:24:59, reveman wrote: > A revert of this CL (patchset #7 id:120001) has ...

6 years, 2 months ago (2014-09-26 20:58:19 UTC) #40

haraken

jamesr@: PTAL again. The previous patch caused crashes in ASAN bots because GpuChannelHost is destructed ...

6 years, 2 months ago (2014-09-29 05:24:45 UTC) #41

jamesr

https://codereview.chromium.org/583043005/diff/140001/content/renderer/render_thread_impl.cc File content/renderer/render_thread_impl.cc (right): https://codereview.chromium.org/583043005/diff/140001/content/renderer/render_thread_impl.cc#newcode681 content/renderer/render_thread_impl.cc:681: NPChannelBase::CleanupChannels(); could this cause problems? it appears to call ...

6 years, 2 months ago (2014-09-29 19:03:56 UTC) #42

haraken

jamesr@: Thanks for review. PTAL. https://codereview.chromium.org/583043005/diff/140001/content/renderer/render_thread_impl.cc File content/renderer/render_thread_impl.cc (right): https://codereview.chromium.org/583043005/diff/140001/content/renderer/render_thread_impl.cc#newcode681 content/renderer/render_thread_impl.cc:681: NPChannelBase::CleanupChannels(); On 2014/09/29 19:03:56, ...

6 years, 2 months ago (2014-09-30 01:08:19 UTC) #43

jamesr

OK. I wasn't totally sure about that either, but this seems safer

6 years, 2 months ago (2014-09-30 01:09:54 UTC) #44

haraken

On 2014/09/30 01:09:54, jamesr wrote: > OK. I wasn't totally sure about that either, but ...

6 years, 2 months ago (2014-09-30 01:10:28 UTC) #45

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/583043005/160001

6 years, 2 months ago (2014-09-30 01:11:36 UTC) #48

commit-bot: I haz the power

Committed patchset #9 (id:160001) as de8c98a7bfd02b8022da6577d3e4112b15c34ba2

6 years, 2 months ago (2014-09-30 01:56:20 UTC) #49

commit-bot: I haz the power

Patchset 9 (id:??) landed as https://crrev.com/16d32a9f7f6d1ebb639cacedb5156272a9fec764 Cr-Commit-Position: refs/heads/master@{#297338}

6 years, 2 months ago (2014-09-30 01:56:50 UTC) #50

kareng

A revert of this CL (patchset #9 id:160001) has been created in https://codereview.chromium.org/614233002/ by kareng@google.com. ...

6 years, 2 months ago (2014-09-30 21:33:05 UTC) #51

haraken

jamesr@: Please take another look. kareng@: Sorry about the repeated breakage. Actually it's hard to ...

6 years, 2 months ago (2014-10-03 04:28:13 UTC) #52

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/583043005/240001

6 years, 1 month ago (2014-11-10 07:41:48 UTC) #55

commit-bot: I haz the power

Try jobs failed on following builders: chromium_presubmit on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presubmit/builds/23151)

6 years, 1 month ago (2014-11-10 07:45:04 UTC) #57

haraken

kbr@: Would you take a look at the change in content/common/gpu/ ?

6 years, 1 month ago (2014-11-10 08:02:18 UTC) #59

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/583043005/240001

6 years, 1 month ago (2014-11-11 01:01:11 UTC) #62

commit-bot: I haz the power

6 years, 1 month ago (2014-11-11 01:04:41 UTC) #64

Message was sent while issue was closed.

Patchset 13 (id:??) landed as
https://crrev.com/53f081de05b86f73eca4e383a16c8dc723b78a99
Cr-Commit-Position: refs/heads/master@{#303557}
