Issue 484007: Mac: for branded Google Chrome, handle SIGTERM in renderer to clean up Breakpad.

Issue 484007: Mac: for branded Google Chrome, handle SIGTERM in renderer to clean up Breakpad. (Closed)

Created:
11 years ago by viettrungluu

Modified:
9 years, 6 months ago

Reviewers:
Avi (use Gerrit), John Grabowski, TVL, jeremy, Mark Mentovai, Nico, nealsid

CC:
chromium-reviews_googlegroups.com, John Grabowski, pam+watch_chromium.org, ben+cc_chromium.org

Visibility:
Public.

Description

Mac: for branded Google Chrome, handle SIGTERM in renderer to clean up Breakpad. Test: 1. Build a (branded) Google Chrome. 2. From the command-line, run: "launchctl bslist | wc -l" and note the number. 3. Start Google Chrome on a fresh profile. 4. Once a browser window has opened, do #2 again, and note the number -- it should have increased (by 2). 5. Close the tab, and do #2 again. The number should have decreased by 1. 6. Open tabs, navigate, etc., and do #2 between each stage. Whenever a new (e.g., renderer) process is started, the number should increase (e.g., when you go to a "new" site). When a process is terminated (e.g., you close the last tab for a particular site), the number should decrease. 7. Quit, and do #2 again. Hopefully, the number should be the same as the first time. (Note: when a renderer crashes, the number will not decrease -- so the number after quitting will not be the same as before starting. This is why this is only a stopgap. For comparison, do this on a branded Google Chrome without this patch; typically, the numbers will not decrease on renderer termination.) BUG=28547 TEST=See above. Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=34318

Patch Set 1 #

Total comments: 1

Patch Set 2 : Added missing _exit(). #

Patch Set 3 : Moved call to DestructCrashReporter() outside signal handler (into thread). #

Total comments: 1

Patch Set 4 : Updated per Mark's comment. #

Total comments: 14

Patch Set 5 : Updated per Mark's review. #

Patch Set 6 : Oops, I wasn't done. #

Patch Set 7 : foo #

Total comments: 8

Patch Set 8 : Updated per Mark's re-review. #

Created: 11 years ago

Download [raw] [tar.bz2]

		Unified diffs	Side-by-side diffs	Delta from patch set	Stats (+137 lines, -0 lines)			Patch
	M	chrome/renderer/renderer_main.cc	View	1 2 3 4 5 6 7	2 chunks	+137 lines, -0 lines	0 comments	Download

Messages

Total messages: 34 (0 generated)

Expand Messages | Collapse Messages

viettrungluu

Just sending this out to a bunch of people who may be around soon-ish and ...

11 years ago (2009-12-10 04:13:06 UTC) #1

viettrungluu

Oops. Meant to include Avi, who did all the important legwork in tracking the problem ...

11 years ago (2009-12-10 04:26:37 UTC) #2

John Grabowski

>> I haven't tested it at all on a branded build Since all your changes ...

11 years ago (2009-12-10 06:57:05 UTC) #3

viettrungluu

On 2009/12/10 06:57:05, John Grabowski wrote: > >> I haven't tested it at all on ...

11 years ago (2009-12-10 07:29:34 UTC) #4

viettrungluu

On 2009/12/10 07:29:34, viettrungluu wrote: > > http://codereview.chromium.org/484007/diff/1/2#newcode50 > > chrome/renderer/renderer_main.cc:50: DestructCrashReporter(); > > There ...

11 years ago (2009-12-10 07:34:42 UTC) #5

viettrungluu

New version's up. It's an awful cut-and-paste job, which should be fixed properly. Again, I ...

11 years ago (2009-12-10 07:55:47 UTC) #6

jeremy

(disclaimer: I may be misunderstanding the issue in which case I hope Mark, Avi or ...

11 years ago (2009-12-10 10:12:09 UTC) #7

TVL

drive by: are we gonna have this issue in the utility process also? (or any ...

11 years ago (2009-12-10 13:22:13 UTC) #8

Avi (use Gerrit)

On 2009/12/10 13:22:13, TVL wrote: > drive by: are we gonna have this issue in ...

11 years ago (2009-12-10 15:27:33 UTC) #9

Mark Mentovai

I haven't read this carefully, but I saw that there was concern over the macro ...

11 years ago (2009-12-10 15:33:05 UTC) #10

viettrungluu

On 2009/12/10 15:27:33, Avi wrote: > On 2009/12/10 13:22:13, TVL wrote: > > drive by: ...

11 years ago (2009-12-10 15:59:09 UTC) #11

nealsid

On Thu, Dec 10, 2009 at 7:27 AM, <avi@chromium.org> wrote: > On 2009/12/10 13:22:13, TVL ...

11 years ago (2009-12-10 16:20:01 UTC) #12

viettrungluu

On 2009/12/10 16:20:01, nealsid wrote: > Additionally, it seems to me that Chrome has a ...

11 years ago (2009-12-10 16:28:45 UTC) #13

jeremy

By definition we may kill renderers (misbehaved/hung processes, etc). This may also not be that ...

11 years ago (2009-12-10 16:29:36 UTC) #14

By definition we may kill renderers (misbehaved/hung processes, etc).  This
may also not be that rare of an event (e.g. a misbehaved plugin on a user's
machine).

This means that we can't safely assume we can run code at process
termination in the dying process.

If we could use a single bootstrap port for all Chrome processes that would
be ideal.  Otherwise a solution where the browser process takes
responsibility for closing any leaked ports would be best.

Best regards,
Jeremy

On Thu, Dec 10, 2009 at 6:19 PM, Neal Sidhwaney <nealsid@gmail.com> wrote:

>
>
> On Thu, Dec 10, 2009 at 7:27 AM, <avi@chromium.org> wrote:
>
>> On 2009/12/10 13:22:13, TVL wrote:
>>
>>> drive by: are we gonna have this issue in the utility process also?
>>>
>>
>> This is why I'm worried about this way of handling it. If Breakpad is
>> modifying
>> global system state, it should be up to Breakpad to ensure that no matter
>> how
>> the process terminates that things get cleaned up. Forcing clients to deal
>> with
>> the situation leads to either duplicated code (if everyone does it right)
>> or
>>
>
> The problem with a Framework taking over global signals handlers for the
> process in order to clean up is that I still can't guarantee they'll be
> called, because someone might override them later, and if I am overriding a
> signal handler the process has already set up I need to deal with messy
> signal handler chaining, which was a source of bugs in the previous Linux
> version.
>
> An atexit handler might work, but I'm not even sure if that gets called
> when unhandled signals terminate a process.
>
> Additionally, it seems to me that Chrome has a bunch of code it needs to
> execute on renderer shutdown, which it hasn't been.  So, given that the
> duplicated code you are mentioning for all clients to tear down Breakpad
> properly is something like this:
>
> BreakpadRelease(breakpadRef);
>
> I believe it is quite reasonable for clients of Breakpad to control when
> they initialize AND when they teardown Breakpad.  Especially given Chrome's
> unique process teardown model.
>
> Neal
>
>
>
>
>> unhandled problems (if not everyone does).
>>
>>
>> http://codereview.chromium.org/484007
>>
>
>

Mark Mentovai

Here's my take: We need to tolerate abnormal process toleration. This can be SIGTERM, or ...

11 years ago (2009-12-10 17:04:35 UTC) #15

viettrungluu

On 2009/12/10 15:33:05, Mark Mentovai wrote: > I haven't read this carefully, but I saw ...

11 years ago (2009-12-10 18:19:47 UTC) #16

Mark Mentovai

Trung wrote: > Done. Should we also change it around the instance of > DestructCrashReporter() ...

11 years ago (2009-12-10 18:51:17 UTC) #17

nealsid

To comment on Jeremy's idea of using one port - it's a good one and ...

11 years ago (2009-12-10 19:03:49 UTC) #18

viettrungluu

I think we all agree that the right thing to do is to fix Breakpad, ...

11 years ago (2009-12-10 22:05:41 UTC) #19

Mark Mentovai

In http://crbug.com/28547, jrg said: > Hmm. I tried your test and agree; we never clean ...

11 years ago (2009-12-10 22:15:47 UTC) #20

In http://crbug.com/28547, jrg said:

> Hmm.  I tried your test and agree; we never clean up.  For my test we die in
_exit as called from
> SuicideOnChannelErrorFilter::OnChannelError().  (I don't know if that means we
always kill renderers by
> triggering an error, such as by closing the fd, or if so if that is expected.)

Do we need to do anything to handle an _exit from there?

http://codereview.chromium.org/484007/diff/5002/7
File chrome/renderer/renderer_main.cc (right):

http://codereview.chromium.org/484007/diff/5002/7#newcode43
chrome/renderer/renderer_main.cc:43: // TODO(viettrungluu): Code taken from
browser_main.cc (with a bit of editing).
You must use an anonymous namespace (or mark everything static):

namespace {

http://codereview.chromium.org/484007/diff/5002/7#newcode46
chrome/renderer/renderer_main.cc:46: int g_shutdown_pipe_read_fd = -1;
This variable is not needed and should be removed.

(It was removed from the file where you copied this from, but then backed out
for unrelated reasons.  Eventually it'll be re-removed.)

http://codereview.chromium.org/484007/diff/5002/7#newcode48
chrome/renderer/renderer_main.cc:48: // Common code between SIG{HUP, INT,
TERM}Handler.
No.

Maybe just move the whole body of GracefulShutdownHandler into SIGTERMHandler. 
We don't handle any other signals in here.

http://codereview.chromium.org/484007/diff/5002/7#newcode90
chrome/renderer/renderer_main.cc:90: ShutdownDetector::ShutdownDetector(int
shutdown_fd)
You can write this inline in the class definition above.

http://codereview.chromium.org/484007/diff/5002/7#newcode92
chrome/renderer/renderer_main.cc:92: CHECK(shutdown_fd_ != -1);
CHECK_NE?

http://codereview.chromium.org/484007/diff/5002/7#newcode95
chrome/renderer/renderer_main.cc:95: void ShutdownDetector::ThreadMain() {
And this too.

http://codereview.chromium.org/484007/diff/5002/7#newcode124
chrome/renderer/renderer_main.cc:124: _exit(0);
I don't think we want 0.  We're always going down on a signal here, right? 
(Well, maybe not - you will need to check bytes_read before using signal.)  See
how I re-raise the signal and fall back to _exit using the signal number in
browser_main.cc and adapt that, but if bytes_read != sizeof(signal), just
_exit(1) (or something else nonzero).

jeremy

I realize you've tested this manually and I don't want to slow down this patch, ...

11 years ago (2009-12-10 22:17:55 UTC) #21

viettrungluu

On 2009/12/10 22:15:47, Mark Mentovai wrote: > > Hmm. I tried your test and agree; ...

11 years ago (2009-12-10 23:51:25 UTC) #22

viettrungluu

Thanks, Mark. I'm going to go test it again now.... http://codereview.chromium.org/484007/diff/5002/7 File chrome/renderer/renderer_main.cc (right): http://codereview.chromium.org/484007/diff/5002/7#newcode43 ...

11 years ago (2009-12-10 23:51:49 UTC) #23

Thanks, Mark.

I'm going to go test it again now....

http://codereview.chromium.org/484007/diff/5002/7
File chrome/renderer/renderer_main.cc (right):

http://codereview.chromium.org/484007/diff/5002/7#newcode43
chrome/renderer/renderer_main.cc:43: // TODO(viettrungluu): Code taken from
browser_main.cc (with a bit of editing).
On 2009/12/10 22:15:47, Mark Mentovai wrote:
> You must use an anonymous namespace (or mark everything static):
> 
> namespace {

Done.

http://codereview.chromium.org/484007/diff/5002/7#newcode46
chrome/renderer/renderer_main.cc:46: int g_shutdown_pipe_read_fd = -1;
On 2009/12/10 22:15:47, Mark Mentovai wrote:
> This variable is not needed and should be removed.
> 
> (It was removed from the file where you copied this from, but then backed out
> for unrelated reasons.  Eventually it'll be re-removed.)

Done.

http://codereview.chromium.org/484007/diff/5002/7#newcode48
chrome/renderer/renderer_main.cc:48: // Common code between SIG{HUP, INT,
TERM}Handler.
On 2009/12/10 22:15:47, Mark Mentovai wrote:
> No.
> 
> Maybe just move the whole body of GracefulShutdownHandler into SIGTERMHandler.

> We don't handle any other signals in here.

Done.

http://codereview.chromium.org/484007/diff/5002/7#newcode90
chrome/renderer/renderer_main.cc:90: ShutdownDetector::ShutdownDetector(int
shutdown_fd)
On 2009/12/10 22:15:47, Mark Mentovai wrote:
> You can write this inline in the class definition above.

Done.

http://codereview.chromium.org/484007/diff/5002/7#newcode92
chrome/renderer/renderer_main.cc:92: CHECK(shutdown_fd_ != -1);
On 2009/12/10 22:15:47, Mark Mentovai wrote:
> CHECK_NE?

No such thing, apparently.

http://codereview.chromium.org/484007/diff/5002/7#newcode95
chrome/renderer/renderer_main.cc:95: void ShutdownDetector::ThreadMain() {
On 2009/12/10 22:15:47, Mark Mentovai wrote:
> And this too.

Done.

http://codereview.chromium.org/484007/diff/5002/7#newcode124
chrome/renderer/renderer_main.cc:124: _exit(0);
On 2009/12/10 22:15:47, Mark Mentovai wrote:
> I don't think we want 0.  We're always going down on a signal here, right? 
> (Well, maybe not - you will need to check bytes_read before using signal.) 
See
> how I re-raise the signal and fall back to _exit using the signal number in
> browser_main.cc and adapt that, but if bytes_read != sizeof(signal), just
> _exit(1) (or something else nonzero).

Done.

John Grabowski

If you cannot repro my stack, that's fine. The hope is that "normal" cases are ...

11 years ago (2009-12-11 00:01:50 UTC) #24

Mark Mentovai

LGTM pending positive test results and these minor changes. http://codereview.chromium.org/484007/diff/8001/8002 File chrome/renderer/renderer_main.cc (right): http://codereview.chromium.org/484007/diff/8001/8002#newcode73 chrome/renderer/renderer_main.cc:73: ...

11 years ago (2009-12-11 00:02:08 UTC) #25

John Grabowski

LGTM I agree with Jeremy's sentiment about a unit test but the expectation is that ...

11 years ago (2009-12-11 00:07:28 UTC) #26

viettrungluu

Thanks, Mark and John. Jeremy, I agree that testing would be good, but it's also ...

11 years ago (2009-12-11 00:44:39 UTC) #27

Thanks, Mark and John.

Jeremy, I agree that testing would be good, but it's also kind of tough in this
case (at least in a standard unit test). (To make things worse, checking Mach
ports directly requires permissions, IIRC.) Probably the easiest test would be
an automated version of the one now described in this CL's description....

http://codereview.chromium.org/484007/diff/8001/8002
File chrome/renderer/renderer_main.cc (right):

http://codereview.chromium.org/484007/diff/8001/8002#newcode62
chrome/renderer/renderer_main.cc:62: RAW_CHECK(g_shutdown_pipe_write_fd != -1);
On 2009/12/11 00:07:28, John Grabowski wrote:
> if (g_shutdown_pipe_write_fd == -1)  _exit(1);

After discussion (in person), we decided the RAW_CHECK() should be okay.

http://codereview.chromium.org/484007/diff/8001/8002#newcode73
chrome/renderer/renderer_main.cc:73: RAW_LOG(INFO,
On 2009/12/11 00:02:08, Mark Mentovai wrote:
> I guess I didn't read Will's string literal carefully enough in the other file
> when I reviewed that, but the text here is wrong in that the signal handler
> change comes before the pipe write.  Just revise the literal here.
> 
> For that matter, change all of the strings you're using here so that they
don't
> match what the other file uses.  Reason?  RAW_LOG just puts a string out to
> stderr.  There isn't any file or line number or anything else on the message. 
> If we ever need to use these messages for diagnosis, we'll need a way to
> distinguish them from the messages in the other file.

Done.

http://codereview.chromium.org/484007/diff/8001/8002#newcode102
chrome/renderer/renderer_main.cc:102: LOG(INFO) << "Handling shutdown for signal
" << signal << ".";
On 2009/12/11 00:02:08, Mark Mentovai wrote:
> |signal| may be garbage, don't use it without checking bytes_read.

Done.

http://codereview.chromium.org/484007/diff/8001/8002#newcode185
chrome/renderer/renderer_main.cc:185: // We need to handle SIGTERM when we're
using Breakpad, otherwise Mach ports
On 2009/12/11 00:02:08, Mark Mentovai wrote:
> Sentences are almost always clearer and more concise without "we."  This one
> would be better as:
> 
>   // When Breakpad is in use, handle SIGTERM to avoid leaking Mach ports.
> 
> You should put a bug reference either in this comment, or in the larger
comment
> up at the top inside your first #ifdef, or both.

Done.

TVL

is this signal hander what is causing problems for browser_tests in release?

11 years ago (2009-12-11 14:11:36 UTC) #28

viettrungluu

On 2009/12/11 14:28:43, TVL wrote: > or the tab switching test on the perf bots? ...

11 years ago (2009-12-11 16:26:43 UTC) #30

John Grabowski

On Fri, Dec 11, 2009 at 8:26 AM, <viettrungluu@chromium.org> wrote: > On 2009/12/11 14:28:43, TVL ...

11 years ago (2009-12-11 17:34:01 UTC) #31

Nico

Howdy folks, it looks like breakpad is initialized in chrome_dll_main.cc's ChromeMain() (which then calls RendererMain(), ...

10 years, 12 months ago (2009-12-26 02:31:49 UTC) #32

jeremy

Nico, you are correct. Although killing a renderer might be more common, exactly the same ...

10 years, 12 months ago (2009-12-27 08:39:31 UTC) #33

viettrungluu

10 years, 12 months ago (2009-12-28 19:00:26 UTC) #34

To elaborate on what Jeremy said:
- It is standard operating procedure to kill renderers (instead of 
telling them to quit), so that is an extremely common case.
- I've seen code which will kill a plugin process, but I don't know if 
it's used on Mac (at all; in any case, killing a plugin process doesn't 
seem to be a common case).
- I don't know about other process types, but I don't think killing any 
of them is a common case.
- Of course, killing these processes should not result in any leakage. 
The correct solution is to fix Breakpad so that no cleanup is required, 
except perhaps in the browser process. (Neal is working on this.)

Jeremy Moskovich wrote:
> Nico, you are correct.
> 
> Although killing a renderer might be more common, exactly the same thing 
> can happen for all other process types.
> 
> 
> On Sat, Dec 26, 2009 at 4:31 AM, <thakis@chromium.org 
> <mailto:thakis@chromium.org>> wrote:
> 
>     Howdy folks,
> 
>     it looks like breakpad is initialized in chrome_dll_main.cc's
>     ChromeMain()
>     (which then calls RendererMain(), PluginMain(), UtilityMain(), etc,
>     depending on
>     the process type). Shouldn't this cleanup stuff be done for all
>     child process
>     types and not just for the renderers?
> 
>     Nico
> 
>     http://codereview.chromium.org/484007
> 
>

Expand Messages | Collapse Messages